Re: [PATCH v5 1/2] drm/sched: Refactor ring mirror list handling.

2019-01-11 Thread Grodzovsky, Andrey


On 01/11/2019 02:11 PM, Koenig, Christian wrote:
> Am 11.01.19 um 16:37 schrieb Grodzovsky, Andrey:
>> On 01/11/2019 04:42 AM, Koenig, Christian wrote:
>>> Am 10.01.19 um 16:56 schrieb Grodzovsky, Andrey:
 [SNIP]
>>> But we will not be adding the cb back in drm_sched_stop anymore, now we
>>> are only going to add back the cb in drm_sched_startr after rerunning
>>> those jobs in drm_sched_resubmit_jobs and assign them a new parent
>>> there
>>> anyway.
>> Yeah, but when we find that we don't need to reset anything anymore
>> then adding the callbacks again won't be possible any more.
>>
>> Christian.
> I am not sure I understand it, can u point me to example of how this
> will happen ? I am attaching my latest patches with waiting only for
> the last job's fence here just so we are on same page regarding the code.
>>> Well the whole idea is to prepare all schedulers, then check once more
>>> if the offending job hasn't completed in the meantime.
>>>
>>> If the job completed we need to be able to rollback everything and
>>> continue as if nothing had happened.
>>>
>>> Christian.
>> Oh, but this piece of functionality - skipping HW ASIC reset in case the
>> guilty job done is totally missing form this patch series and so needs
>> to be added. So what you say actually is that for the case were we skip
>> HW asic reset because the guilty job did complete we also need to skip
>> resubmitting the jobs in drm_sched_resubmit_jobs and hence preserve the
>> old parent fence pointer for reuse ? If so I would like to add all this
>> functionality as a third patch since the first 2 patches are more about
>> resolving race condition with jobs in flight while doing reset - what do
>> you think ?
> Yeah, sounds perfectly fine to me.
>
> Christian.

I realized there is another complication now for XGMI hive use case, we 
currently skip gpu recover for adev in case another gpu recover for 
different adev in same hive is running, under the assumption that we are 
going to reset all devices in hive anyway because that should cover our 
own dev too. But if we chose to skip now HW asic reset if our guilty job 
did finish we will aslo not HW reset any other devices in the hive even 
if one of them might actually had a bad job, wanted to do gpu recover 
but skipped it because our recover was in progress in that time.
My general idea on that is to keep a list of guilty jobs per hive, when 
you start gpu recover you first add you guilty job to the hive and 
trylock hive->reset_lock. Any owner of hive->reset_lock (gpu recovery in 
progress) once he finished his recovery and released hive->reset_lock 
should go over hive->bad_jobs_list and if at least one of them is still 
not signaled (not done) trigger another gpu recovery and so on. If you 
do manage to trylock you also go over the list, clean it and perform 
recovery. This list probably needs to be protected with per hive lock.
I also think we can for now at least finish reviewing the first 2 
patches and submit them since as I said before they are not dealing with 
this topic and fixing existing race conditions. If you are OK with that 
I can send for review the last iteration of the first 2 patches where I 
wait for the last fence in ring mirror list.

Andrey

>
>> Andrey
> Andrey
>
>>> ___
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


pitch alignment questions

2019-01-11 Thread Yu Zhao
On Fri, Jan 11, 2019 at 04:27:44PM +0100, Michel Dänzer wrote:
> On 2019-01-10 6:56 p.m., Przemek Socha wrote:
> >
> > [  147.846148] [drm:amdgpu_display_user_framebuffer_create [amdgpu]] 
> > Invalid 
> > pitch: expecting 10752 but got 10624
> > [  147.846155] [drm:drm_internal_framebuffer_create] could not create 
> > framebuffer"
> 
> Thanks, this confirms that the check is too strict. I've sent a patch
> reverting this as well.
> 
> 
> Yu, I like the idea behind your changes, but unfortunately it's more
> complicated than that. If you want to work on similar checks which
> accurately reflect the hardware constraints, people on the amd-gfx list
> should be able to help with that.

Hi Michel, sorry for the troubles.

Background: after we turned on iommu with amd_iommu=force_isolation,
we saw io page faults from amd gpu (stoney ridge). We tracked it
down to userspace using 32-pixel pitch alignment, which seems smaller
than the minimum alignment supported by the hw. Instead of rejecting
the alignment, we suspect, it uses 64-pixel alignment to do dma. The
larger alignment sometimes causes out of bound memory accesses, thus
the io page faults.

I created the following patch and hoped it could fix the problem:
https://lore.kernel.org/patchwork/patch/1029656/
Well, it does on our stoney ridge based chromebook but also breaks
other platforms.

So my questions to the amd gpu experts here:
1) how do properly validate pitch alignment passed to kernel space?
2) if it's not easy, what would invalid alignment cause at worst?

Thank you.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/3] drm/amdgpu: enable IH ring 1 and ring 2 v3

2019-01-11 Thread Alex Deucher
On Wed, Jan 9, 2019 at 8:12 AM Christian König
 wrote:
>
> The entries are ignored for now, but it at least stops crashing the
> hardware when somebody tries to push something to the other IH rings.
>
> v2: limit ring size, add TODO comment
> v3: only program rings if they are actually allocated
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h |   4 +-
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c  | 143 
>  2 files changed, 121 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> index f6ce171cb8aa..7e06fa64321a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> @@ -87,8 +87,8 @@ struct amdgpu_irq {
> /* status, etc. */
> boolmsi_enabled; /* msi enabled */
>
> -   /* interrupt ring */
> -   struct amdgpu_ih_ring   ih;
> +   /* interrupt rings */
> +   struct amdgpu_ih_ring   ih, ih1, ih2;
> const struct amdgpu_ih_funcs*ih_funcs;
>
> /* gen irq stuff */
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c 
> b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index 562701939d3e..eea5530d2961 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -50,6 +50,22 @@ static void vega10_ih_enable_interrupts(struct 
> amdgpu_device *adev)
> ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 1);
> WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
> adev->irq.ih.enabled = true;
> +
> +   if (adev->irq.ih1.ring_size) {
> +   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
> +  RB_ENABLE, 1);
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
> +   adev->irq.ih1.enabled = true;
> +   }
> +
> +   if (adev->irq.ih2.ring_size) {
> +   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
> +  RB_ENABLE, 1);
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
> +   adev->irq.ih2.enabled = true;
> +   }
>  }
>
>  /**
> @@ -71,6 +87,53 @@ static void vega10_ih_disable_interrupts(struct 
> amdgpu_device *adev)
> WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR, 0);
> adev->irq.ih.enabled = false;
> adev->irq.ih.rptr = 0;
> +
> +   if (adev->irq.ih1.ring_size) {
> +   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
> +  RB_ENABLE, 0);
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
> +   /* set rptr, wptr to 0 */
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING1, 0);
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING1, 0);
> +   adev->irq.ih1.enabled = false;
> +   adev->irq.ih1.rptr = 0;
> +   }
> +
> +   if (adev->irq.ih2.ring_size) {
> +   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
> +  RB_ENABLE, 0);
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
> +   /* set rptr, wptr to 0 */
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING2, 0);
> +   WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING2, 0);
> +   adev->irq.ih2.enabled = false;
> +   adev->irq.ih2.rptr = 0;
> +   }
> +}
> +
> +static uint32_t vega10_ih_rb_cntl(struct amdgpu_ih_ring *ih, uint32_t 
> ih_rb_cntl)
> +{
> +   int rb_bufsz = order_base_2(ih->ring_size / 4);
> +
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
> +  MC_SPACE, ih->use_bus_addr ? 1 : 4);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
> +  WPTR_OVERFLOW_CLEAR, 1);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
> +  WPTR_OVERFLOW_ENABLE, 1);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_SIZE, rb_bufsz);
> +   /* Ring Buffer write pointer writeback. If enabled, IH_RB_WPTR 
> register
> +* value is written to memory
> +*/
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
> +  WPTR_WRITEBACK_ENABLE, 1);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_SNOOP, 1);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_RO, 0);
> +   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, 

Re: [PATCH] drm/ttm: stop always moving BOs on the LRU on page fault

2019-01-11 Thread Christian König

Am 11.01.19 um 15:17 schrieb Michel Dänzer:

On 2019-01-11 2:15 p.m., Christian König wrote:

Move the BO on the LRU only when it is actually moved by a DMA
operation.

[...]

@@ -177,6 +175,13 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
ret = VM_FAULT_SIGBUS;
goto out_unlock;
}
+
+   if (bo->moving != moving) {

Hmm, could a driver just update the existing fence instead of attaching
a new one?


Mhm, not as far as I know. That would violate similar checks elsewhere.

Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v5 1/2] drm/sched: Refactor ring mirror list handling.

2019-01-11 Thread Koenig, Christian
Am 11.01.19 um 16:37 schrieb Grodzovsky, Andrey:
>
> On 01/11/2019 04:42 AM, Koenig, Christian wrote:
>> Am 10.01.19 um 16:56 schrieb Grodzovsky, Andrey:
>>> [SNIP]
>> But we will not be adding the cb back in drm_sched_stop anymore, now we
>> are only going to add back the cb in drm_sched_startr after rerunning
>> those jobs in drm_sched_resubmit_jobs and assign them a new parent
>> there
>> anyway.
> Yeah, but when we find that we don't need to reset anything anymore
> then adding the callbacks again won't be possible any more.
>
> Christian.
 I am not sure I understand it, can u point me to example of how this
 will happen ? I am attaching my latest patches with waiting only for
 the last job's fence here just so we are on same page regarding the code.
>> Well the whole idea is to prepare all schedulers, then check once more
>> if the offending job hasn't completed in the meantime.
>>
>> If the job completed we need to be able to rollback everything and
>> continue as if nothing had happened.
>>
>> Christian.
> Oh, but this piece of functionality - skipping HW ASIC reset in case the
> guilty job done is totally missing form this patch series and so needs
> to be added. So what you say actually is that for the case were we skip
> HW asic reset because the guilty job did complete we also need to skip
> resubmitting the jobs in drm_sched_resubmit_jobs and hence preserve the
> old parent fence pointer for reuse ? If so I would like to add all this
> functionality as a third patch since the first 2 patches are more about
> resolving race condition with jobs in flight while doing reset - what do
> you think ?

Yeah, sounds perfectly fine to me.

Christian.

>
> Andrey
 Andrey

>> ___
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/display: Fully remove i2caux folder

2019-01-11 Thread sunpeng.li
From: Leo Li 

This is a follow up to:
e28e1490794d ("drm/amd/display: Remove i2caux folder")

Some files were still left, so delete all of them.

CC: David Francis 
CC: Harry Wentland 
Signed-off-by: Leo Li 
---
 drivers/gpu/drm/amd/display/dc/i2caux/Makefile |  99 
 .../dc/i2caux/dce110/i2c_hw_engine_dce110.c| 573 -
 .../amd/display/dc/i2caux/dce112/i2caux_dce112.c   | 129 -
 .../amd/display/dc/i2caux/dce112/i2caux_dce112.h   |  32 --
 .../display/dc/i2caux/dce80/i2c_sw_engine_dce80.c  | 163 --
 drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c | 491 --
 6 files changed, 1487 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/display/dc/i2caux/Makefile
 delete mode 100644 
drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.c
 delete mode 100644 drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.c
 delete mode 100644 drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.h
 delete mode 100644 
drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.c
 delete mode 100644 drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c

diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/Makefile 
b/drivers/gpu/drm/amd/display/dc/i2caux/Makefile
deleted file mode 100644
index 352885c..000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/Makefile
+++ /dev/null
@@ -1,99 +0,0 @@
-#
-# Copyright 2017 Advanced Micro Devices, Inc.
-#
-# Permission is hereby granted, free of charge, to any person obtaining a
-# copy of this software and associated documentation files (the "Software"),
-# to deal in the Software without restriction, including without limitation
-# the rights to use, copy, modify, merge, publish, distribute, sublicense,
-# and/or sell copies of the Software, and to permit persons to whom the
-# Software is furnished to do so, subject to the following conditions:
-#
-# The above copyright notice and this permission notice shall be included in
-# all copies or substantial portions of the Software.
-#
-# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
-# THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
-# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
-# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
-# OTHER DEALINGS IN THE SOFTWARE.
-#
-#
-# Makefile for the 'i2c' sub-component of DAL.
-# It provides the control and status of HW i2c engine of the adapter.
-
-I2CAUX = aux_engine.o engine_base.o i2caux.o i2c_engine.o \
-i2c_generic_hw_engine.o i2c_hw_engine.o i2c_sw_engine.o
-
-AMD_DAL_I2CAUX = $(addprefix $(AMDDALPATH)/dc/i2caux/,$(I2CAUX))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX)
-
-###
-# DCE 8x family
-###
-I2CAUX_DCE80 = i2caux_dce80.o i2c_hw_engine_dce80.o \
-   i2c_sw_engine_dce80.o
-
-AMD_DAL_I2CAUX_DCE80 = $(addprefix 
$(AMDDALPATH)/dc/i2caux/dce80/,$(I2CAUX_DCE80))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE80)
-
-###
-# DCE 100 family
-###
-I2CAUX_DCE100 = i2caux_dce100.o
-
-AMD_DAL_I2CAUX_DCE100 = $(addprefix 
$(AMDDALPATH)/dc/i2caux/dce100/,$(I2CAUX_DCE100))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE100)
-
-###
-# DCE 110 family
-###
-I2CAUX_DCE110 = i2caux_dce110.o i2c_sw_engine_dce110.o i2c_hw_engine_dce110.o \
-   aux_engine_dce110.o
-
-AMD_DAL_I2CAUX_DCE110 = $(addprefix 
$(AMDDALPATH)/dc/i2caux/dce110/,$(I2CAUX_DCE110))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE110)
-
-###
-# DCE 112 family
-###
-I2CAUX_DCE112 = i2caux_dce112.o
-
-AMD_DAL_I2CAUX_DCE112 = $(addprefix 
$(AMDDALPATH)/dc/i2caux/dce112/,$(I2CAUX_DCE112))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE112)
-
-###
-# DCN 1.0 family
-###
-ifdef CONFIG_DRM_AMD_DC_DCN1_0
-I2CAUX_DCN1 = i2caux_dcn10.o
-
-AMD_DAL_I2CAUX_DCN1 = $(addprefix 
$(AMDDALPATH)/dc/i2caux/dcn10/,$(I2CAUX_DCN1))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCN1)
-endif
-
-###
-# DCE 120 family
-###
-I2CAUX_DCE120 = 

Re: [PATCH v5 1/2] drm/sched: Refactor ring mirror list handling.

2019-01-11 Thread Grodzovsky, Andrey


On 01/11/2019 04:42 AM, Koenig, Christian wrote:
> Am 10.01.19 um 16:56 schrieb Grodzovsky, Andrey:
>> [SNIP]
> But we will not be adding the cb back in drm_sched_stop anymore, now we
> are only going to add back the cb in drm_sched_startr after rerunning
> those jobs in drm_sched_resubmit_jobs and assign them a new parent
> there
> anyway.
 Yeah, but when we find that we don't need to reset anything anymore
 then adding the callbacks again won't be possible any more.

 Christian.
>>> I am not sure I understand it, can u point me to example of how this
>>> will happen ? I am attaching my latest patches with waiting only for
>>> the last job's fence here just so we are on same page regarding the code.
> Well the whole idea is to prepare all schedulers, then check once more
> if the offending job hasn't completed in the meantime.
>
> If the job completed we need to be able to rollback everything and
> continue as if nothing had happened.
>
> Christian.

Oh, but this piece of functionality - skipping HW ASIC reset in case the 
guilty job done is totally missing form this patch series and so needs 
to be added. So what you say actually is that for the case were we skip 
HW asic reset because the guilty job did complete we also need to skip  
resubmitting the jobs in drm_sched_resubmit_jobs and hence preserve the 
old parent fence pointer for reuse ? If so I would like to add all this 
functionality as a third patch since the first 2 patches are more about 
resolving race condition with jobs in flight while doing reset - what do 
you think ?

Andrey
>
>>> Andrey
>>>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: BISECTED- amd-staging-drm-next, xorg-server segfault A6-6310 APU - R4 Mullins.

2019-01-11 Thread Michel Dänzer
On 2019-01-10 6:56 p.m., Przemek Socha wrote:
>
> [  147.846148] [drm:amdgpu_display_user_framebuffer_create [amdgpu]] Invalid 
> pitch: expecting 10752 but got 10624
> [  147.846155] [drm:drm_internal_framebuffer_create] could not create 
> framebuffer"

Thanks, this confirms that the check is too strict. I've sent a patch
reverting this as well.


Yu, I like the idea behind your changes, but unfortunately it's more
complicated than that. If you want to work on similar checks which
accurately reflect the hardware constraints, people on the amd-gfx list
should be able to help with that.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 70a816dd8b4d..99b646c16311 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -537,8 +537,11 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev,
 
 	pitch = amdgpu_align_pitch(adev, pitch, cpp, false);
 	if (mode_cmd->pitches[0] != pitch) {
-		DRM_DEBUG_KMS("Invalid pitch: expecting %d but got %d\n",
-			  pitch, mode_cmd->pitches[0]);
+		struct drm_format_name_buf format_name;
+
+		DRM_ERROR("Invalid pitch: expecting %d but got %d, format %s => cpp=%d\n",
+			  pitch, mode_cmd->pitches[0],
+			  drm_get_format_name(mode_cmd->pixel_format, _name), cpp);
 		return ERR_PTR(-EINVAL);
 	}
 


signature.asc
Description: OpenPGP digital signature
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] Revert "drm/amdgpu: validate user pitch alignment"

2019-01-11 Thread Deucher, Alexander
Reviewed-by: Alex Deucher 


From: amd-gfx  on behalf of Michel 
Dänzer 
Sent: Friday, January 11, 2019 10:22:07 AM
To: amd-gfx@lists.freedesktop.org
Cc: Yu Zhao
Subject: [PATCH] Revert "drm/amdgpu: validate user pitch alignment"

From: Michel Dänzer 

The check turned out to be too strict in some cases.

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 70a816dd8b4d..4e944737b708 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -531,16 +531,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device 
*dev,
 struct drm_gem_object *obj;
 struct amdgpu_framebuffer *amdgpu_fb;
 int ret;
-   struct amdgpu_device *adev = dev->dev_private;
-   int cpp = drm_format_plane_cpp(mode_cmd->pixel_format, 0);
-   int pitch = mode_cmd->pitches[0] / cpp;
-
-   pitch = amdgpu_align_pitch(adev, pitch, cpp, false);
-   if (mode_cmd->pitches[0] != pitch) {
-   DRM_DEBUG_KMS("Invalid pitch: expecting %d but got %d\n",
- pitch, mode_cmd->pitches[0]);
-   return ERR_PTR(-EINVAL);
-   }

 obj = drm_gem_object_lookup(file_priv, mode_cmd->handles[0]);
 if (obj ==  NULL) {
--
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] Revert "drm/amdgpu: validate user pitch alignment"

2019-01-11 Thread Michel Dänzer
From: Michel Dänzer 

The check turned out to be too strict in some cases.

Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 70a816dd8b4d..4e944737b708 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -531,16 +531,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device 
*dev,
struct drm_gem_object *obj;
struct amdgpu_framebuffer *amdgpu_fb;
int ret;
-   struct amdgpu_device *adev = dev->dev_private;
-   int cpp = drm_format_plane_cpp(mode_cmd->pixel_format, 0);
-   int pitch = mode_cmd->pitches[0] / cpp;
-
-   pitch = amdgpu_align_pitch(adev, pitch, cpp, false);
-   if (mode_cmd->pitches[0] != pitch) {
-   DRM_DEBUG_KMS("Invalid pitch: expecting %d but got %d\n",
- pitch, mode_cmd->pitches[0]);
-   return ERR_PTR(-EINVAL);
-   }
 
obj = drm_gem_object_lookup(file_priv, mode_cmd->handles[0]);
if (obj ==  NULL) {
-- 
2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/display: Remove error message in stream create routine

2019-01-11 Thread Zuo, Jerry
OK. Got you. Thanks.

-Original Message-
From: Lyude Paul  
Sent: January 10, 2019 6:07 PM
To: Zuo, Jerry ; Wentland, Harry 
Subject: Re: [PATCH] drm/amd/display: Remove error message in stream create 
routine

On Thu, 2019-01-10 at 22:17 +, Zuo, Jerry wrote:
> Thanks for your comments. The failure in creating stream for sink 
> could be in the following two cases:
> 
> In hotplug scenario, when hotplug back to the same port, first tries 
> to check old connector which gets failed, and then go through the new 
> added connector without issue.
> 
> In S3 topology change, the new sink will not get created in the resume 
> routine that leads to the failure in creating stream. The sink and 
> stream will eventually get created after the resume when triggering hotplug 
> event.
> 
> Both sequence seems to be normal in MST processing. For the hotplug, 
> not quite clear why the old connector still needs to check. Seems only 
> new connector needs that. Please correct me if I have misunderstanding 
> on this part.
I don't think the old connector needs a check, that's likely just a sideaffect 
of the connector not being unreferenced entirely yet by userspace/the kernel- 
but it should be safe to ignore.

Anyway- if you replace "In S3 topology change" with "When an MST topology 
connected to a port is replaced with a different topology while the system is 
suspended", then feel free to add my R-B

Reviewed-by: Lyude Paul 
> 
> Jerry
> 
> -Original Message-
> From: Lyude Paul 
> Sent: January 10, 2019 4:33 PM
> To: Zuo, Jerry ; amd-gfx@lists.freedesktop.org; 
> Wentland, Harry 
> Subject: Re: [PATCH] drm/amd/display: Remove error message in stream 
> create routine
> 
> On Thu, 2019-01-10 at 16:18 -0500, Jerry (Fangzhi) Zuo wrote:
> > MST has its unique sequence that may get failed due to no stream in 
> > atomic check in some cases, e.g., hotplug, S3 resume.
> 
> You might want to elaborate more here, e.g. what's the actual reason 
> for MST not having streams in the case of a hotplug or S3 resume? Have 
> they not been created yet, etc. etc.
> > Remove the ERROR message in the stream create routine, and leave its 
> > caller to decide if additional action is required.
> > 
> > Signed-off-by: Jerry (Fangzhi) Zuo 
> > ---
> >  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 53c49c901a6a..a9ed225d2ae0 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -3052,10 +3052,8 @@ create_stream_for_sink(struct 
> > amdgpu_dm_connector *aconnector,
> >  
> > stream = dc_create_stream_for_sink(sink);
> >  
> > -   if (stream == NULL) {
> > -   DRM_ERROR("Failed to create stream for sink!\n");
> > +   if (stream == NULL)
> > goto finish;
> > -   }
> >  
> > stream->dm_stream_context = aconnector;
> >  
> --
> Cheers,
>   Lyude Paul
> 
--
Cheers,
Lyude Paul

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: Only get the connector state for VRR when toggled

2019-01-11 Thread Li, Sun peng (Leo)


On 2019-01-10 3:12 p.m., Nicholas Kazlauskas wrote:
> [Why]
> This fixes a stuttering issue that occurs when moving a hardware cursor
> when VRR is enabled.
> 
> Previously when VRR is enabled atomic check will grab the connector
> state for every atomic update. This has to lock the connector in order
> to do so. The locking is bad enough by itself for performance, but
> it gets worse with what we do just below that - add all the planes
> for the CRTC to the commit.
> 
> This prevents the cursor fast path from working - there's more than one
> plane now. With state->allow_modeset = true on top of this, it also
> adds and removes all the planes from the DC context triggering a full
> (very slow) update in DC.
> 
> [How]
> We need the connector state to get the VRR min/max capbilities, but we
> only need them when there's a CRTC mode change or when VRR is toggled.
> 
> The condition has been updated accordingly.
> 
> Fixes: 3cc22f281318 ("drm/amdgpu: Set FreeSync state using drm VRR 
> properties")
> 
> Cc: Harry Wentland 
> Cc: Leo Li 
> Signed-off-by: Nicholas Kazlauskas 

Reviewed-by: Leo Li 

> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 4839d2dc..b7867ec166f3 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -6083,7 +6083,7 @@ static int amdgpu_dm_atomic_check(struct drm_device 
> *dev,
>   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
> new_crtc_state, i) {
>   if (!drm_atomic_crtc_needs_modeset(new_crtc_state) &&
>   !new_crtc_state->color_mgmt_changed &&
> - !new_crtc_state->vrr_enabled)
> + old_crtc_state->vrr_enabled == new_crtc_state->vrr_enabled)
>   continue;
>   
>   if (!new_crtc_state->enable)
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: stop always moving BOs on the LRU on page fault

2019-01-11 Thread Michel Dänzer
On 2019-01-11 2:15 p.m., Christian König wrote:
> Move the BO on the LRU only when it is actually moved by a DMA
> operation.
> 
> [...]
> 
> @@ -177,6 +175,13 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>   ret = VM_FAULT_SIGBUS;
>   goto out_unlock;
>   }
> +
> + if (bo->moving != moving) {

Hmm, could a driver just update the existing fence instead of attaching
a new one?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: stop always moving BOs on the LRU on page fault

2019-01-11 Thread Michel Dänzer
On 2019-01-11 2:15 p.m., Christian König wrote:
> Move the BO on the LRU only when it is actually moved by a DMA
> operation.
> 
> Signed-off-by: Christian König 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/ttm: stop always moving BOs on the LRU on page fault

2019-01-11 Thread Christian König
Move the BO on the LRU only when it is actually moved by a DMA
operation.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index a1d977fbade5..e86a29a1e51f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -71,7 +71,7 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct 
ttm_buffer_object *bo,
ttm_bo_get(bo);
up_read(>vma->vm_mm->mmap_sem);
(void) dma_fence_wait(bo->moving, true);
-   ttm_bo_unreserve(bo);
+   reservation_object_unlock(bo->resv);
ttm_bo_put(bo);
goto out_unlock;
}
@@ -131,11 +131,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 * for reserve, and if it fails, retry the fault after waiting
 * for the buffer to become unreserved.
 */
-   err = ttm_bo_reserve(bo, true, true, NULL);
-   if (unlikely(err != 0)) {
-   if (err != -EBUSY)
-   return VM_FAULT_NOPAGE;
-
+   if (unlikely(!reservation_object_trylock(bo->resv))) {
if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
ttm_bo_get(bo);
@@ -165,6 +161,8 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
}
 
if (bdev->driver->fault_reserve_notify) {
+   struct dma_fence *moving = dma_fence_get(bo->moving);
+
err = bdev->driver->fault_reserve_notify(bo);
switch (err) {
case 0:
@@ -177,6 +175,13 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
ret = VM_FAULT_SIGBUS;
goto out_unlock;
}
+
+   if (bo->moving != moving) {
+   spin_lock(>glob->lru_lock);
+   ttm_bo_move_to_lru_tail(bo, NULL);
+   spin_unlock(>glob->lru_lock);
+   }
+   dma_fence_put(moving);
}
 
/*
@@ -291,7 +296,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 out_io_unlock:
ttm_mem_io_unlock(man);
 out_unlock:
-   ttm_bo_unreserve(bo);
+   reservation_object_unlock(bo->resv);
return ret;
 }
 
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v5 1/2] drm/sched: Refactor ring mirror list handling.

2019-01-11 Thread Koenig, Christian
Am 10.01.19 um 16:56 schrieb Grodzovsky, Andrey:
> [SNIP]
 But we will not be adding the cb back in drm_sched_stop anymore, now we
 are only going to add back the cb in drm_sched_startr after rerunning
 those jobs in drm_sched_resubmit_jobs and assign them a new parent
 there
 anyway.
>>> Yeah, but when we find that we don't need to reset anything anymore
>>> then adding the callbacks again won't be possible any more.
>>>
>>> Christian.
>> I am not sure I understand it, can u point me to example of how this
>> will happen ? I am attaching my latest patches with waiting only for
>> the last job's fence here just so we are on same page regarding the code.

Well the whole idea is to prepare all schedulers, then check once more 
if the offending job hasn't completed in the meantime.

If the job completed we need to be able to rollback everything and 
continue as if nothing had happened.

Christian.

>>
>> Andrey
>>

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx