RE: [PATCH 2/2] drm/amdgpu: remove memory training p2c buffer reservation

2019-12-17 Thread Chen, Guchun
[AMD Official Use Only - Internal Distribution Only] -----Original Message----- From: amd-gfx On Behalf Of Tianci Yin Sent: Tuesday, December 17, 2019 7:23 PM To: amd-gfx@lists.freedesktop.org Cc: Long, Gang; Yin, Tianci (Rico); Xu, Feifei; Wang, Kevin(Yang); Tuikov, Luben; Deucher,

Re: [pull] amdgpu, amdkfd, radeon drm-next-5.6

2019-12-17 Thread Daniel Vetter
On Wed, Dec 11, 2019 at 05:30:20PM -0500, Alex Deucher wrote: > Hi Dave, Daniel, > > Kicking off 5.6 with new stuff from AMD. There is a UAPI addition. We > added a new firmware for display, and this just adds the version query > to our existing firmware query interface. UMDs like mesa use

[PATCH 2/2] drm/amdgpu: remove memory training p2c buffer reservation

2019-12-17 Thread Tianci Yin
From: "Tianci.Yin" The IP discovery TMR (which occupies the top of VRAM, with size DISCOVERY_TMR_SIZE) has been reserved, and the p2c buffer lies within this TMR, so the p2c buffer reservation is unnecessary. Change-Id: Ib1f2f2b4a1f3869c03ffe22e2836cdbee17ba99f Signed-off-by: Tianci.Yin ---

[PATCH 1/2] drm/amdgpu: update the method to get fb_loc of memory training

2019-12-17 Thread Tianci Yin
From: "Tianci.Yin" The method of getting fb_loc changed from parsing the VBIOS to taking a certain offset from the top of VRAM. Change-Id: I053b42fdb1d822722fa7980b2cd9f86b3fdce539 --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +- .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 36

Re: [pull] amdgpu, amdkfd, radeon drm-next-5.6

2019-12-17 Thread Alex Deucher
On Tue, Dec 17, 2019 at 7:52 AM Daniel Vetter wrote: > > On Wed, Dec 11, 2019 at 05:30:20PM -0500, Alex Deucher wrote: > > Hi Dave, Daniel, > > > > Kicking off 5.6 with new stuff from AMD. There is a UAPI addition. We > > added a new firmware for display, and this just adds the version query >

[PATCH] drm/amdgpu: move umc offset to one new header file for Arcturus

2019-12-17 Thread Guchun Chen
Fixes: 9686563c4c42 drm/amdgpu: Added RAS UMC error query support for Arcturus Code refactor with no functional change. Signed-off-by: Guchun Chen --- drivers/gpu/drm/amd/amdgpu/umc_v6_1.c | 17 +- .../include/asic_reg/umc/umc_6_1_2_offset.h | 32 +++ 2 files

RE: [PATCH 1/2] drm/amdgpu: fix double gpu_recovery for NV of SRIOV

2019-12-17 Thread Deng, Emily
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Emily Deng >-----Original Message----- >From: amd-gfx On Behalf Of Monk Liu >Sent: Tuesday, December 17, 2019 6:20 PM >To: amd-gfx@lists.freedesktop.org >Cc: Liu, Monk >Subject: [PATCH 1/2] drm/amdgpu: fix double gpu_recovery

[PATCH] drm/amdgpu: fix KIQ ring test fail in TDR

2019-12-17 Thread Monk Liu
issues: there are two issues that may lead to TDR failure for SRIOV: 1) gpu_recover() is re-entered by the mailbox interrupt handler in mxgpu_nv.c 2) MEC is ruined by amdkfd_pre_reset() after VF FLR is done fix: for 1) we need to bypass the gpu_recover() invocation in the mailbox interrupt as long as the timeout is

[PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

2019-12-17 Thread Monk Liu
issues: MEC is ruined by amdkfd_pre_reset() after VF FLR is done fix: amdkfd_pre_reset() would ruin MEC after the hypervisor finishes the VF FLR; the correct sequence is to do amdkfd_pre_reset() before VF FLR, but there is a limitation blocking this sequence: if we do pre_reset() before VF FLR, it would go

[PATCH 1/2] drm/amdgpu: fix double gpu_recovery for NV of SRIOV

2019-12-17 Thread Monk Liu
issues: gpu_recover() is re-entered by the mailbox interrupt handler in mxgpu_nv.c fix: we need to bypass the gpu_recover() invocation in the mailbox interrupt as long as the timeout is not infinite (thus the TDR will be triggered automatically after the timeout, with no need to invoke gpu_recover() through the mailbox
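
The decision described in this patch summary can be sketched in plain C. This is only an illustration under stated assumptions: demo_mailbox_should_recover() and DEMO_TIMEOUT_INFINITE are hypothetical names, not code from mxgpu_nv.c, and "infinite" is modelled as LONG_MAX, the value the kernel uses for MAX_SCHEDULE_TIMEOUT.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption: an "infinite" job timeout is modelled as LONG_MAX,
 * the value the kernel uses for MAX_SCHEDULE_TIMEOUT. */
#define DEMO_TIMEOUT_INFINITE LONG_MAX

/*
 * Hypothetical helper, not a function from mxgpu_nv.c.
 * The mailbox handler should start recovery itself only when no job
 * timeout will ever fire. With a finite timeout, TDR triggers recovery
 * on its own, so a second call from the mailbox would re-enter it.
 */
static bool demo_mailbox_should_recover(long job_timeout)
{
	return job_timeout == DEMO_TIMEOUT_INFINITE;
}
```

With a finite timeout the helper returns false and recovery is left to TDR; only the infinite case falls back to the mailbox path.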

RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

2019-12-17 Thread Deng, Emily
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Emily Deng >-----Original Message----- >From: amd-gfx On Behalf Of Monk Liu >Sent: Tuesday, December 17, 2019 6:20 PM >To: amd-gfx@lists.freedesktop.org >Cc: Liu, Monk >Subject: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in

RE: [PATCH v2 1/5] drm/amdgpu: reverts commit b01245ff54db66073b104ac9d9fbefb7b264b36d.

2019-12-17 Thread Ma, Le
[AMD Official Use Only - Internal Distribution Only] Hi Andrey, Please check the 3 minor comments in this patch. With those addressed, the V2 series is Reviewed-by: Le Ma Regards, Ma Le -----Original Message----- From: Andrey Grodzovsky Sent: Saturday, December

Re: [PATCH v3] drm/amd/display: Fix AppleDongle can't be detected

2019-12-17 Thread Harry Wentland
On 2019-12-11 2:33 a.m., Louis Li wrote: > [Why] > An external monitor cannot be displayed consistently if connected > via this Apple dongle (A1621, USB Type-C to HDMI). > Experiments show that the dongle needs at least 200ms to be ready > for communication after it drives the HPD signal high, and

Re: [PATCH 1/3] drm/amdgpu: wait for all rings to drain before runtime suspending

2019-12-17 Thread Andrey Grodzovsky
Reviewed-by: Andrey Grodzovsky Andrey On 12/16/19 12:18 PM, Alex Deucher wrote: Add a safety check to runtime suspend to make sure all outstanding fences have signaled before we suspend. Doesn't fix any known issue. We already do this via the fence driver suspend function, but we just force

Re: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

2019-12-17 Thread shaoyunl
I think the amdkfd side depends on this call to stop the user queue; without this call, the user queue can submit to HW during the reset, which could cause a hang again ... Do we know the root cause of why this function would ruin MEC? From the logic, I think this function should be called before FLR

Re: [pull] amdgpu, amdkfd, radeon drm-next-5.6

2019-12-17 Thread Alex Deucher
On Tue, Dec 17, 2019 at 8:47 AM Alex Deucher wrote: > > On Tue, Dec 17, 2019 at 7:52 AM Daniel Vetter wrote: > > > > On Wed, Dec 11, 2019 at 05:30:20PM -0500, Alex Deucher wrote: > > > Hi Dave, Daniel, > > > > > > Kicking off 5.6 with new stuff from AMD. There is a UAPI addition. We > > >

[PATCH 3/5] drm/amdgpu/smu: add metrics table lock for navi

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c

[PATCH 4/5] drm/amdgpu/smu: add metrics table lock for renoir

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c

[PATCH 1/5] drm/amdgpu/smu: add metrics table lock

2019-12-17 Thread Alex Deucher
This table is used for lots of things, add its own lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 1 + drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 + 2 files changed, 2 insertions(+) diff

[PATCH 2/5] drm/amdgpu/smu: add metrics table lock for arcturus

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c

[PATCH 5/5] drm/amdgpu/smu: add metrics table lock for vega20

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/vega20_ppt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c

Re: [PATCH] drm/amdgpu: move umc offset to one new header file for Arcturus

2019-12-17 Thread Deucher, Alexander
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Alex Deucher From: Chen, Guchun Sent: Tuesday, December 17, 2019 4:08 AM To: Clements, John; Zhang, Hawking; Deucher, Alexander; amd-gfx@lists.freedesktop.org Cc: Chen, Guchun Subject:

Re: [pull] amdgpu, amdkfd, radeon drm-next-5.6

2019-12-17 Thread Daniel Vetter
On Tue, Dec 17, 2019 at 09:17:51AM -0500, Alex Deucher wrote: > On Tue, Dec 17, 2019 at 8:47 AM Alex Deucher wrote: > > > > On Tue, Dec 17, 2019 at 7:52 AM Daniel Vetter wrote: > > > > > > On Wed, Dec 11, 2019 at 05:30:20PM -0500, Alex Deucher wrote: > > > > Hi Dave, Daniel, > > > > > > > >

Re: [CI-NOTIFY]: TCWG Bisect tcwg_kernel/llvm-release-aarch64-next-allmodconfig - Build # 48 - Successful!

2019-12-17 Thread Nick Desaulniers
Bhawanpreet, I suspect you're missing the header to include udelay in drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp2_execution.c. Can you please send a fix for this? On Tue, Dec 17, 2019 at 7:07 AM wrote: > > Successfully identified regression in *linux* in CI configuration >

Re: [PATCH 3/5] drm/amdgpu/smu: add metrics table lock for navi

2019-12-17 Thread Pierre-Eric Pelloux-Prayer
Hi Alex, Isn't this patch missing something like this: pr_info("Failed to export SMU metrics table!\n"); + mutex_unlock(&smu->metrics_lock); return ret; to release the lock in case of error? Regards, Pierre-Eric On 17/12/2019 15:55, Alex Deucher wrote: > To protect access to the
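
The point raised in this review, that every return path taken after acquiring the lock must release it, can be sketched in userspace C. This is a minimal model under stated assumptions, not the driver code: struct demo_smu, struct demo_lock, and all demo_* functions are hypothetical, and the lock is a simple flag that asserts on misuse rather than a kernel mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock model: a flag that asserts on double lock/unlock.
 * It stands in for a kernel mutex purely to make the discipline visible. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical stand-in for the SMU context; not the real structure. */
struct demo_smu {
	struct demo_lock metrics_lock;
	int metrics_value;
};

/* Hypothetical table refresh that can be made to fail on request. */
static int demo_update_table(struct demo_smu *smu, bool fail)
{
	if (fail)
		return -EIO;
	smu->metrics_value++;
	return 0;
}

/* Reads the metric under the lock; the error path unlocks before returning. */
static int demo_get_metrics(struct demo_smu *smu, bool fail, int *out)
{
	int ret;

	demo_lock_acquire(&smu->metrics_lock);
	ret = demo_update_table(smu, fail);
	if (ret) {
		/* The release the review asks for: without it the lock leaks. */
		demo_lock_release(&smu->metrics_lock);
		return ret;
	}
	*out = smu->metrics_value;
	demo_lock_release(&smu->metrics_lock);
	return 0;
}
```

If the release in the error branch is removed, the next call to demo_get_metrics() trips the assertion in demo_lock_acquire(), which is the model's equivalent of the deadlock a leaked mutex would cause.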

Re: [pull] amdgpu, amdkfd, radeon drm-next-5.6

2019-12-17 Thread Daniel Vetter
On Tue, Dec 17, 2019 at 12:41:06PM -0500, Alex Deucher wrote: > On Tue, Dec 17, 2019 at 11:46 AM Daniel Vetter wrote: > > > > On Tue, Dec 17, 2019 at 09:17:51AM -0500, Alex Deucher wrote: > > > On Tue, Dec 17, 2019 at 8:47 AM Alex Deucher > > > wrote: > > > > > > > > On Tue, Dec 17, 2019 at

Re: [PATCH 1/5] drm/amdgpu/smu: add metrics table lock

2019-12-17 Thread Wang, Kevin(Yang)
[AMD Official Use Only - Internal Distribution Only] The swSMU should add a metrics lock to protect the maintenance data of the metrics table. The patch series is Reviewed-by: Kevin Wang Best Regards, Kevin From: amd-gfx on behalf of Alex Deucher Sent:

[PATCH 2/2] drm/amdgpu: attempt xgmi perfmon re-arm on failed arm

2019-12-17 Thread Jonathan Kim
If the initial arm fails, the DF routines that arm xGMI performance monitoring will attempt to re-arm on both performance monitoring start and read. Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 153 --- 1 file changed, 117 insertions(+), 36 deletions(-)

[PATCH 1/2] drm/amdgpu: add perfmons accessible during df c-states

2019-12-17 Thread Jonathan Kim
During DF C-State, Perfmon counters outside of range 1D700-1D7FF will encounter SLVERR affecting xGMI performance monitoring. PerfmonCtr[7:4] is being added to avoid SLVERR during read since it falls within this range. PerfmonCtl[7:4] is being added in order to arm PerfmonCtr[7:4]. Since

Re: [pull] amdgpu, amdkfd, radeon drm-next-5.6

2019-12-17 Thread Alex Deucher
On Tue, Dec 17, 2019 at 11:46 AM Daniel Vetter wrote: > > On Tue, Dec 17, 2019 at 09:17:51AM -0500, Alex Deucher wrote: > > On Tue, Dec 17, 2019 at 8:47 AM Alex Deucher wrote: > > > > > > On Tue, Dec 17, 2019 at 7:52 AM Daniel Vetter wrote: > > > > > > > > On Wed, Dec 11, 2019 at 05:30:20PM

Re: [CI-NOTIFY]: TCWG Bisect tcwg_kernel/llvm-release-aarch64-next-allmodconfig - Build # 48 - Successful!

2019-12-17 Thread Nathan Chancellor
On Tue, Dec 17, 2019 at 09:19:37AM -0800, 'Nick Desaulniers' via Clang Built Linux wrote: > Bhawanpreet, I suspect you're missing the header to include udelay in > drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp2_execution.c. > Can you please send a fix for this? > arm allyesconfig is

[PATCH 2/2] drm/amdgpu/display: use msleep rather than udelay for HDCP

2019-12-17 Thread Alex Deucher
ARM has a 2000us limit for udelay. Switch to msleep. This code executes in a worker thread so shouldn't be an atomic context. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git

[PATCH 1/2] drm/amdgpu/display: include delay.h

2019-12-17 Thread Alex Deucher
For udelay. This is needed for some platforms. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c

Re: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

2019-12-17 Thread Felix Kuehling
I agree. Removing the call to pre-reset probably breaks GPU reset for KFD. We call the KFD suspend function in pre-reset, which uses the HIQ to stop any user mode queues still running. If that is not possible because the HIQ is hanging, it should fail with a timeout. There may be something we

Re: [PATCH 3/5] drm/amdgpu/smu: add metrics table lock for navi

2019-12-17 Thread Deucher, Alexander
[AMD Official Use Only - Internal Distribution Only] Yeah, they need some fixes. Alex From: Pelloux-prayer, Pierre-eric Sent: Tuesday, December 17, 2019 1:56 PM To: Alex Deucher; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: Re: [PATCH 3/5]

Re: [PATCH 1/2] drm/amdgpu/display: include delay.h

2019-12-17 Thread Kazlauskas, Nicholas
On 2019-12-17 3:47 p.m., Alex Deucher wrote: For udelay. This is needed for some platforms. Signed-off-by: Alex Deucher Reviewed-by: Nicholas Kazlauskas I wonder if it makes more sense to include this in os_types.h to avoid these errors in the future. Nicholas Kazlauskas ---

[PATCH v2 1/5] drm/amdgpu/smu: add metrics table lock

2019-12-17 Thread Alex Deucher
This table is used for lots of things, add its own lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 1 + drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 1 + 2 files changed, 2 insertions(+) diff

[PATCH v2 4/5] drm/amdgpu/smu: add metrics table lock for renoir (v2)

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git

[PATCH v2 3/5] drm/amdgpu/smu: add metrics table lock for navi (v2)

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c

[PATCH v2 2/5] drm/amdgpu/smu: add metrics table lock for arcturus (v2)

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c

[PATCH v2 5/5] drm/amdgpu/smu: add metrics table lock for vega20 (v2)

2019-12-17 Thread Alex Deucher
To protect access to the metrics table. v2: unlock on error Bug: https://gitlab.freedesktop.org/drm/amd/issues/900 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/vega20_ppt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c

Re: [PATCH 2/2] drm/amdgpu: attempt xgmi perfmon re-arm on failed arm

2019-12-17 Thread Felix Kuehling
On 2019-12-17 12:28, Jonathan Kim wrote: If the initial arm fails, the DF routines that arm xGMI performance monitoring will attempt to re-arm on both performance monitoring start and read. Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 153 --- 1 file

Re: [PATCH 2/2] drm/amdkfd: expose num_cp_queues data field to topology node

2019-12-17 Thread Felix Kuehling
See comment inline. Other than that, the series looks good to me. On 2019-12-16 2:02, Huang Rui wrote: The Thunk driver would like to know the num_cp_queues data; however, this data is ASIC-specific, so it is better to get it from the kfd driver. Signed-off-by: Huang Rui ---

Re: [PATCH v2 1/5] drm/amdgpu/smu: add metrics table lock

2019-12-17 Thread Wang, Kevin(Yang)
[AMD Official Use Only - Internal Distribution Only] The patch series is Reviewed-by: Kevin Wang From: amd-gfx on behalf of Alex Deucher Sent: Wednesday, December 18, 2019 5:45 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH

[PATCH] drm/amdgpu: no SMC firmware reloading for non-RAS baco reset

2019-12-17 Thread Evan Quan
For non-RAS baco reset, there is no need to reset the SMC. Thus the firmware reloading should be avoided. Change-Id: I73f6284541d0ca0e82761380a27e32484fb0061c Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 3 ++- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 14

[PATCH] drm/amdgpu: correctly report gpu recover status

2019-12-17 Thread Evan Quan
Knowing whether gpu recovery was performed successfully or not is important for our BACO development. Change-Id: I0e3ca4dcb65a053eb26bc55ad7431e4a42e160de Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git

RE: [PATCH v2 5/5] drm/amdgpu/smu: add metrics table lock for vega20 (v2)

2019-12-17 Thread Quan, Evan
It's fine with me to check them in as a temporary workaround. Series is reviewed-by: Evan Quan > -----Original Message----- > From: amd-gfx On Behalf Of Alex > Deucher > Sent: Wednesday, December 18, 2019 5:46 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: [PATCH

Re: [PATCH 1/2] drm/amdgpu: update the method to get fb_loc of memory training

2019-12-17 Thread Wang, Kevin(Yang)
[AMD Official Use Only - Internal Distribution Only] From: Tianci Yin Sent: Wednesday, December 18, 2019 10:21 AM To: amd-gfx@lists.freedesktop.org Cc: Tuikov, Luben; Koenig, Christian; Deucher, Alexander; Zhang, Hawking; Xu, Feifei; Yuan, Xiaojie;

[PATCH 1/2] drm/amdgpu: update the method to get fb_loc of memory training

2019-12-17 Thread Tianci Yin
From: "Tianci.Yin" The method of getting fb_loc changed from parsing the VBIOS to taking a certain offset from the top of VRAM. Change-Id: I053b42fdb1d822722fa7980b2cd9f86b3fdce539 --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +- .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 36

[PATCH 2/2] drm/amdgpu: remove memory training p2c buffer reservation(V2)

2019-12-17 Thread Tianci Yin
From: "Tianci.Yin" The IP discovery TMR (which occupies the top of VRAM, with size DISCOVERY_TMR_SIZE) has been reserved, and the p2c buffer lies within this TMR, so the p2c buffer reservation is unnecessary. Change-Id: Ib1f2f2b4a1f3869c03ffe22e2836cdbee17ba99f Signed-off-by: Tianci.Yin ---

Re: [PATCH 2/2] drm/amdgpu: remove memory training p2c buffer reservation

2019-12-17 Thread Yin, Tianci (Rico)
Hi Guchun, Thanks very much for your suggestion. I will refine it and send it out later. Rico From: Chen, Guchun Sent: Tuesday, December 17, 2019 22:11 To: Yin, Tianci (Rico); amd-gfx@lists.freedesktop.org Cc: Long, Gang; Yin, Tianci (Rico); Xu, Feifei;