Re: [PATCH] drm/amdgpu: check ring type for secure IBs

2020-04-23 Thread Christian König
On 23.04.20 at 22:47, Alex Deucher wrote: We don't support secure operation on compute rings at the moment so reject them. Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drive

RE: [PATCH 8/8] drm/amdgpu: for nv12 always need smu ip

2020-04-23 Thread Quan, Evan
Please check whether this is needed also for the following code: if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT && !amdgpu_sriov_vf(adev)) amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block); Other than that, this one and patch1-

RE: [PATCH 7/8] drm/amdgpu: skip sysfs node not belong to one vf mode

2020-04-23 Thread Quan, Evan
-----Original Message----- From: amd-gfx On Behalf Of Monk Liu Sent: Thursday, April 23, 2020 3:02 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 7/8] drm/amdgpu: skip sysfs node not belong to one vf mode Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c

RE: [PATCH 6/8] drm/amdgpu: enable one vf mode for nv12

2020-04-23 Thread Quan, Evan
-----Original Message----- From: amd-gfx On Behalf Of Monk Liu Sent: Thursday, April 23, 2020 3:02 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 6/8] drm/amdgpu: enable one vf mode for nv12 Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 12 +++-

RE: [PATCH 5/8] drm/amdgpu: clear the messed up checking logic

2020-04-23 Thread Quan, Evan
Please make this the last one of the series. Other than that, this is acked-by: Evan Quan -----Original Message----- From: amd-gfx On Behalf Of Monk Liu Sent: Thursday, April 23, 2020 3:02 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 5/8] drm/amdgpu: clear the messed up ch

RE: [PATCH] drm/amdgpu: check ring type for secure IBs

2020-04-23 Thread Liu, Aaron
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Aaron Liu -- Best Regards Aaron Liu -----Original Message----- From: amd-gfx On Behalf Of Alex Deucher Sent: Friday, April 24, 2020 4:47 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu: c

Re: [PATCH] drm/amdgpu: check ring type for secure IBs

2020-04-23 Thread Andrey Grodzovsky
Reviewed-by: Andrey Grodzovsky Andrey On 4/23/20 4:47 PM, Alex Deucher wrote: We don't support secure operation on compute rings at the moment so reject them. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/d

[PATCH] drm/amdgpu: check ring type for secure IBs

2020-04-23 Thread Alex Deucher
We don't support secure operation on compute rings at the moment so reject them. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index
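
For readers following the thread, here is a minimal sketch of the kind of check the commit message describes: an IB flagged as secure is rejected when the target ring is a compute ring. The diff itself is truncated above, so every name below (`ring_type`, `IB_FLAG_SECURE`, `check_secure_ib`) is hypothetical and not taken from amdgpu_ib.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring types; the real driver defines its own enum. */
enum ring_type { RING_TYPE_GFX, RING_TYPE_COMPUTE, RING_TYPE_SDMA };

/* Hypothetical flag marking an indirect buffer (IB) as secure. */
#define IB_FLAG_SECURE (1u << 0)

struct ib {
	uint32_t flags;
};

/* Return 0 if the IB may be scheduled on the ring, -EINVAL otherwise. */
static int check_secure_ib(enum ring_type type, const struct ib *ib)
{
	assert(ib != NULL);

	/* Assumption: only compute rings lack secure support. */
	if ((ib->flags & IB_FLAG_SECURE) && type == RING_TYPE_COMPUTE)
		return -EINVAL;
	return 0;
}
```

Rejecting at submission time, rather than silently clearing the flag, keeps the failure visible to the caller.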

Re: [PATCH] PCI/P2PDMA: Add additional AMD ZEN root ports to the whitelist

2020-04-23 Thread Bjorn Helgaas
On Mon, Apr 06, 2020 at 03:42:01PM -0400, Alex Deucher wrote: > According to the hw architect, pre-ZEN parts support > p2p writes and ZEN parts support both p2p reads and writes. > > Add entries for Zen parts Raven (0x15d0) and Renoir (0x1630). > > Cc: Christian König > Acked-by: Christian König

Re: [PATCH] PCI/P2PDMA: Add additional AMD ZEN root ports to the whitelist

2020-04-23 Thread Alex Deucher
+ Bjorn Any chance I can get this picked up for -next? Thanks, Alex On Mon, Apr 6, 2020 at 3:42 PM Alex Deucher wrote: > > According to the hw architect, pre-ZEN parts support > p2p writes and ZEN parts support both p2p reads and writes. > > Add entries for Zen parts Raven (0x15d0) and Renoir

Re: [PATCH][next] drm/amd/display: remove redundant assignment to variable ret

2020-04-23 Thread Alex Deucher
On Thu, Apr 23, 2020 at 10:18 AM Colin King wrote: > > From: Colin Ian King > > The variable ret is being initialized with a value that is never read > and it is being updated later with a new value. The initialization is > redundant and can be removed. > > Addresses-Coverity: ("Unused value") >

RE: [PATCH][next] drm/amd/display: fix incorrect assignment due to a typo

2020-04-23 Thread Koo, Anthony
Hi Colin, Sorry for any confusion of this code. I think in this case, it seems like the comment is wrong (but original implementation is somewhat wrong as well). Probably the original code implementation makes it unclear. There are three scenarios: 1. Variable refresh active, targeting a fixed

Re: [PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed

2020-04-23 Thread Alex Deucher
On Thu, Apr 23, 2020 at 10:55 AM Christian König wrote: > > Yeah, we certainly could try this again. But maybe split that up into > individual patches for gfx7/8/9. > > In other words make it easy to revert if this still doesn't work well on > gfx7 or some other generation. Yeah, unless there is

Re: [PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed

2020-04-23 Thread Christian König
Yeah, we certainly could try this again. But maybe split that up into individual patches for gfx7/8/9. In other words make it easy to revert if this still doesn't work well on gfx7 or some other generation. Christian. On 23.04.20 at 15:43, Zhang, Hawking wrote: [AMD Official Use Only - Int

[PATCH][next] drm/amd/display: remove redundant assignment to variable ret

2020-04-23 Thread Colin King
From: Colin Ian King The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King --- drivers/gpu/drm/amd/display/dc/c

[PATCH][next] drm/amd/display: fix incorrect assignment due to a typo

2020-04-23 Thread Colin King
From: Colin Ian King The assignment to infopacket->sb[7] looks incorrect, the comment states it is the minimum refresh rate yet it is being assigned a value from the maximum refresh rate max_refresh_in_uhz. Fix this by using min_refresh_in_uhz instead. Addresses-Coverity: ("Copy-paste error") Fi
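
A small sketch of what the patch description proposes: the byte documented as the minimum refresh rate is derived from the minimum value, not the maximum. This is illustration only. The struct, the helper names, the rounding and the use of `sb[8]` for the maximum are assumptions, since the diff is not visible here, and the sketch takes no position on whether the comment or the assignment is the part that is wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VRR parameters, in micro-hertz. */
struct vrr_params {
	uint32_t min_refresh_in_uhz;
	uint32_t max_refresh_in_uhz;
};

/* Convert micro-hertz to whole hertz, rounding to nearest.
 * The result is truncated to 8 bits, so it only fits rates up to 255 Hz. */
static uint8_t uhz_to_hz(uint32_t uhz)
{
	return (uint8_t)((uhz + 500000u) / 1000000u);
}

/* Fill the two refresh-rate bytes of a packet payload. */
static void fill_refresh_bytes(uint8_t *sb, const struct vrr_params *vrr)
{
	assert(sb != NULL && vrr != NULL);
	sb[7] = uhz_to_hz(vrr->min_refresh_in_uhz); /* minimum refresh rate */
	sb[8] = uhz_to_hz(vrr->max_refresh_in_uhz); /* maximum (assumed slot) */
}
```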

RE: [PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed

2020-04-23 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Would you mind enabling this and trying it again? The recent gpu reset testing on vega20 looks very positive. Regards, Hawking -----Original Message----- From: Christian König Sent: Thursday, April 23, 2020 20:31 To: Zhang, Hawking ; amd-gfx@l

Re: [PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed

2020-04-23 Thread Christian König
On 23.04.20 at 11:01, Hawking Zhang wrote: driver should stop cp resume once compute ring test failed Mhm, we intentionally ignored those errors because the compute rings sometimes don't come up again after a GPU reset. We even have the necessary logic in the SW scheduler to redirect the job

RE: [PATCH] drm/amdgpu: protect ring overrun

2020-04-23 Thread Tao, Yintian
Hi Christian, Thanks. I will remove the initialization of r. Best Regards Yintian Tao -----Original Message----- From: Christian König Sent: April 23, 2020 20:22 To: Tao, Yintian ; Koenig, Christian ; Liu, Monk ; Liu, Shaoyun Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: pro

Re: [PATCH 4/8] drm/amdgpu: provide RREG32_SOC15_NO_KIQ, will be used later

2020-04-23 Thread Christian König
On 23.04.20 at 09:01, Monk Liu wrote: Signed-off-by: Monk Liu Yeah, I also stumbled over that recently. Patch is Acked-by: Christian König . --- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h

Re: [PATCH] drm/amdgpu: protect ring overrun

2020-04-23 Thread Christian König
On 23.04.20 at 11:06, Yintian Tao wrote: Wait for the oldest sequence on the ring to be signaled in order to make sure there will be no command overrun. v2: fix coding style and remove abs operation One nitpick below, with that fixed the patch is Reviewed-by: Christian König Signed-off

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Christian König
On 23.04.20 at 12:44, Zhao, Jiange wrote: [AMD Official Use Only - Internal Distribution Only] Hi Christian, Here are some explanations, (1) registered means that an app is listening to this node, completion means that this app has finished a dump. Yeah and both are the same thing. E.g. whe

RE: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Zhao, Jiange
[AMD Official Use Only - Internal Distribution Only] Hi Christian, Here are some explanations, (1) registered means that an app is listening to this node, completion means that this app has finished a dump. (2) after a dump, listening app would close this node. If it wants to get another reset

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Pierre-Eric Pelloux-Prayer
Hi, The build fails for me with this patch applied (poll_wait, POLLIN, POLLRDNORM and POLLWRNORM are undeclared). Adding "#include " to amdgpu_debugfs.c fixes it. Pierre-Eric On 23/04/2020 09:19, jia...@amd.com wrote: > From: Jiange Zhao > > When GPU got timeout, it would notify an interested

RE: [PATCH 2/2] drm/amdgpu: drop the unused local variable

2020-04-23 Thread Tao, Yintian
Hi Hawking, Can you also help remove the same local variable kiq for gfx_v10_0_ring_emit_rreg? Thanks in advance. After that, Reviewed-by: Yintian Tao Best Regards Yintian Tao -----Original Message----- From: amd-gfx On Behalf Of Hawking Zhang Sent: April 23, 2020 17:02 To: amd-gfx@lists.freede

[PATCH] drm/amdgpu: protect ring overrun

2020-04-23 Thread Yintian Tao
Wait for the oldest sequence on the ring to be signaled in order to make sure there will be no command overrun. v2: fix coding style and remove abs operation Signed-off-by: Yintian Tao --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 10 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 22
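
The idea in the commit message can be sketched as follows: before a new sequence number reuses a fence slot, the older fence that last occupied that slot must have signaled. This is only an illustration under assumptions. The struct and function names are hypothetical, and where the patch waits, this sketch simply reports failure so a caller could wait and retry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fence bookkeeping for a ring with a power-of-two slot count. */
struct fence_ring {
	uint32_t num_slots;    /* number of fence slots, power of two */
	uint32_t emitted_seq;  /* last sequence number handed out */
	uint32_t signaled_seq; /* last sequence number known to be complete */
};

/* True if sequence a is at or after sequence b, tolerating wraparound. */
static bool seq_at_or_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * Try to reserve the next sequence number. The slot it maps to was last
 * used by (next - num_slots); refuse until that older fence has signaled.
 */
static bool try_emit(struct fence_ring *r, uint32_t *seq_out)
{
	uint32_t next = r->emitted_seq + 1;
	uint32_t oldest = next - r->num_slots;

	assert(r->num_slots != 0 && (r->num_slots & (r->num_slots - 1)) == 0);
	if (!seq_at_or_after(r->signaled_seq, oldest))
		return false; /* would overwrite a command still in flight */
	r->emitted_seq = next;
	*seq_out = next;
	return true;
}
```

With four slots and nothing signaled, sequences 1 to 4 can be reserved; sequence 5 is refused until sequence 1 has signaled.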

[PATCH 2/2] drm/amdgpu: drop the unused local variable

2020-04-23 Thread Hawking Zhang
local variable kiq won't be used in function gfx_v8_0_ring_emit_rreg Change-Id: I6229987c8ce43ff0d55e1fae15ede9cb0827f76d Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu

RE: [PATCH 2/2] drm/amdgpu: limit smu_set_mp1_state to pp_one_vf or bare-metal

2020-04-23 Thread Liu, Monk
Drop this one. Monk Liu | GPU Virtualization Team | AMD -----Original Message----- From: Monk Liu Sent: Thursday, April 23, 2020 4:13 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 2/2] drm/amdgpu: limit smu_set_mp1_state to pp_one_vf or

[PATCH 1/2] drm/amdgpu: stop cp resume when compute ring test failed

2020-04-23 Thread Hawking Zhang
driver should stop cp resume once compute ring test failed Change-Id: I4cd3328f38e0755d0c877484936132d204c9fe50 Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 4 +++- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +++- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +++- 3 files

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Christian König
On 23.04.20 at 09:19, jia...@amd.com wrote: From: Jiange Zhao When GPU got timeout, it would notify an interested part of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is d

[PATCH 1/2] drm/amdgpu: extent threshold of waiting FLR_COMPLETE

2020-04-23 Thread Monk Liu
to 5s to satisfy WHOLE GPU reset which need 3+ seconds to finish Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h | 2 +- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h b/drivers

[PATCH 2/2] drm/amdgpu: limit smu_set_mp1_state to pp_one_vf or bare-metal

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 3d601d5..810141f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Christian König
Yes, an open file descriptor holds a reference to the driver module. So it shouldn't be possible to unload the driver while it is open. Christian. On 23.04.20 at 09:54, Liu, Monk wrote: Oh, looks if the daemon is opening the node KMD don't have a chance to enter the path of shutdown/unload d

RE: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Liu, Monk
Oh, looks if the daemon is opening the node KMD don't have a chance to enter the path of shutdown/unload driver, thus no chance to return "kmd unloading" to the app... Monk Liu | GPU Virtualization Team | AMD -----Original Message----- From: amd-gfx On Behal

[PATCH] drm/amdgpu/display: Fix dc_sink refcnt leak in dc_link_detect_helper

2020-04-23 Thread Xiyu Yang
dc_link_detect_helper() invokes dc_sink_retain(), which increases the refcount of the "prev_sink". When dc_link_detect_helper() returns, local variable "prev_sink" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception
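
The refcount rule the message relies on can be sketched like this: a function that takes a temporary reference must drop it on every return path, including the error path. The types and helpers below are hypothetical stand-ins, not the display driver's own, and the early-failure flag exists only to exercise the error path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a sink. */
struct sink {
	int refcount;
};

static void sink_retain(struct sink *s)
{
	s->refcount++;
}

static void sink_release(struct sink *s)
{
	assert(s->refcount > 0);
	s->refcount--;
}

/*
 * Take a temporary reference to the previous sink while detecting.
 * Every return path, including the early error return, drops it again.
 */
static int detect(struct sink *prev_sink, int fail_early)
{
	int ret = 0;

	if (prev_sink)
		sink_retain(prev_sink);

	if (fail_early) {
		ret = -1;
		goto out; /* the exception path must not skip the release */
	}

	/* ... normal detection work would go here ... */

out:
	if (prev_sink)
		sink_release(prev_sink);
	return ret;
}
```

Funnelling all exits through one label is a common way to keep retain and release paired.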

[PATCH] drm/amdgpu/display: Fix dc_sink refcnt leak when detecting link

2020-04-23 Thread Xiyu Yang
emulated_link_detect() invokes dc_sink_retain(), which increases the refcount of the "prev_sink". When emulated_link_detect() returns, local variable "prev_sink" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in all paths of em

RE: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread Liu, Monk
Hi Christian, Do you think we need to kill the daemon app when we unload the KMD? Usually the user needs to close the app first, and then the KMD can be unloaded. If we don't want to shut down the daemon app manually, we can send a "KILL" signal to that process, or we can implement "read" and let

Re: [PATCH] drm/amdgpu: protect ring overrun

2020-04-23 Thread Christian König
On 23.04.20 at 06:22, Yintian Tao wrote: Wait for the oldest sequence on the ring to be signaled in order to make sure there will be no command overrun. One technical problem and a few style suggestions below. Apart from that looks good to me. Signed-off-by: Yintian Tao --- drivers/gpu

Re: [PATCH hmm 5/5] mm/hmm: remove the customizable pfn format from hmm_range_fault

2020-04-23 Thread Christoph Hellwig
On Wed, Apr 22, 2020 at 09:39:11AM -0300, Jason Gunthorpe wrote: > > Nice callchain from hell.. Unfortunately such "code listings" tend to > > get out of date very quickly, so I'm not sure it is worth keeping in > > the code. What would be really worthwhile is consolidating the two > > different se

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread jianzh
From: Jiange Zhao When GPU got timeout, it would notify an interested part of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is due, amdgpu would notify usermode app through wa
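
From the user-space side, the poll() step described above might look like the sketch below. It covers only waiting for a file descriptor to become readable. The location of the 'autodump' node and what the app should do once woken are not visible in this excerpt, so they are left to the caller, and `wait_for_readable` is a hypothetical helper name.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h> /* open()/close()/read() live with the caller */

/*
 * Block until fd becomes readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * A negative timeout_ms waits indefinitely, as with poll() itself.
 */
static int wait_for_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n;

	assert(fd >= 0);
	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1; /* includes interruption by a signal */
	if (n == 0)
		return 0;  /* timed out, nothing to read yet */
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A dump tool would open the node, call this with a negative timeout, collect its data once it returns 1, and then close the descriptor.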

RE: [PATCH 1/8] drm/amdgpu: ignore TA ucode for SRIOV

2020-04-23 Thread Tao, Yintian
Series is Acked-by: Yintian Tao -----Original Message----- From: amd-gfx On Behalf Of Monk Liu Sent: April 23, 2020 15:02 To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 1/8] drm/amdgpu: ignore TA ucode for SRIOV Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c

[PATCH 8/8] drm/amdgpu: for nv12 always need smu ip

2020-04-23 Thread Monk Liu
because nv12 SRIOV support one vf mode Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/nv.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index 995bdec..9c42316 100644 --- a/drivers/gpu/drm/amd/amdgp

[PATCH 2/8] drm/amdgpu: skip cg/pg set for SRIOV

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 8a579ce..909ef08 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/g

[PATCH 5/8] drm/amdgpu: clear the messed up checking logic

2020-04-23 Thread Monk Liu
for MI100 + ASICS, we always support SW_SMU for bare-metal and for SRIOV one_vf_mode Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/dr

[PATCH 4/8] drm/amdgpu: provide RREG32_SOC15_NO_KIQ, will be used later

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index c893c64..56d02aa 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h +++ b/d

[PATCH 7/8] drm/amdgpu: skip sysfs node not belong to one vf mode

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 48 -- 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c index 49e2e43..c762deb 100644 --- a/drivers/

[PATCH 6/8] drm/amdgpu: enable one vf mode for nv12

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 12 +++- drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 6 +++- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 49 +- 3 files changed, 52 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/

[PATCH 3/8] drm/amdgpu: sriov is forbidden to call disable DPM

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c index 88b4e56..2bb1e0c 100644 --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c +++

[PATCH 1/8] drm/amdgpu: ignore TA ucode for SRIOV

2020-04-23 Thread Monk Liu
Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c index 0afd610..b4b0242 100644 --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c +++ b/drivers/gpu/dr