RE: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Liang, Prike
Hi Christian, I wonder whether, when we encounter an ENOMEM error while pinning amdgpu BOs, we can retry validation as below. With the following simple patch, the Abaqus pinning issue is no longer observed. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-14 Thread Alex Deucher
On Tue, May 14, 2019 at 5:12 PM Kuehling, Felix wrote: > > > On 2019-05-13 4:21 p.m., Deucher, Alexander wrote: > > I reverted all the amdgpu HMM patches for 5.2 because they also > > depended on this patch: > >

Re: [PATCH 1/2] mm/hmm: support automatic NUMA balancing

2019-05-14 Thread Kuehling, Felix
On 2019-05-13 5:27 p.m., Andrew Morton wrote: > > On Fri, 10 May 2019 19:53:23 + "Kuehling, Felix" > wrote: > >> From: Philip Yang >> >> While the page is migrating by NUMA balancing, HMM failed to detect this >> condition and still return the old page.

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-14 Thread Kuehling, Felix
On 2019-05-13 4:21 p.m., Deucher, Alexander wrote: > I reverted all the amdgpu HMM patches for 5.2 because they also > depended on this patch: > https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.2-wip=ce05ef71564f7cbe270cd4337c36ee720ea534db > which did

Re: [PATCH 1/3] drm/amd/display: Add ASICREV_IS_PICASSO

2019-05-14 Thread Harry Wentland
On 2019-05-14 2:20 p.m., Kazlauskas, Nicholas wrote: > > On 5/14/19 1:49 PM, Harry Wentland wrote: >> >> [WHY] >> We only want to load DMCU FW on Picasso and Raven 2, not on Raven 1. >> >> Signed-off-by: Harry Wentland >> --- >>

Re: [PATCH 4/4] drm/amdkfd: Check against device cgroup

2019-05-14 Thread Kasiviswanathan, Harish
Hi Roman, Thanks for your feedback. I will rework and send new patch soon. Best Regards, Harish From: Roman Gushchin Sent: Tuesday, May 14, 2019 1:37 PM To: Kasiviswanathan, Harish Cc: cgro...@vger.kernel.org; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 4/4] drm/amdkfd: Check

Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Marek Olšák
This series fixes the OOM errors. However, if I torture the kernel driver more, I can get it to deadlock and end up with unkillable processes. I can also get an OOM error. I just ran the test 5 times: AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears &

Re: [PATCH 4/4] drm/amdkfd: Check against device cgroup

2019-05-14 Thread Roman Gushchin
On Tue, May 14, 2019 at 01:58:40AM +, Roman Gushchin wrote: > On Wed, May 01, 2019 at 02:59:29PM +, Kasiviswanathan, Harish wrote: > > Participate in device cgroup. All kfd devices are exposed via /dev/kfd. > > So use /dev/dri/renderN node. > > > > Before exposing the device to a task

Re: [PATCH 1/3] drm/amd/display: Add ASICREV_IS_PICASSO

2019-05-14 Thread Kazlauskas, Nicholas
On 5/14/19 1:49 PM, Harry Wentland wrote: > > [WHY] > We only want to load DMCU FW on Picasso and Raven 2, not on Raven 1. > > Signed-off-by: Harry Wentland > --- > drivers/gpu/drm/amd/display/include/dal_asic_id.h | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff

Re: [PATCH 3/3] drm/amd/display: Drop DCN1_01 guards

2019-05-14 Thread Harry Wentland
On 2019-05-14 2:02 p.m., Kazlauskas, Nicholas wrote: > On 5/14/19 1:49 PM, Harry Wentland wrote: >> >> [WHY] >> These were only needed for bringup. They're not needed anymore. >> >> Signed-off-by: Harry Wentland > > Series is: > > Reviewed-by: Nicholas Kazlauskas > > I think a lot of those

Re: [PATCH 3/3] drm/amd/display: Drop DCN1_01 guards

2019-05-14 Thread Kazlauskas, Nicholas
On 5/14/19 1:49 PM, Harry Wentland wrote: > > [WHY] > These were only needed for bringup. They're not needed anymore. > > Signed-off-by: Harry Wentland Series is: Reviewed-by: Nicholas Kazlauskas I think a lot of those DCN guards around checking ASIC revision aren't strictly necessary

[PATCH 1/3] drm/amd/display: Add ASICREV_IS_PICASSO

2019-05-14 Thread Harry Wentland
[WHY] We only want to load DMCU FW on Picasso and Raven 2, not on Raven 1. Signed-off-by: Harry Wentland --- drivers/gpu/drm/amd/display/include/dal_asic_id.h | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/include/dal_asic_id.h

[PATCH 3/3] drm/amd/display: Drop DCN1_01 guards

2019-05-14 Thread Harry Wentland
[WHY] These were only needed for bringup. They're not needed anymore. Signed-off-by: Harry Wentland --- drivers/gpu/drm/amd/display/Kconfig | 6 -- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 -- .../display/dc/bios/command_table_helper2.c | 5 -

[PATCH 2/3] drm/amd/display: Don't load DMCU for Raven 1

2019-05-14 Thread Harry Wentland
[WHY] Some early Raven boards had a bad SBIOS that doesn't play nicely with the DMCU FW. We thought the issues were fixed by ignoring errors on DMCU load but that doesn't seem to be the case. We've still seen reports of users unable to boot their systems at all. [HOW] Disable DMCU load on Raven

Re:[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Zhou, David(ChunMing)
How do we refresh the LRU to keep its order aligned with the BO list passed from user space? You can verify this with some games; performance can differ considerably between runs. -David Original Message Subject: Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during

RE: [PATCH 2/2] drm/amd/powerplay: Enable "disable dpm" feature for vega20 power functions.

2019-05-14 Thread Zhang, Hawking
Hi Jack, Let's hold off on patch #2 for now. I'd prefer to deal with dpm_disabled gracefully at the amdgpu_pm interface level (or at least at the amdgpu_smu level) so that we don't need to maintain this case for each ASIC, since the ppt code is ASIC-specific. Regards, Hawking -Original Message- From: Gui,

Re: [PATCH 2/2] drm/amd/powerplay: support sw smu hotspot and memory temperature retrieval

2019-05-14 Thread Deucher, Alexander
Series is: Reviewed-by: Alex Deucher From: amd-gfx on behalf of Evan Quan Sent: Tuesday, May 14, 2019 7:19 AM To: amd-gfx@lists.freedesktop.org Cc: Quan, Evan Subject: [PATCH 2/2] drm/amd/powerplay: support sw smu hotspot and memory temperature retrieval

RE: [PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

2019-05-14 Thread Deucher, Alexander
> -Original Message- > From: amd-gfx On Behalf Of > Christian König > Sent: Tuesday, May 14, 2019 6:34 AM > To: Zhou, Tiecheng ; amd- > g...@lists.freedesktop.org > Cc: Deng, Emily > Subject: Re: [PATCH] drm/amdgpu/sriov: Need to initialize the > HDP_NONSURFACE_BAStE

Re: [PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

2019-05-14 Thread Deucher, Alexander
Reviewed-by: Alex Deucher From: amd-gfx on behalf of Tiecheng Zhou Sent: Monday, May 13, 2019 11:34 PM To: amd-gfx@lists.freedesktop.org Cc: Zhou, Tiecheng; Deng, Emily Subject: [PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

RE: [PATCH 1/2] drm/amd/powerplay: Enable "disable dpm" feature to support swSMU debug

2019-05-14 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking From: Gui, Jack Sent: May 13, 2019 11:32 To: Zhang, Hawking ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 1/2] drm/amd/powerplay: Enable "disable dpm" feature to support swSMU debug << File:

Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Christian König
Huh? What do you mean by that? Christian. On 14.05.19 at 15:12, Zhou, David(ChunMing) wrote: My only concern is how to refresh the LRU when the BO comes from a BO list. -David Original Message Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian

RE: [PATCH] drm/amdgpu: Report firmware versions with sysfs v2

2019-05-14 Thread Russell, Kent
I don't think that anyone understands all of the internals of FW version handling. Right now we're just returning the same values as what debugfs returns. The SMI tool will do the version interpretation (like turning 0x00282b00 into 40.43 for the SMU, for example). Or knowing that the CP
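The version interpretation mentioned in this message can be sketched as follows. This is only an illustration: the byte layout (bits 23:16 as major, bits 15:8 as minor) is an assumption inferred from the single 0x00282b00 to 40.43 example, and `decode_smu_version` and `format_smu_version` are hypothetical names, not functions from the SMI tool or the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for a decoded firmware version. */
struct fw_version {
    unsigned int major;
    unsigned int minor;
};

/*
 * Assumed layout, inferred only from the example 0x00282b00 -> 40.43:
 * bits 23:16 hold the major number, bits 15:8 hold the minor number.
 */
static struct fw_version decode_smu_version(uint32_t raw)
{
    struct fw_version v;

    v.major = (raw >> 16) & 0xff;
    v.minor = (raw >> 8) & 0xff;
    return v;
}

/* Writes "major.minor" into buf and returns the snprintf result. */
static int format_smu_version(uint32_t raw, char *buf, size_t len)
{
    struct fw_version v = decode_smu_version(raw);

    assert(buf != NULL);
    return snprintf(buf, len, "%u.%u", v.major, v.minor);
}
```

With this layout, 0x28 is 40 and 0x2b is 43, which matches the example in the message. Other firmware blocks may pack their versions differently, so the same decoding should not be assumed for them.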

Re:[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Zhou, David(ChunMing)
My only concern is how to refresh the LRU when the BO comes from a BO list. -David Original Message Subject: [PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS From: Christian König To: "Olsak, Marek" ,"Zhou, David(ChunMing)" ,"Liang, Prike"

[PATCH 07/11] drm/ttm: immediately move BOs to the new LRU

2019-05-14 Thread Christian König
Move BOs which are currently in the system domain to the new LRU before allocating backing space. This makes sure that we always have enough entries on the LRU to allow for other processes to wait for an operation to complete. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c |

[PATCH 09/11] drm/ttm: convert EDEADLK into EAGAIN

2019-05-14 Thread Christian König
Let userspace try again if we really run into a deadlock during eviction. This has a low chance of livelocking, but with guaranteed forward progress. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 2 ++ 1 file changed, 2 insertions(+) diff --git

[PATCH 10/11] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve v2

2019-05-14 Thread Christian König
From: Chunming Zhou Add a ticket for the display BO so that it can preempt busy BOs. v2: fix stupid rebase error Change-Id: I9f031cdcc8267de00e819ae303baa0a52df8ebb9 Signed-off-by: Chunming Zhou Reviewed-by: Christian König --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21

[PATCH 11/11] drm/amdgpu: stop removing BOs from the LRU during CS

2019-05-14 Thread Christian König
This avoids OOM situations when we have lots of threads submitting at the same time. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

[PATCH 05/11] drm/ttm: remove manual placement preference

2019-05-14 Thread Christian König
If drivers don't prefer a system memory placement they should not put it into the placement list first. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c

[PATCH 06/11] drm/ttm: cleanup ttm_bo_mem_space

2019-05-14 Thread Christian König
We tried this once before, but that turned out to be more complicated than thought. With all the right prerequisites it looks like we can do this now. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 128 ++- 1 file changed, 67 insertions(+), 61

[PATCH 08/11] drm/ttm: put new BOs immediately on the LRU

2019-05-14 Thread Christian König
This way they are available for eviction immediately. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 233bfb86068b..a301c876ae31

[PATCH 04/11] drm/ttm: return immediately in case of a signal

2019-05-14 Thread Christian König
When a signal arrives we should return immediately for handling it and not try other placements first. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
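The behaviour described here can be illustrated with a simplified placement loop. This is a sketch, not the ttm_bo.c code: `find_placement`, `try_place_fn` and the demo stub are hypothetical, placements are reduced to plain integers, and `ERESTARTSYS` is defined locally because it is a kernel-internal error code that userspace `errno.h` does not provide.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ERESTARTSYS
#define ERESTARTSYS 512 /* kernel-internal code, defined here for the sketch */
#endif

typedef int (*try_place_fn)(int placement);

/*
 * Walk the placement list in order. On success, report the placement.
 * On -ERESTARTSYS (a pending signal), return at once instead of trying
 * the remaining placements. Any other error moves on to the next entry.
 */
static int find_placement(const int *placements, size_t count,
                          try_place_fn try_place, int *chosen)
{
    size_t i;

    assert(placements != NULL && try_place != NULL && chosen != NULL);
    for (i = 0; i < count; i++) {
        int ret = try_place(placements[i]);

        if (ret == 0) {
            *chosen = placements[i];
            return 0;
        }
        if (ret == -ERESTARTSYS)
            return ret;
    }
    return -ENOMEM;
}

/* Example stub: placement 0 is busy, 1 is interrupted by a signal, others fit. */
static int demo_calls;

static int demo_try_place(int placement)
{
    demo_calls++;
    if (placement == 0)
        return -EBUSY;
    if (placement == 1)
        return -ERESTARTSYS;
    return 0;
}
```

With the list {0, 1, 2} the loop stops at the second entry and never tries the third, which is the point of the patch description: the signal is handled first, and the caller can restart the operation afterwards.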

[PATCH 03/11] drm/ttm: remove the backing store if no placement is given

2019-05-14 Thread Christian König
Pipeline removal of the BO's backing store when no placement is given during validation. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index

[PATCH 02/11] drm/ttm: fix busy memory to fail other user v8

2019-05-14 Thread Christian König
From: Chunming Zhou A heavy GPU job can occupy memory for a long time, which causes other users to fail to get memory. This basically picks up Christian's idea: 1. Reserve the BO in DC using a ww_mutex ticket (trivial). 2. If we then run into this EBUSY condition in TTM check if the BO we need memory for (or

[PATCH 01/11] drm/ttm: Make LRU removal optional.

2019-05-14 Thread Christian König
We are already doing this for DMA-buf imports and also for amdgpu VM BOs for quite a while now. If this doesn't run into any problems we are probably going to stop removing BOs from the LRU altogether. Signed-off-by: Christian König --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 9

[PATCH 2/2] drm/amd/powerplay: support sw smu hotspot and memory temperature retrieval

2019-05-14 Thread Evan Quan
Support hotspot and memory temperature retrieval on sw smu routine. Change-Id: If2ed1e2835f4b158a4a6d93aee8b358af18b9bfc Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h | 3 + drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 74 --- 2 files changed, 66

[PATCH 1/2] drm/amd/powerplay: support uclk activity retrieve on sw smu routine

2019-05-14 Thread Evan Quan
Support realtime uclk activity report. Change-Id: I89cf7c95233060ee106e9fcef3b8e6707cd60466 Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c

Re: [PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

2019-05-14 Thread Christian König
On 14.05.19 at 12:59, Tiecheng Zhou wrote: HDP_NONSURFACE_BASE needs to be initialized to avoid using the value left by a previous VM in the SR-IOV scenario. v2: it should not hurt bare metal, so generalize it for both SR-IOV and bare metal Signed-off-by: Emily Deng Signed-off-by: Tiecheng

[PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

2019-05-14 Thread Tiecheng Zhou
HDP_NONSURFACE_BASE needs to be initialized to avoid using the value left by a previous VM in the SR-IOV scenario. v2: it should not hurt bare metal, so generalize it for both SR-IOV and bare metal Signed-off-by: Emily Deng Signed-off-by: Tiecheng Zhou ---

Re: [PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

2019-05-14 Thread Christian König
On 14.05.19 at 12:24, Tiecheng Zhou wrote: HDP_NONSURFACE_BASE needs to be initialized to avoid using the value left by a previous VM in the SR-IOV scenario. Signed-off-by: Emily Deng Signed-off-by: Tiecheng Zhou Would it hurt us to also do this on bare metal? Apart from that looks

Re: [PATCH] drm/amdgpu: Report firmware versions with sysfs v2

2019-05-14 Thread Christian König
I don't claim that I understand all of the internals of fw version handling, but this looks really nice to me from a kernel perspective. Feel free to add my Reviewed-by: Christian König . Regards, Christian. On 14.05.19 at 12:19, Russell, Kent wrote: Looks fine to me, but hoping to get an

[PATCH] drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

2019-05-14 Thread Tiecheng Zhou
HDP_NONSURFACE_BASE needs to be initialized to avoid using the value left by a previous VM in the SR-IOV scenario. Signed-off-by: Emily Deng Signed-off-by: Tiecheng Zhou --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 + 1 file changed, 5 insertions(+) diff --git

RE: [PATCH] drm/amdgpu: Report firmware versions with sysfs v2

2019-05-14 Thread Russell, Kent
Looks fine to me, but hoping to get an RB from Christian now that we're moving them into a subfolder. Kent -Original Message- From: amd-gfx On Behalf Of Messinger, Ori Sent: Thursday, May 9, 2019 4:35 PM To: amd-gfx@lists.freedesktop.org Cc: Messinger, Ori Subject: [PATCH]

Re: [PATCH libdrm 1/7] addr cs chunk for syncobj timeline

2019-05-14 Thread zhoucm1
Thank you, Lionel. -David On 2019-05-14 17:49, Lionel Landwerlin wrote: With the small nits, patches 2 & 4 are : Reviewed-by: Lionel Landwerlin The other patches are a bit amdgpu specific so maybe you might want someone more familiar with amdgpu to review them.

Re: [PATCH libdrm 1/7] addr cs chunk for syncobj timeline

2019-05-14 Thread Lionel Landwerlin
With the small nits, patches 2 & 4 are : Reviewed-by: Lionel Landwerlin The other patches are a bit amdgpu specific so maybe you might want someone more familiar with amdgpu to review them. Still I didn't see anything wrong with them so remaining patches are : Acked-by: Lionel Landwerlin

Re: [PATCH libdrm 4/7] add timeline signal/transfer ioctls v2

2019-05-14 Thread Lionel Landwerlin
On 13/05/2019 10:53, Chunming Zhou wrote: v2: use one transfer ioctl Signed-off-by: Chunming Zhou --- xf86drm.c | 33 + xf86drm.h | 6 ++ 2 files changed, 39 insertions(+) diff --git a/xf86drm.c b/xf86drm.c index 17e3d880..acd16fab 100644 ---

Re: [PATCH libdrm 2/7] add timeline wait/query ioctl v2

2019-05-14 Thread Lionel Landwerlin
On 13/05/2019 10:53, Chunming Zhou wrote: v2: drop export/import Signed-off-by: Chunming Zhou --- xf86drm.c | 44 xf86drm.h | 6 ++ 2 files changed, 50 insertions(+) diff --git a/xf86drm.c b/xf86drm.c index 2c19376b..17e3d880 100644 ---