Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations

2020-06-11 Thread Daniel Vetter
On Fri, Jun 12, 2020 at 1:35 AM Felix Kuehling wrote: > > On 2020-06-11 at 10:15 a.m., Jason Gunthorpe wrote: > > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote: > >>> I still have my doubts about allowing fence waiting from within shrinkers. > >>> IMO ideally they should use a

[PATCH 1/1] drm/amdkfd: Add eviction debug messages

2020-06-11 Thread Felix Kuehling
Use WARN to print messages with a backtrace when evictions are triggered. This can help determine the root cause of evictions, spot driver bugs that trigger evictions unintentionally, or help with performance tuning by avoiding conditions that cause evictions in a specific workload. The
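A minimal sketch of the WARN-based approach, assuming a debug_evictions flag; the helper name, flag and message text are illustrative, not the actual patch contents:

#include <linux/bug.h>
#include <linux/types.h>

static bool debug_evictions;    /* e.g. set via a module parameter */

static void example_note_eviction(pid_t pid, const char *reason)
{
        /* WARN() prints the message plus a full backtrace when the
         * condition is true, pointing at the code path that triggered
         * the eviction. */
        WARN(debug_evictions, "Evicting process %d: %s\n", pid, reason);
}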

[PATCH] drm/amdkfd: Use correct major in devcgroup check

2020-06-11 Thread Lorenz Brun
The existing code used the major version number of the DRM driver instead of the device major number of the DRM subsystem for validating access for a devices cgroup. This meant that accesses allowed by the devices cgroup weren't permitted and certain accesses denied by the devices cgroup were
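For illustration, the distinction can be sketched as below with a hypothetical check helper; DRM_MAJOR (226) normally comes from the DRM core's internal header rather than a local define:

#include <linux/device_cgroup.h>
#include <drm/drm_device.h>
#include <drm/drm_file.h>

#define EXAMPLE_DRM_MAJOR 226   /* device major of the DRM subsystem (DRM_MAJOR) */

static int example_devcgroup_check(struct drm_device *ddev)
{
        /* Buggy variant (not shown): ddev->driver->major is the DRM
         * *driver version* (e.g. amdgpu 3.x), which has nothing to do
         * with the character device node that cgroup rules match on. */

        /* Fixed variant: use the DRM subsystem's character-device major
         * plus the render node minor. */
        return devcgroup_check_permission(DEVCG_DEV_CHAR, EXAMPLE_DRM_MAJOR,
                                          ddev->render->index,
                                          DEVCG_ACC_WRITE);
}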

Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations

2020-06-11 Thread Felix Kuehling
On 2020-06-11 at 10:15 a.m., Jason Gunthorpe wrote: > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote: >>> I still have my doubts about allowing fence waiting from within shrinkers. >>> IMO ideally they should use a trywait approach, in order to allow memory >>> allocation during

Re: [PATCH] mm: Track mmu notifiers in fs_reclaim_acquire/release

2020-06-11 Thread Jason Gunthorpe
On Wed, Jun 10, 2020 at 09:41:01PM +0200, Daniel Vetter wrote: > fs_reclaim_acquire/release nicely catch recursion issues when > allocating GFP_KERNEL memory against shrinkers (which gpu drivers tend > to use to keep the excessive caches in check). For mmu notifier > recursions we do have lockdep
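For context, the usual fs_reclaim priming idiom looks roughly like the sketch below, with a placeholder driver lock; the patch under discussion extends what fs_reclaim_acquire() records, so the same idiom also covers mmu-notifier recursion:

#include <linux/sched/mm.h>
#include <linux/mutex.h>
#include <linux/gfp.h>

static DEFINE_MUTEX(example_driver_lock);       /* placeholder driver lock */

/* Tell lockdep once, at init time, that example_driver_lock may be taken
 * from reclaim (e.g. a shrinker). Afterwards any GFP_KERNEL allocation
 * done while holding that lock is flagged as a potential deadlock. */
static void example_prime_reclaim_lockdep(void)
{
        fs_reclaim_acquire(GFP_KERNEL);
        mutex_lock(&example_driver_lock);
        mutex_unlock(&example_driver_lock);
        fs_reclaim_release(GFP_KERNEL);
}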

Re: [PATCH] drm/amdkfd: Use correct major in devcgroup check

2020-06-11 Thread Felix Kuehling
On 2020-06-11 at 4:11 p.m., Lorenz Brun wrote: > The existing code used the major version number of the DRM driver > instead of the device major number of the DRM subsystem for > validating access for a devices cgroup. > > This meant that accesses allowed by the devices cgroup weren't > permitted

Re: [PATCH 1/6] drm/ttm: Add unmapping of the entire device address space

2020-06-11 Thread Andrey Grodzovsky
On 6/10/20 5:16 PM, Daniel Vetter wrote: On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel) wrote: On 6/10/20 5:30 PM, Daniel Vetter wrote: On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote: On 10.06.20 at 15:54, Andrey Grodzovsky wrote: On 6/10/20 6:15 AM, Thomas

Re: [PATCH 1/6] drm/ttm: Add unmapping of the entire device address space

2020-06-11 Thread Andrey Grodzovsky
On 6/11/20 2:35 AM, Thomas Hellström (Intel) wrote: On 6/10/20 11:19 PM, Andrey Grodzovsky wrote: On 6/10/20 4:30 PM, Thomas Hellström (Intel) wrote: On 6/10/20 5:30 PM, Daniel Vetter wrote: On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote: On 10.06.20 at 15:54,

Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Daniel Vetter
On Thu, Jun 11, 2020 at 4:29 PM Tvrtko Ursulin wrote: > > > On 11/06/2020 12:29, Daniel Vetter wrote: > > On Thu, Jun 11, 2020 at 12:36 PM Tvrtko Ursulin > > wrote: > >> On 10/06/2020 16:17, Daniel Vetter wrote: > >>> On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin > >>> wrote: > > >

Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Tvrtko Ursulin
On 11/06/2020 12:29, Daniel Vetter wrote: > On Thu, Jun 11, 2020 at 12:36 PM Tvrtko Ursulin > wrote: >> On 10/06/2020 16:17, Daniel Vetter wrote: >>> On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin >>> wrote: On 04/06/2020 09:12, Daniel Vetter wrote: > Design is similar to the

Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations

2020-06-11 Thread Jason Gunthorpe
On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote: > > I still have my doubts about allowing fence waiting from within shrinkers. > > IMO ideally they should use a trywait approach, in order to allow memory > > allocation during command submission for drivers that > > publish fences
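A rough sketch of the trywait idea, assuming a shrinker that walks buffer objects; names and structure are illustrative only. The point is to skip busy objects rather than block on dma_fence_wait(), so that a GFP_KERNEL allocation made during command submission cannot deadlock against the shrinker:

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>

static bool example_bo_is_idle(struct dma_resv *resv, struct dma_fence *fence)
{
        if (!dma_resv_trylock(resv))
                return false;           /* contended: skip, don't block */

        if (fence && !dma_fence_is_signaled(fence)) {
                dma_resv_unlock(resv);
                return false;           /* still busy: skip this object */
        }

        dma_resv_unlock(resv);
        return true;                    /* safe to reclaim */
}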

RE: [PATCH] drm/amdgpu: correct ras query as part of ctx query

2020-06-11 Thread Chen, Guchun
[AMD Public Use] Hi Dennis, Sorry for the confusion caused by the commit message. I will send patch v2 later. Regards, Guchun -Original Message- From: Li, Dennis Sent: Thursday, June 11, 2020 6:57 PM To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Zhou1, Tao ; Pan,

Re: [PATCH] drm/amd/amdgpu: Add SQ_DEBUG_STS_GLOBAL* registers/bits

2020-06-11 Thread Alex Deucher
On Thu, Jun 11, 2020 at 7:58 AM Tom St Denis wrote: > > Even though they are technically MMIO registers I put the bits with the sqind > block > for organizational purposes. > > Requested for UMR debugging. > > Signed-off-by: Tom St Denis Reviewed-by: Alex Deucher > --- >

Re: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

2020-06-11 Thread Chunming Zhou
I didn't check the patch details. If it is for the existing implicit sync of shared buffers, feel free to go ahead. But if you add some description of its usage, that will be clearer to others. -David On 2020/6/11 15:19, Marek Olšák wrote: Hi David, Explicit sync has nothing to do with this. This

[PATCH] drm/amd/amdgpu: Add SQ_DEBUG_STS_GLOBAL* registers/bits

2020-06-11 Thread Tom St Denis
Even though they are technically MMIO registers, I put the bits with the sqind block for organizational purposes. Requested for UMR debugging. Signed-off-by: Tom St Denis --- .../include/asic_reg/gc/gc_10_1_0_offset.h | 3 ++- .../include/asic_reg/gc/gc_10_1_0_sh_mask.h | 16

Re: [PATCH] drm/amdgpu/jpeg: fix race condition issue for jpeg start

2020-06-11 Thread Leo Liu
Reviewed-by: Leo Liu On 2020-06-10 12:36 p.m., James Zhu wrote: Fix race condition issue when multiple jpeg starts are called. Signed-off-by: James Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h | 2 ++ 2 files changed,

Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Daniel Vetter
On Thu, Jun 11, 2020 at 12:36 PM Tvrtko Ursulin wrote: > > > On 10/06/2020 16:17, Daniel Vetter wrote: > > On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin > > wrote: > >> > >> > >> On 04/06/2020 09:12, Daniel Vetter wrote: > >>> Design is similar to the lockdep annotations for workers, but with >

RE: [PATCH] drm/amdgpu: correct ras query as part of ctx query

2020-06-11 Thread Li, Dennis
[AMD Official Use Only - Internal Distribution Only] Hi, Guchun, The ras_manager object saves the error counters on every query, so a previous query shouldn't affect the result of the current one. Please check the function amdgpu_ras_error_query. Best Regards Dennis Li

Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Tvrtko Ursulin
On 10/06/2020 16:17, Daniel Vetter wrote: > On Wed, Jun 10, 2020 at 4:22 PM Tvrtko Ursulin > wrote: >> >> >> On 04/06/2020 09:12, Daniel Vetter wrote: >>> Design is similar to the lockdep annotations for workers, but with >>> some twists: >>> >>> - We use a read-lock for the

[PATCH] drm/amdgpu: correct ras query as part of ctx query

2020-06-11 Thread Guchun Chen
Almost all error count registers are automatically cleared after being read once, so both the CE and UE counts need to be read in one loop. Signed-off-by: Guchun Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 16 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +-
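The single-loop idea can be sketched as follows; the block count, structure and helper names are placeholders, with amdgpu_ras_error_query() playing the per-block read role in the driver:

#include <linux/types.h>

struct example_err_data {
        unsigned long ce_count;         /* correctable errors */
        unsigned long ue_count;         /* uncorrectable errors */
};

#define EXAMPLE_NUM_BLOCKS 4            /* placeholder count of RAS-capable blocks */

/* Stand-in for the per-block hardware query; the registers it reads
 * clear themselves on read, so each block may only be read once. */
static void example_read_block_counters(int block, struct example_err_data *d)
{
        /* Hardware access elided in this sketch. */
        (void)block;
        d->ce_count = 0;
        d->ue_count = 0;
}

/* Accumulate both counters from a single pass: a second loop would only
 * see zeros because the first pass already cleared the registers. */
static void example_query_ras_counts(unsigned long *ce, unsigned long *ue)
{
        struct example_err_data data;
        int i;

        *ce = 0;
        *ue = 0;
        for (i = 0; i < EXAMPLE_NUM_BLOCKS; i++) {
                example_read_block_counters(i, &data);
                *ce += data.ce_count;
                *ue += data.ue_count;
        }
}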

Re: [PATCH] dma-fence: basic lockdep annotations

2020-06-11 Thread Maarten Lankhorst
On 05-06-2020 at 15:29, Daniel Vetter wrote: > Design is similar to the lockdep annotations for workers, but with > some twists: > > - We use a read-lock for the execution/worker/completion side, so that > this explicit annotation can be more liberally sprinkled around. > With read locks
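The shape of the annotation, assuming the begin/end helpers proposed in the series and a hypothetical job-completion path:

#include <linux/dma-fence.h>

/* Completion/worker side: the begin/end pair takes the read side of the
 * annotation, so it can be sprinkled over many code paths cheaply, while
 * fence waiters form the side that lockdep checks against. */
static void example_complete_job(struct dma_fence *done_fence)
{
        bool cookie;

        cookie = dma_fence_begin_signalling();

        /* Everything between begin and end must be able to finish
         * without waiting on other fences or blocking on memory reclaim
         * that could itself wait on fences. */
        dma_fence_signal(done_fence);

        dma_fence_end_signalling(cookie);
}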

Re: [Intel-gfx] [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Daniel Stone
Hi, On Thu, 11 Jun 2020 at 09:44, Dave Airlie wrote: > On Thu, 11 Jun 2020 at 18:01, Chris Wilson wrote: > > Introducing a global lockmap that cannot capture the rules correctly, > > Can you document the rules all drivers should be following then, > because from here it looks to get refactored

Re: [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Dave Airlie
On Thu, 11 Jun 2020 at 18:01, Chris Wilson wrote: > > Quoting Daniel Vetter (2020-06-04 09:12:09) > > Design is similar to the lockdep annotations for workers, but with > > some twists: > > > > - We use a read-lock for the execution/worker/completion side, so that > > this explicit annotation

Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations

2020-06-11 Thread Daniel Vetter
On Thu, Jun 11, 2020 at 09:30:12AM +0200, Thomas Hellström (Intel) wrote: > > On 6/4/20 10:12 AM, Daniel Vetter wrote: > > Two in one go: > > - it is allowed to call dma_fence_wait() while holding a > >dma_resv_lock(). This is fundamental to how eviction works with ttm, > >so required. >

Re: [PATCH 1/6] drm/ttm: Add unmapping of the entire device address space

2020-06-11 Thread Daniel Vetter
On Thu, Jun 11, 2020 at 08:12:37AM +0200, Thomas Hellström (Intel) wrote: > > On 6/10/20 11:16 PM, Daniel Vetter wrote: > > On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel) > > wrote: > > > > > > On 6/10/20 5:30 PM, Daniel Vetter wrote: > > > > On Wed, Jun 10, 2020 at 04:05:04PM +0200,

Re: [PATCH 03/18] dma-fence: basic lockdep annotations

2020-06-11 Thread Chris Wilson
Quoting Daniel Vetter (2020-06-04 09:12:09) > Design is similar to the lockdep annotations for workers, but with > some twists: > > - We use a read-lock for the execution/worker/completion side, so that > this explicit annotation can be more liberally sprinkled around. > With read locks

Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations

2020-06-11 Thread Intel
On 6/4/20 10:12 AM, Daniel Vetter wrote: Two in one go: - it is allowed to call dma_fence_wait() while holding a dma_resv_lock(). This is fundamental to how eviction works with ttm, so required. - it is allowed to call dma_fence_wait() from memory reclaim contexts, specifically from
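A simplified sketch, not the posted patch, of how the two rules can be primed at init time so lockdep learns them up front rather than on the first real eviction under memory pressure; the fence-wait annotation itself is internal to the series and is only indicated by a comment here:

#include <linux/dma-resv.h>
#include <linux/sched/mm.h>
#include <linux/gfp.h>
#include <linux/init.h>

static int __init example_prime_fence_lockdep(void)
{
        struct dma_resv obj;

        dma_resv_init(&obj);

        dma_resv_lock(&obj, NULL);      /* rule 1: fence waits under dma_resv ... */
        fs_reclaim_acquire(GFP_KERNEL); /* rule 2: ... and from reclaim contexts */

        /* ... the series acquires and releases the dma_fence wait
         * annotation here, recording that waiting is legal in this
         * nesting ... */

        fs_reclaim_release(GFP_KERNEL);
        dma_resv_unlock(&obj);
        dma_resv_fini(&obj);

        return 0;
}
subsys_initcall(example_prime_fence_lockdep);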

Re: [PATCH] drm/amdgpu: remove distinction between explicit and implicit sync (v2)

2020-06-11 Thread Marek Olšák
Hi David, Explicit sync has nothing to do with this. This is for implicit sync, which is required by DRI3. This fix allows removing existing inefficiencies from drivers, so it's a good thing. Marek On Wed., Jun. 10, 2020, 03:56 Chunming Zhou, wrote: > > On 2020/6/10 15:41, Christian König wrote:

Re: [PATCH 1/6] drm/ttm: Add unmapping of the entire device address space

2020-06-11 Thread Intel
On 6/10/20 11:19 PM, Andrey Grodzovsky wrote: On 6/10/20 4:30 PM, Thomas Hellström (Intel) wrote: On 6/10/20 5:30 PM, Daniel Vetter wrote: On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote: On 10.06.20 at 15:54, Andrey Grodzovsky wrote: On 6/10/20 6:15 AM, Thomas Hellström

RE: [PATCH] drm/amdgpu/sriov: Need to clear kiq position

2020-06-11 Thread Liu, Monk
Acked-by: Monk.Liu Monk Liu | GPU Virtualization Team | AMD -Original Message- From: amd-gfx On Behalf Of Emily Deng Sent: Thursday, June 11, 2020 2:02 PM To: amd-gfx@lists.freedesktop.org Cc: Deng, Emily Subject: [PATCH] drm/amdgpu/sriov: Need to

[PATCH] drm/amdgpu/sriov: Need to clear kiq position

2020-06-11 Thread Emily Deng
Since the VF firmware is cleared during driver unload, the KIQ portion also needs to be cleared to avoid idle failures. Signed-off-by: Emily Deng --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c