Re: [PATCH] drm/amdkfd: Remove Align VRAM allocations to 1MB on APU ASIC

2022-07-14 Thread Kuehling, Felix
, 2022 11:21 PM To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org Cc: Phillips, Daniel ; Ji, Ruili ; Liu, Aaron Subject: RE: [PATCH] drm/amdkfd: Remove Align VRAM allocations to 1MB on APU ASIC [AMD Official Use Only - General] Thanks for the comment, Felix. I will debug this issue further

RE: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-27 Thread Kuehling, Felix
egards, Felix -Original Message- From: Huang, JinHuiEric Sent: Thursday, April 27, 2023 14:58 To: Kuehling, Felix ; Koenig, Christian ; Christian König ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports Hi

Re: Stack out of bounds in KFD on Arcturus

2019-10-17 Thread Kuehling, Felix
I don't see why this problem would be specific to Arcturus. I don't see any excessive allocations on the stack either. Also the code involved here hasn't changed recently. Are you using some weird kernel config with a smaller stack? Is it specific to a compiler version or some optimization flag

Re: [PATCH] drm/amdkfd: kfd open return failed if device is locked

2019-10-18 Thread Kuehling, Felix
On 2019-10-18 10:27 a.m., Yang, Philip wrote: > If the device is locked for suspend and resume, kfd open should fail with > -EAGAIN without creating a process. Otherwise the application's exit path, > which releases the process, will hang waiting for resume to finish if the suspend > and resume is stuck somewhere.
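
The behaviour described in the quoted patch can be sketched in plain C as follows. This is a minimal user-space illustration, not the driver code: `demo_dev_state`, its fields, and `demo_kfd_open` are hypothetical names, and the only thing taken from the thread is the idea of returning -EAGAIN before any process state is created.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's "locked for suspend/resume" state. */
struct demo_dev_state {
    bool locked;          /* true while a suspend/resume or reset is in flight */
    int  open_processes;  /* number of process objects created so far */
};

/*
 * Fail early with -EAGAIN while the device is locked, so that no process
 * object exists that would later have to wait for resume on release.
 */
int demo_kfd_open(struct demo_dev_state *s)
{
    if (s->locked)
        return -EAGAIN;   /* caller may retry once the device is unlocked */

    s->open_processes++;  /* stands in for "create the process" */
    return 0;
}
```

The point of checking first is ordering: nothing is allocated on the failing path, so there is nothing to tear down while the device is still stuck.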

Re: [PATCH v2] drm/amdkfd: kfd open return failed if device is locked

2019-10-18 Thread Kuehling, Felix
On 2019-10-18 1:36 p.m., Yang, Philip wrote: > If the device is locked for suspend and resume, kfd open should fail with > -EAGAIN without creating a process. Otherwise the application's exit path, > which releases the process, will hang waiting for resume to finish if the suspend > and resume is stuck somewhere. T

Re: [PATCH 2/2] Revert "drm/amdgpu: disable c-states on xgmi perfmons"

2019-10-18 Thread Kuehling, Felix
You can squash the two reverts into a single commit so you avoid reintroducing a broken intermediate state. Mention both reverted commits in the squashed commit description. Checkpatch.pl prefers a different format for quoting reverted commits. Run checkpatch.pl on your commit to see a proper e

Re: [PATCH] drm/amdgpu: revert calling smu msg in df callbacks

2019-10-18 Thread Kuehling, Felix
On 2019-10-18 4:29 p.m., Kim, Jonathan wrote: > reverting the following changes: > commit 7dd2eb31fcd5 ("drm/amdgpu: fix compiler warnings for df perfmons") > commit 54275cd1649f ("drm/amdgpu: disable c-states on xgmi perfmons") > > perf events use spin-locks. embedded smu messages have potential

Re: Stack out of bounds in KFD on Arcturus

2019-10-18 Thread Kuehling, Felix
k on this myself. I'll create a ticket and see if I can find someone to investigate. Thanks,   Felix > > Andrey > > On 10/17/19 5:29 PM, Kuehling, Felix wrote: >> I don't see why this problem would be specific to Arcturus. I don't see >> any excessive

Re: [PATCH v2] drm/amdkfd: kfd open return failed if device is locked

2019-10-21 Thread Kuehling, Felix
ters or send an updated runlist to the HWS. When the process is resumed at the end of the reset/suspend/eviction, that's when any newly created queues would get mapped to the hardware. Regards,   Felix > > Regards, > Oak > > -Original Message- > From: amd-gfx On

Re: [PATCH] drm/amdkfd: don't use dqm lock during device reset/suspend/resume

2019-10-21 Thread Kuehling, Felix
On 2019-10-21 5:04 p.m., Yang, Philip wrote: > If device reset/suspend/resume fails for some reason, the dqm lock is > held forever and this causes a deadlock. Below is a kernel backtrace when > an application opens kfd after suspend/resume failed. > > Instead of holding dqm lock in pre_reset and releasing

Re: [PATCH] drm/amdkfd: don't use dqm lock during device reset/suspend/resume

2019-10-22 Thread Kuehling, Felix
suspended. But I'd like to see some safeguards in place to make sure those assumptions are never violated. Regards,   Felix > > Regards, > Oak > > -Original Message- > From: amd-gfx On Behalf Of Kuehling, > Felix > Sent: Monday, October 21, 2019 9:04 PM

Re: [PATCH v2] drm/amdkfd: don't use dqm lock during device reset/suspend/resume

2019-10-22 Thread Kuehling, Felix
On 2019-10-22 14:28, Yang, Philip wrote: > If device reset/suspend/resume fails for some reason, the dqm lock is > held forever and this causes a deadlock. Below is a kernel backtrace when > an application opens kfd after suspend/resume failed. > > Instead of holding dqm lock in pre_reset and releasing dqm

Re: [PATCH] drm/amdkfd: bug fix for out of bounds mem on gpu cache filling info

2019-10-24 Thread Kuehling, Felix
On 2019-10-24 14:46, Sierra Guiza, Alejandro (Alex) wrote: > The bitmap in cu_info structure is defined as a 4x4 size array. In > Arcturus, this matrix is initialized as a 4x2. Based on the 8 shaders. > In the gpu cache filling initialization, the access to the bitmap matrix > was done as an 8x1 ins

Re: [PATCH] drm/amdkfd: Delete duplicated queue bit map reservation

2019-10-28 Thread Kuehling, Felix
On 2019-10-24 5:14 p.m., Zhao, Yong wrote: > The KIQ is on the second MEC and its reservation is covered in the > latter logic, so no need to reserve its bit twice. > > Change-Id: Ieee390953a60c7d43de5a9aec38803f1f583a4a9 > Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling > --- > drivers

Re: [PATCH v2 12/15] drm/amdgpu: Call find_vma under mmap_sem

2019-10-29 Thread Kuehling, Felix
On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote: > From: Jason Gunthorpe > > find_vma() must be called under the mmap_sem, reorganize this code to > do the vma check after entering the lock. > > Further, fix the unlocked use of struct task_struct's mm, instead use > the mm from hmm_mirror which has

Re: [PATCH v2 02/15] mm/mmu_notifier: add an interval tree notifier

2019-10-29 Thread Kuehling, Felix
I haven't had enough time to fully understand the deferred logic in this change. I spotted one problem, see comments inline. On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote: > From: Jason Gunthorpe > > Of the 13 users of mmu_notifiers, 8 of them use only > invalidate_range_start/end() and immedia

Re: [PATCH v2 13/15] drm/amdgpu: Use mmu_range_insert instead of hmm_mirror

2019-10-29 Thread Kuehling, Felix
On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote: > From: Jason Gunthorpe > > Remove the interval tree in the driver and rely on the tree maintained by > the mmu_notifier for delivering mmu_notifier invalidation callbacks. > > For some reason amdgpu has a very complicated arrangement where it tries

Re: [PATCH] drm/amdgpu: remove PT BOs when unmapping

2019-10-30 Thread Kuehling, Felix
On 2019-10-30 9:52 a.m., Christian König wrote: > On 29.10.19 at 21:06, Huang, JinHuiEric wrote: >> The issue is that PT BOs are not freed when unmapping VA, >> which causes VRAM usage to accumulate heavily in some >> memory stress tests, such as the kfd big buffer stress test. >> Function amdgpu_vm_bo_update

Re: [PATCH] drm/amdkfd: Simplify the mmap offset related bit operations

2019-11-01 Thread Kuehling, Felix
NAK. This won't work for several reasons. The mmap_offset is used as offset parameter in the mmap system call. If you check the man page of mmap, you'll see that "offset must be a multiple of the page size". Therefore the PAGE_SHIFT is necessary. In the case of doorbell offsets, the offset is p
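
The page-alignment requirement mentioned above can be illustrated with a small sketch. It assumes a 4 KiB page and a made-up encoding in which a type field and an ID are packed together and then shifted left by the page shift; `DEMO_PAGE_SHIFT`, `DEMO_TYPE_SHIFT`, and both helpers are hypothetical and do not reproduce the actual KFD offset layout.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12                              /* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)
#define DEMO_TYPE_SHIFT 30                              /* hypothetical field position */

/* Pack a type and an ID, then shift so the low page-offset bits stay zero. */
static inline uint64_t demo_mmap_offset(uint64_t type, uint64_t id)
{
    return ((type << DEMO_TYPE_SHIFT) | id) << DEMO_PAGE_SHIFT;
}

/* mmap() only accepts offsets that are a multiple of the page size. */
static inline int demo_offset_is_valid(uint64_t off)
{
    return (off & (DEMO_PAGE_SIZE - 1)) == 0;
}
```

Without the final shift, any nonzero ID would land in the low 12 bits and the resulting value would no longer be a legal mmap offset.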

Re: [PATCH] drm/amdkfd: Simplify the mmap offset related bit operations

2019-11-01 Thread Kuehling, Felix
On 2019-11-01 4:48 p.m., Zhao, Yong wrote: > The new code is much cleaner and results in better readability. > > Change-Id: I0c1f7cca7e24ddb7b4ffe1cb0fa71943828ae373 > Signed-off-by: Yong Zhao > --- > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 13 +++-- > drivers/gpu/drm/amd/amdkfd/kfd_

Re: [PATCH] drm/amdgpu: change read of GPU clock counter on Vega10 VF

2019-11-05 Thread Kuehling, Felix
On 2019-11-05 5:03 p.m., Huang, JinHuiEric wrote: > Using the unified VBIOS causes a performance drop in an SR-IOV environment. > The fix is to switch to another register instead. > > Signed-off-by: Eric Huang > --- > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 18 +++--- > 1 file changed, 15 inser

Re: [PATCH] drm/amdkfd: Simplify the mmap offset related bit operations

2019-11-05 Thread Kuehling, Felix
    Signed-off-by: Yong Zhao     Signed-off-by: Felix Kuehling     Acked-by: Oded Gabbay     Signed-off-by: Oded Gabbay Regards,   Felix > > Regards, > Yong > ---- > *From:* Kuehling, Felix > *Sent:*

Re: [PATCH] drm/amdgpu: change read of GPU clock counter on Vega10 VF

2019-11-05 Thread Kuehling, Felix
On 2019-11-05 5:26 p.m., Huang, JinHuiEric wrote: > Using the unified VBIOS causes a performance drop in an SR-IOV environment. > The fix is to switch to another register instead. > > Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 19

Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for Hawaii

2019-11-06 Thread Kuehling, Felix
On 2019-10-30 20:17, Zhao, Yong wrote: > release_mem won't be used at all on GFX9 and GFX10, so delete it. Hawaii was GFXv7. So we're not using the release_mem packet on GFXv8 either. Why arbitrarily limit this change to GFXv9 and 10? Regards,   Felix > > Change-Id: I13787a8a29b83e7516c582a740

Re: [PATCH 1/3] drm/amdkfd: Adjust function sequences to avoid unnecessary declarations

2019-11-06 Thread Kuehling, Felix
On 2019-10-30 20:17, Zhao, Yong wrote: > This is cleaner. > > Change-Id: I8cdecad387d8c547a088c6050f77385ee1135be1 > Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling > --- > .../gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c | 19 +++ > 1 file changed, 7 insertions(+), 12 deleti

Re: [PATCH 3/3] drm/amdkfd: Use kernel queue v9 functions for v10

2019-11-06 Thread Kuehling, Felix
On 2019-10-30 20:17, Zhao, Yong wrote: > The kernel queue functions for v9 and v10 are the same except > pm_map_process_v*, which differ slightly, so they should be reused. > This eliminates the need to reapply several patches which were > applied on v9 but not on v10, such as bigger GWS an

Re: [PATCH] drm/amdkfd: Simplify the mmap offset related bit operations

2019-11-06 Thread Kuehling, Felix
On 2019-11-05 18:18, Zhao, Yong wrote: > The new code uses straightforward bit shifts and thus has better > readability. You're missing the MMAP-related code for mmio remapping. In kfd_ioctl_alloc_memory_of_gpu:     /* MMIO is mapped through kfd device * Generate a kfd mmap offse

Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for Hawaii

2019-11-07 Thread Kuehling, Felix
<amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Zhao, Yong Sent: Thursday, November 7, 2019 11:57 AM To: Kuehling, Felix <felix.kuehl...@amd.com>; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 2/3] drm/amdkfd: only ke

Re: [PATCH] drm/amdkfd: Simplify the mmap offset related bit operations

2019-11-07 Thread Kuehling, Felix
On 2019-11-07 12:33, Zhao, Yong wrote: > The new code uses straightforward bit shifts and thus has better readability. > > Change-Id: I0c1f7cca7e24ddb7b4ffe1cb0fa71943828ae373 > Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 17 +++

Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for Hawaii

2019-11-07 Thread Kuehling, Felix
x Deucher <alexdeuc...@gmail.com> Sent: Thursday, November 7, 2019 1:32 PM To: Kuehling, Felix <felix.kuehl...@amd.com> Cc: Zhao, Yong <yong.z...@amd.com>; Russell, Kent <kent.russ...@amd.com>; amd-gfx@lists.freedesktop.org

Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for Hawaii

2019-11-07 Thread Kuehling, Felix
eel terribly strongly about. With that said, the change is Reviewed-by: Felix Kuehling <felix.kuehl...@amd.com> Regards, Felix Regards, Yong From: Kuehling, Felix <felix.kuehl...@amd.com> Sent: Thursday, November 7, 2019 2

Re: [PATCH 2/3] drm/amdkfd: only keep release_mem function for Hawaii

2019-11-07 Thread Kuehling, Felix
On 2019-11-07 13:32, Alex Deucher wrote: > On Thu, Nov 7, 2019 at 12:47 PM Kuehling, Felix > wrote: >> No, please let's not add a new nomenclature for PM4 packet versions. GFX >> versions are agreed on between hardware, firmware, and software and it's >> generally

Re: [PATCH] drm/amdkfd: Use kernel queue v9 functions for v10 (ver2)

2019-11-07 Thread Kuehling, Felix
Are you sure that setting the SQ_SHADER_TBA_HI__TRAP_EN bit on GFXv9 is completely harmless? If the field is not defined, maybe setting the bit makes the address invalid. It's probably worth running that through a PSDB, which would cover Vega10, Vega20 and Arcturus. If it actually works, the pa

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-26 Thread Kuehling, Felix
: Kuehling, Felix ; Yang, Philip ; amd-gfx@lists.freedesktop.org; Jerome Glisse Subject: Re: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4 On 14.09.2018 at 22:21, Felix Kuehling wrote: > On 2018-09-14 01:52 PM, Christian König wrote: >> Am 14.09.2018 um 19:47 schri

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-27 Thread Kuehling, Felix
eeds to update the userptr addresses. If the page tables are still being updated, it will block there even without holding the amdgpu_mn_read_lock. Regards, Felix From: Koenig, Christian Sent: Thursday, September 27, 2018 3:00 AM To: Kuehling, Felix Cc: Yang, Philip ; amd-gfx@lists.freedesktop.

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-27 Thread Kuehling, Felix
e. I don’t see why this requires holding the read-lock until invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets called while the mn read-lock is held in invalidate_range_start notifier. Regards, Felix From: Koenig, Christian Sent: Thursday, September 27, 2018 5:27 AM To: Kuehling, Felix Cc:

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-27 Thread Kuehling, Felix
bles just in the moment > between the check of amdgpu_ttm_tt_userptr_needs_pages() and adding the fence > to the reservation object. I’m not planning to change that. I don’t think there is any need to change it. Regards, Felix From: Koenig, Christian Sent: Thursday, September 27, 2018 7

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-27 Thread Kuehling, Felix
hole argument is that you don’t need to hold the read lock until the invalidate_range_end. Just read_lock and read_unlock in the invalidate_range_start function. Regards, Felix From: Koenig, Christian Sent: Thursday, September 27, 2018 9:22 AM To: Kuehling, Felix Cc: Yang, Philip ; amd-gf

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-27 Thread Kuehling, Felix
er 27, 2018 9:59 AM To: Kuehling, Felix Cc: Yang, Philip ; amd-gfx@lists.freedesktop.org; Jerome Glisse Subject: RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4 Yeah I understand that, but again that won't work. In this case you can end up accessing pages which

RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4

2018-09-27 Thread Kuehling, Felix
I think the answer is here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216 Regards, Felix From: Koenig, Christian Sent: Thursday, September 27, 2018 10:30 AM To: Kuehling, Felix Cc: j.gli...@gmail.com; Yang, Philip ; amd-gfx

Re: [PATCH] drm/amdkfd: Fix incorrect use of process->mm

2018-10-03 Thread Kuehling, Felix
Hi Alex, If it's not too late, I'd like to get this into 4.19. Sorry I missed this fix earlier. Regards, Felix ____ From: Kuehling, Felix Sent: Tuesday, October 2, 2018 6:41:12 PM To: amd-gfx@lists.freedesktop.org Cc: oded.gab...@gmail.com; Kuehl

Re: [PATCH 3/3] drm/amdkfd: Add proper prefix to functions

2018-10-18 Thread Kuehling, Felix
On 2018-10-18 6:03 p.m., Deucher, Alexander wrote: > > Series is: > > Reviewed-by: Alex Deucher > Reviewed-by: Felix Kuehling as well. > > *From:* amd-gfx on behalf of > Lin, Amber > *Sent:* Thursday, October 18, 2018

Re: [PATCH 3/3] drm/amdkfd: Use functions from amdgpu for setting up page table base

2018-10-18 Thread Kuehling, Felix
On 2018-10-18 5:59 p.m., wrote: > > Please include a patch description on 2 and 3, with that fixed, series is: > > Reviewed-by: Alex Deucher > Reviewed-by: Felix Kuehling > > *From:* Zhao, Yong > *Sent:* Thursday, October

Re: [PATCH] drm/amdgpu: fix sdma v4 ring is disabled accidently

2018-10-19 Thread Kuehling, Felix
[+Christian] Should the buffer funcs also use the paging ring? I think that would be important for being able to clear page tables or migrating a BO while handling a page fault. Regards,   Felix On 2018-10-19 3:13 p.m., Yang, Philip wrote: > For sdma v4, there is bug caused by > commit d4e869b6b

Re: [PATCH v2] drm/amdkfd: Add proper prefix to functions

2018-10-19 Thread Kuehling, Felix
On 2018-10-19 11:15 a.m., Lin, Amber wrote: > Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage. > > v2: fix indentation > > Signed-off-by: Amber Lin Reviewed-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 18 +- > drivers/gpu/

Re: [PATCH] drm/amdgpu: fix a missing-check bug

2018-10-22 Thread Kuehling, Felix
The BIOS signature check does not guarantee integrity of the BIOS image either way. As I understand it, the signature is just a magic number. It's not a cryptographic signature. The check is just a sanity check. Therefore this change doesn't add any meaningful protection against the scenario you de

Re: [PATCH 1/3] drm/amdkfd: Remove unnecessary register setting when invalidating tlb in kfd

2018-10-23 Thread Kuehling, Felix
Patch 1 is Reviewed-by: Felix Kuehling Patch 2: I'm not sure we need the "lock" parameter and the invalidation engine parameter. If we're serious about consolidating TLB invalidation between amdgpu and KFD, I think we should use the same invalidation engine and the same lock. Then you also don't

Re: [PATCH 1/2] drm/amdgpu: Reorganize *_flush_gpu_tlb() for kfd to use

2018-10-23 Thread Kuehling, Felix
It occurred to me that the flush_type is a hardware-specific value, but you're using it in a hardware-abstracted interface. If the meaning of the flush type values changes in future HW-generations, we'll need to define an abstract enum that gets translated to the respective HW values in the HW-spec

Re: [PATCH 2/2] drm/amdkfd: page_table_base already have the flags needed

2018-10-23 Thread Kuehling, Felix
The series is Reviewed-by: Felix Kuehling On 2018-10-23 1:00 p.m., Zhao, Yong wrote: > > How about those two patches? > > > Yong > > > *From:* Zhao, Yong > *Sent:* Monday, October 22, 2018 2:33:26 PM > *To:* amd-gfx@lists.f

Re: [PATCH] drm/amdgpu: fix VM leaf walking

2018-10-25 Thread Kuehling, Felix
On 2018-10-25 10:38 a.m., Christian König wrote: > Make sure we don't try to go down further after the leaf walk already > ended. This fixes a crash with a new VM test. > > Signed-off-by: Christian König Reviewed-by: Felix Kuehling Regards,   Felix > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_v

Re: [PATCH] drm/amdgpu/amdkfd: clean up mmhub and gfxhub includes

2018-10-25 Thread Kuehling, Felix
On 2018-10-25 2:27 p.m., Alex Deucher wrote: > On Mon, Oct 22, 2018 at 6:25 PM Alex Deucher wrote: >> Use the appropriate mmhub and gfxhub headers rather than adding >> them to the gmc9 header. >> >> Signed-off-by: Alex Deucher > Ping? Reviewed-by: Felix Kuehling > > Alex > >> --- >> drivers

Re: [PATCH] drm/amdkfd: fix interrupt spin lock

2018-11-02 Thread Kuehling, Felix
On 2018-11-02 9:48 a.m., Christian König wrote: > Vega10 has multiple interrupt rings, I don't think I've seen your code that implements multiple interrupt rings. So it's a bit hard to comment. As I understand it, the only way this could happen is, if the two interrupt rings are handled by differe

Re: [PATCH] drm/amdkfd: fix interrupt spin lock

2018-11-05 Thread Kuehling, Felix
On 2018-11-04 2:20 p.m., Christian König wrote: > On 02.11.18 at 19:59, Kuehling, Felix wrote: >> On 2018-11-02 9:48 a.m., Christian König wrote: >>> Vega10 has multiple interrupt rings, >> I don't think I've seen your code that implements multiple interru

[PATCH 0/9] KFD upstreaming Nov 2018, part 1

2018-11-05 Thread Kuehling, Felix
These are some recent patches that are easy to upstream (part 1). For part 2 (hopefully still this month) I'll need to advance the merging of KFD into amdgpu a little further to avoid upstreaming duplicated data structures that no longer need to be duplicated. Eric Huang (1): drm/amdkfd: change

[PATCH 1/9] drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager

2018-11-05 Thread Kuehling, Felix
From: Yong Zhao This will make reading code much easier. This fixes a few spots missed in a previous commit with the same title. Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 ++-- 1 f

[PATCH 7/9] drm/amdkfd: Fix and simplify sync object handling for KFD

2018-11-05 Thread Kuehling, Felix
The adev parameter in amdgpu_sync_fence and amdgpu_sync_resv is only needed for updating sync->last_vm_update. This breaks if different adevs are passed to calls for the same sync object. Always pass NULL for calls from KFD because sync objects used for KFD don't belong to any particular device, a

[PATCH 5/9] drm/amdgpu: Remove explicit wait after VM validate

2018-11-05 Thread Kuehling, Felix
From: Harish Kasiviswanathan PD or PT might have to be moved during validation and this move has to be completed before updating it. If page table updates are done using SDMA then this serializing is done by SDMA command submission. And if PD/PT updates are done by CPU, then explicit waiting for

[PATCH 6/9] drm/amdgpu: KFD Restore process: Optimize waiting

2018-11-05 Thread Kuehling, Felix
From: Harish Kasiviswanathan Instead of waiting for each KFD BO after validation just wait for the last BO moving fence. Signed-off-by: Harish Kasiviswanathan Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 - 1

[PATCH 2/9] drm/amdkfd: Added Vega12 and Polaris12 for KFD.

2018-11-05 Thread Kuehling, Felix
From: Gang Ba Add Vega12 and Polaris12 device info and device IDs to KFD. Signed-off-by: Gang Ba Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 3 +- drivers/gpu/dr

[PATCH 3/9] drm/amdkfd: Adjust the debug message in KFD ISR

2018-11-05 Thread Kuehling, Felix
From: Yong Zhao This makes the debug message get printed even when there is an early return. Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) d

[PATCH 8/9] drm/amdgpu: Fix KFD doorbell SG BO mapping

2018-11-05 Thread Kuehling, Felix
This change prepares for adding SG BOs that will be used for mapping doorbells into GPUVM address space. This type of BO would be mistaken for an invalid userptr BO. Improve that check to test that it's actually a userptr BO so that SG BOs that are still in the CPU domain can be validated and mapp

[PATCH 4/9] drm/amdkfd: Workaround PASID missing in gfx9 interrupt payload under non HWS

2018-11-05 Thread Kuehling, Felix
From: Yong Zhao This is a known gfx9 HW issue, and this change works around it. Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 25 ++--- 1 file changed, 22 inserti

[PATCH 9/9] drm/amdkfd: change system memory overcommit limit

2018-11-05 Thread Kuehling, Felix
From: Eric Huang This improves the system memory limit by: 1. replacing userptrlimit with a total memory limit that counts TTM memory usage and userptr usage. 2. counting acc size for all BOs. Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/am

Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

2018-11-07 Thread Kuehling, Felix
[+Philip] On 2018-11-07 12:25 a.m., Zhang, Jerry(Junwei) wrote: > On 11/7/18 1:15 PM, Trigger Huang wrote: >> Currently, SDMA page queue is not used under SR-IOV VF, and this >> queue will >> cause ring test failure in amdgpu module reload case. So just disable >> it. >> >> Signed-off-by: Trigger

Re: [PATCH] drm/amdgpu: fix huge page handling on Vega10

2018-11-12 Thread Kuehling, Felix
On 2018-11-12 12:09 p.m., Christian König wrote: > We accidentally set the huge flag on the parent instead of the children. > This caused some VM faults under memory pressure. Reviewed-by: Felix Kuehling I got a bit confused when re-reading this code. Maybe part of it is that cursor.entry is not

RE: [PATCH] drm/amdgpu : Use XGMI mapping when devices on the same hive v2

2018-11-15 Thread Kuehling, Felix
This change is not suitable for amd-staging-drm-next. PCIe P2P was not enabled on amd-staging-drm-next because it's not reliable yet. This change enables it even in situations that are not safe (including small BAR systems). Why are you porting this change to amd-staging-drm-next? Does anyone de

RE: [PATCH] drm/amdgpu: enable paging queue doorbell support

2018-11-15 Thread Kuehling, Felix
You changed the doorbell routing in NBIO. I think that won't work for SR-IOV, because it's not controlled by the guest OS there. We may need to disable paging queue doorbell on Vega10 and Vega12 with SRIOV. For Vega20 we plan to change the doorbell layout before it goes to production (Oak starte

RE: [PATCH] drm/amdgpu : Use XGMI mapping when devices on the same hive v2

2018-11-15 Thread Kuehling, Felix
, Kent Sent: Thursday, November 15, 2018 1:04 PM To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org Cc: Liu, Shaoyun Subject: Re: [PATCH] drm/amdgpu : Use XGMI mapping when devices on the same hive v2 It was merged to 4.19 on Sept 21. It got missed on the 4.20 rebase. Kent KENT RUSSELL Sr

RE: [PATCH] drm/amdgpu : Use XGMI mapping when devices on the same hive v3

2018-11-15 Thread Kuehling, Felix
Sorry, something is still missing here. The new variable vram_base_offset isn't used anywhere. We have some other changes in amd-kfd-staging to use that vram_base_offset that are probably missing on amd-staging-drm-next. This change won't have any effect as is. Regards, Felix -Original M

RE: [PATCH] drm/amdgpu: Fix Kernel Oops triggered by kfdtest

2018-11-15 Thread Kuehling, Felix
Apologies. We already have a fix for this on our internal amd-kfd-staging branch, but it's missing from amd-staging-drm-next. I'll cherry-pick our fix to amd-staging-drm-next and nominate it for drm-fixes. Regards, Felix -Original Message- From: amd-gfx On Behalf Of Joerg Roedel Sent

[PATCH] drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset

2018-11-15 Thread Kuehling, Felix
On Vega20 and other pre-production GPUs, powerplay is not enabled yet. Check for NULL pointers before calling pp_funcs function pointers. Also affects Kaveri. CC: Joerg Roedel Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +-- 1 file changed, 5 insertions
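
The guard described in this patch summary can be sketched as below. The struct and function names are hypothetical stand-ins for illustration only (they are not the amdgpu definitions); the sketch just shows the pattern of checking both the function table and the individual callback before calling through it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function table with one optional callback. */
struct demo_pp_funcs {
    int (*switch_power_profile)(void *handle, int profile, int enable);
};

/* Hypothetical device: the table may be absent when power management is off. */
struct demo_device {
    const struct demo_pp_funcs *pp_funcs;
    void *pp_handle;
};

/* Example callback used for illustration: counts how often it was invoked. */
static int demo_count_calls(void *handle, int profile, int enable)
{
    (void)profile;
    (void)enable;
    (*(int *)handle)++;
    return 0;
}

/* Call the callback only if both the table and the entry are set. */
int demo_switch_profile(struct demo_device *dev, int profile, int enable)
{
    if (!dev->pp_funcs || !dev->pp_funcs->switch_power_profile)
        return 0; /* nothing to do on devices without this hook */

    return dev->pp_funcs->switch_power_profile(dev->pp_handle, profile, enable);
}
```

Treating a missing hook as a successful no-op is a design choice of this sketch; a caller that needs to distinguish "unsupported" from "done" would return a dedicated error code instead.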

Re: [PATCH] drm/amdgpu: enable paging queue doorbell support v3

2018-11-16 Thread Kuehling, Felix
Looks good to me. Reviewed-by: Felix Kuehling I hope Alex or Christian can also review this in case I'm missing something about how doorbells are used in amdgpu. Regards,   Felix On 2018-11-16 2:08 p.m., Yang, Philip wrote: > Because increase SDMA_DOORBELL_RANGE to add new SDMA doorbell for pag

Re: [PATCH 7/9] drm/amdkfd: Fix and simplify sync object handling for KFD

2018-11-16 Thread Kuehling, Felix
OK with this I'll go ahead and push this upstream as well. Thanks,   Felix On 2018-11-05 8:40 p.m., Kuehling, Felix wrote: > The adev parameter in amdgpu_sync_fence and amdgpu_sync_resv is only > needed for updating sync->last_vm_update. This breaks if different > adevs are pas

Re: [PATCH] drm/amdgpu/gfx: use proper offset define for MEC doorbells

2018-11-16 Thread Kuehling, Felix
On 2018-11-16 3:30 p.m., Alex Deucher wrote: > Looks like a copy paste typo. > > Signed-off-by: Alex Deucher Reviewed-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c

Re: [PATCH] drm/amdgpu: enable paging queue doorbell support v3

2018-11-16 Thread Kuehling, Felix
On 2018-11-16 4:35 p.m., Alex Deucher wrote: >> + ring->doorbell_index += 0x400; > I don't quite understand how this works. Why don't we have to adjust > the doorbell range registers in the nbio code? NBIO only looks at the lower 12 bits of the doorbell address. So adding 0

Re: [PATCH] drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset

2018-11-19 Thread Kuehling, Felix
lex Deucher > > > *From:* amd-gfx on behalf of > Kuehling, Felix > *Sent:* Thursday, November 15, 2018 4:56:51 PM > *To:* amd-gfx@lists.freedesktop.org > *Cc:* Kuehling, Felix; Joerg Roedel > *Subject:* [PATCH] drm/amdgpu: Fix oops when > p

Re: [PATCH] drm/amdgpu: enable paging queue doorbell support v2

2018-11-19 Thread Kuehling, Felix
Hi Christian, On 2018-11-19 6:24 a.m., Christian König wrote: > On 15.11.18 at 20:10, Yang, Philip wrote: >> paging queues doorbell index use existing assignment >> sDMA_HI_PRI_ENGINE0/1 >> index, and increase SDMA_DOORBELL_RANGE size from 2 dwords to 4 >> dwords to >> enable the new doorbell ind

[PATCH 2/4] drm/amdkfd: Add NULL-pointer check

2018-11-20 Thread Kuehling, Felix
top_dev->gpu is NULL for CPUs. Avoid dereferencing it if NULL. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c in

[PATCH 0/4] KFD upstreaming Nov 2018, part 2

2018-11-20 Thread Kuehling, Felix
This round adds support for more ROCm memory manager features: * VRAM limit checking to avoid overcommitment * DMABuf import for graphics interoperability * Support for mapping doorbells into GPUVM address space Felix Kuehling (4): drm/amdgpu: Add KFD VRAM limit checking drm/amdkfd: Add NULL-p

[PATCH 3/4] drm/amdkfd: Add DMABuf import functionality

2018-11-20 Thread Kuehling, Felix
This is used for interoperability between ROCm compute and graphics APIs. It allows importing graphics driver BOs into the ROCm SVM address space for zero-copy GPU access. The API is split into two steps (query and import) to allow user mode to manage the virtual address space allocation for the i

[PATCH 1/4] drm/amdgpu: Add KFD VRAM limit checking

2018-11-20 Thread Kuehling, Felix
We don't want KFD processes evicting each other over VRAM usage. Therefore prevent overcommitting VRAM among KFD applications with a per-GPU limit. Also leave enough room for page tables on top of the application memory usage. Signed-off-by: Felix Kuehling Reviewed-by: Eric Huang --- drivers/gp
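
One way to picture a per-GPU limit with headroom for page tables is the accounting sketch below. It is an assumption-laden illustration, not the patch: `demo_vram_budget`, the fixed `pt_reserve` field, and both helpers are hypothetical, and how the real driver sizes the page-table reserve is not shown in this excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-GPU accounting state. */
struct demo_vram_budget {
    uint64_t total;       /* total VRAM on this GPU, in bytes */
    uint64_t pt_reserve;  /* assumed headroom kept free for page tables */
    uint64_t used;        /* bytes currently reserved by applications */
};

/* Reserve memory, refusing requests that would overcommit the GPU. */
int demo_vram_reserve(struct demo_vram_budget *b, uint64_t size)
{
    uint64_t limit = b->total - b->pt_reserve; /* assumes pt_reserve <= total */

    /* Written as a subtraction so that used + size cannot overflow. */
    if (size > limit - b->used)
        return -ENOMEM;

    b->used += size;
    return 0;
}

/* Return previously reserved memory to the budget. */
void demo_vram_release(struct demo_vram_budget *b, uint64_t size)
{
    assert(size <= b->used);
    b->used -= size;
}
```

Because a request is rejected up front, one application can never push another out of VRAM simply by allocating more, which is the stated goal of the limit.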

[PATCH 4/4] drm/amdkfd: Add support for doorbell BOs

2018-11-20 Thread Kuehling, Felix
This allows user mode to map doorbell pages into GPUVM address space. That way GPUs can submit to user mode queues (self-dispatch). Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 59 ++-- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |

Re: [PATCH] drm/amdgpu: Add delay after enable RLC ucode

2018-11-22 Thread Kuehling, Felix
On 2018-11-22 12:03 p.m., Liu, Shaoyun wrote: > Driver shouldn't try to access any GFX registers until RLC is idle. > During the test, it took 12 seconds for RLC to clear the BUSY bit > in RLC_GPM_STAT register which is unacceptable for driver. > As per RLC engineer, it would take RLC Ucode less t

Re: [PATCH] mm: convert totalram_pages, totalhigh_pages and managed_pages to atomic.

2018-11-23 Thread Kuehling, Felix
On 2018-10-22 1:23 p.m., Arun KS wrote: > Remove managed_page_count_lock spinlock and instead use atomic > variables. > > Suggested-by: Michal Hocko > Suggested-by: Vlastimil Babka > Signed-off-by: Arun KS Acked-by: Felix Kuehling Regards,   Felix > > --- > As discussed here, > https://patch

Re: [PATCH] drm/amdgpu: Add delay after enable RLC ucode

2018-11-23 Thread Kuehling, Felix
On 2018-11-22 1:22 p.m., Liu, Shaoyun wrote: > Driver shouldn't try to access any GFX registers until RLC is idle. > During the test, it took 12 seconds for RLC to clear the BUSY bit > in RLC_GPM_STAT register which is un-acceptable for driver. > As per RLC engineer, it would take RLC Ucode less th

[PATCH 2/2] drm/amdgpu: Avoid endless loop in GPUVM fragment processing

2018-11-26 Thread Kuehling, Felix
Don't bounce back to the root level for fragment processing, because huge pages are not supported at that level. This is unlikely to happen with the default VM size on Vega, but can be exposed by limiting the VM size with the amdgpu.vm_size module parameter. Signed-off-by: Felix Kuehling --- dri

[PATCH 1/2] drm/amdgpu: Cast to uint64_t before left shift

2018-11-26 Thread Kuehling, Felix
Avoid potential integer overflows with left shift in huge-page mapping code by casting the operand to uint64_t first. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amd
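The class of overflow this patch guards against can be shown with a small user-space sketch. The helper names `shift_narrow()` and `shift_wide()` are hypothetical and do not come from amdgpu_vm.c. The sketch only shows that a shift on a 32-bit operand wraps before the result is widened to 64 bits.

```c
#include <stdint.h>

/* Shift is evaluated in 32 bits; the result wraps modulo 2^32
 * and is only widened to 64 bits afterwards. */
static uint64_t shift_narrow(uint32_t pages, unsigned int shift)
{
    return pages << shift;
}

/* Casting the operand first makes the shift happen in 64 bits. */
static uint64_t shift_wide(uint32_t pages, unsigned int shift)
{
    return (uint64_t)pages << shift;
}
```

For example, with `pages = 1u << 20` and `shift = 12`, `shift_narrow()` yields 0 because bit 32 is lost, while `shift_wide()` yields 4294967296 (2^32). Both calls assume a shift count below 32, since larger counts on a 32-bit operand are undefined in C.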

RE: [PATCH 02/11] drm/amdgpu: send IVs to the KFD only after processing them v2

2018-11-30 Thread Kuehling, Felix
Won't this break VM fault handling in KFD? I don't see a way with the current code that you can leave some VM faults for KFD to process. If we could consider VM faults with VMIDs 8-15 as not handled in amdgpu and leave them for KFD to process, then this could work. As far as I can tell, the onl

Re: [PATCH 02/11] drm/amdgpu: send IVs to the KFD only after processing them v2

2018-12-03 Thread Kuehling, Felix
_CP_BAD_OPCODE_ERROR (183) GFX_9_0__SRCID__SQ_INTERRUPT_ID (239) 239 is used for signaling events from shaders and can be very frequent. Triggering an error message on those interrupts would be bad. Regards,   Felix > > Regards, > Christian. > > Am 30.11.18 um 17:31 schrieb Kuehling,

Re: [PATCH] drm/amdgpu: Update XGMI node print

2018-12-03 Thread Kuehling, Felix
Shaoyun, FYI Acked-by: Felix Kuehling On 2018-12-03 3:28 p.m., Deucher, Alexander wrote: > > Acked-by: Alex Deucher > > > *From:* amd-gfx on behalf of > Andrey Grodzovsky > *Sent:* Monday, December 3, 2018 3:03:41 PM >

Re: [Intel-gfx] [PATCH RFC 2/5] cgroup: Add mechanism to register vendor specific DRM devices

2018-12-03 Thread Kuehling, Felix
On 2018-11-28 4:14 a.m., Joonas Lahtinen wrote: > Quoting Ho, Kenny (2018-11-27 17:41:17) >> On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen >> wrote: >>> I think a more abstract property "% of GPU (processing power)" might >>> be a more universal approach. One can then implement that through >>

Re: [PATCH 0/4] KFD upstreaming Nov 2018, part 2

2018-12-03 Thread Kuehling, Felix
Ping. Any comments, R-b, A-b? On 2018-11-20 10:07 p.m., Kuehling, Felix wrote: > This round adds support for more ROCm memory manager features: > * VRAM limit checking to avoid overcommitment > * DMABuf import for graphics interoperability > * Support for mapping doorbells into G

Re: [PATCH 2/2] drm/amdgpu: replace get_user_pages with HMM address mirror helpers v2

2018-12-03 Thread Kuehling, Felix
See comments inline. I didn't review the amdgpu_cs and amdgpu_gem parts as I don't know them very well. On 2018-12-03 3:19 p.m., Yang, Philip wrote: > Use HMM helper function hmm_vma_fault() to get physical pages backing > userptr and start CPU page table update track of those pages. Then use > hm

Re: [PATCH 08/10] drm/amdgpu: add support for processing IH ring 1 & 2

2018-12-05 Thread Kuehling, Felix
Depending on the interrupt ring, the IRQ dispatch and processing functions will run in interrupt context or in a worker thread. Is there a way for the processing functions to find out which context it's running in? That may influence decisions whether to process interrupts in the same thread or sc

Re: [PATCH 09/10] drm/amdgpu: add support for self irq on Vega10

2018-12-05 Thread Kuehling, Felix
On 2018-12-05 4:15 a.m., Christian König wrote: > This finally enables processing of ring 1 & 2. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 68 -- > 1 file changed, 63 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/

Re: [PATCH 01/10] drm/amdgpu: send IVs to the KFD only after processing them v3

2018-12-05 Thread Kuehling, Felix
Patches 1-3 are Reviewed-by: Felix Kuehling I applied all 10 patches and tested them with kfdtest on Fiji and Vega10. It seems to not break anything obvious. I think I found a problem in patch 9 and have a question about patch 8 regarding the context in which interrupt processing functions would

Re: [PATCH 4/9] drm/amdgpu: Add a bitmask in amdgpu_ctx_mgr

2018-12-06 Thread Kuehling, Felix
On 2018-12-06 6:32 a.m., Rex Zhu wrote: > used to manager the reserverd vm space. > > Signed-off-by: Rex Zhu > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 8 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 4 +++- > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +- > 3 files changed, 1

RE: [PATCH 04/16 v2] drm/amd/display: Add tracing to dc

2018-12-06 Thread Kuehling, Felix
This change seems to be breaking the build for me. I'm getting errors like this: CC [M] drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o In file included from ../include/trace/events/tlb.h:9:0, from ../arch/x86/include/asm/mmu_context.h:10, from ../i

Re: [PATCH 04/16 v2] drm/amd/display: Add tracing to dc

2018-12-07 Thread Kuehling, Felix
On 2018-12-07 9:46 a.m., Wentland, Harry wrote: > On 2018-12-07 9:41 a.m., Wentland, Harry wrote: >> On 2018-12-07 12:40 a.m., Kuehling, Felix wrote: >>> This change seems to be breaking the build for me. I'm getting errors like >>> this: >>> >>>

Re: [PATCH 1/2] drm/amdgpu: add some additional vega10 pci ids

2018-12-07 Thread Kuehling, Felix
Can you add them to amdkfd/kfd_device.c as well while you're at it? Thanks,   Felix On 2018-12-07 4:03 p.m., Alex Deucher wrote: > New vega ids. > > Signed-off-by: Alex Deucher > Cc: sta...@vger.kernel.org > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++ > 1 file changed, 6 insertions(+
