, 2022 11:21 PM
To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org
Cc: Phillips, Daniel ; Ji, Ruili ;
Liu, Aaron
Subject: RE: [PATCH] drm/amdkfd: Remove Align VRAM allocations to 1MB on APU
ASIC
[AMD Official Use Only - General]
Thanks for the comment, Felix. I will debug this issue further
egards,
Felix
-Original Message-
From: Huang, JinHuiEric
Sent: Thursday, April 27, 2023 14:58
To: Kuehling, Felix ; Koenig, Christian
; Christian König ;
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating
preemptible DMABuf imports
Hi
I don't see why this problem would be specific to Arcturus. I don't see
any excessive allocations on the stack either. Also the code involved
here hasn't changed recently.
Are you using some weird kernel config with a smaller stack? Is it
specific to a compiler version or some optimization flag
On 2019-10-18 10:27 a.m., Yang, Philip wrote:
> If the device is locked for suspend and resume, kfd open should fail with
> -EAGAIN without creating a process. Otherwise the application exit that
> releases the process will hang waiting for resume to finish if the suspend
> and resume is stuck somewhere.
On 2019-10-18 1:36 p.m., Yang, Philip wrote:
> If the device is locked for suspend and resume, kfd open should fail with
> -EAGAIN without creating a process. Otherwise the application exit that
> releases the process will hang waiting for resume to finish if the suspend
> and resume is stuck somewhere. T
You can squash the two reverts into a single commit so you avoid
reintroducing a broken intermediate state. Mention both reverted commits
in the squashed commit description. Checkpatch.pl prefers a different
format for quoting reverted commits. Run checkpatch.pl on your commit to
see a proper e
On 2019-10-18 4:29 p.m., Kim, Jonathan wrote:
> reverting the following changes:
> commit 7dd2eb31fcd5 ("drm/amdgpu: fix compiler warnings for df perfmons")
> commit 54275cd1649f ("drm/amdgpu: disable c-states on xgmi perfmons")
>
> perf events use spin-locks. embedded smu messages have potential
k on this myself. I'll create a ticket and see
if I can find someone to investigate.
Thanks,
Felix
>
> Andrey
>
> On 10/17/19 5:29 PM, Kuehling, Felix wrote:
>> I don't see why this problem would be specific to Arcturus. I don't see
>> any excessive
ters or send an
updated runlist to the HWS.
When the process is resumed at the end of the reset/suspend/eviction,
that's when any newly created queues would get mapped to the hardware.
Regards,
Felix
>
> Regards,
> Oak
>
> -Original Message-
> From: amd-gfx On
On 2019-10-21 5:04 p.m., Yang, Philip wrote:
> If device reset/suspend/resume failed for some reason, the dqm lock is
> held forever and this causes a deadlock. Below is a kernel backtrace when
> an application opens kfd after suspend/resume failed.
>
> Instead of holding dqm lock in pre_reset and releasing
suspended. But I'd
like to see some safeguards in place to make sure those assumptions are
never violated.
Regards,
Felix
>
> Regards,
> Oak
>
> -Original Message-
> From: amd-gfx On Behalf Of Kuehling,
> Felix
> Sent: Monday, October 21, 2019 9:04 PM
On 2019-10-22 14:28, Yang, Philip wrote:
> If device reset/suspend/resume failed for some reason, the dqm lock is
> held forever and this causes a deadlock. Below is a kernel backtrace when
> an application opens kfd after suspend/resume failed.
>
> Instead of holding dqm lock in pre_reset and releasing dqm
On 2019-10-24 14:46, Sierra Guiza, Alejandro (Alex) wrote:
> The bitmap in cu_info structure is defined as a 4x4 size array. In
> Arcturus, this matrix is initialized as a 4x2, based on the 8 shaders.
> In the gpu cache filling initialization, the access to the bitmap matrix
> was done as an 8x1 ins
On 2019-10-24 5:14 p.m., Zhao, Yong wrote:
> The KIQ is on the second MEC and its reservation is covered in the
> latter logic, so no need to reserve its bit twice.
>
> Change-Id: Ieee390953a60c7d43de5a9aec38803f1f583a4a9
> Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
> ---
> drivers
On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote:
> From: Jason Gunthorpe
>
> find_vma() must be called under the mmap_sem, reorganize this code to
> do the vma check after entering the lock.
>
> Further, fix the unlocked use of struct task_struct's mm, instead use
> the mm from hmm_mirror which has
I haven't had enough time to fully understand the deferred logic in this
change. I spotted one problem, see comments inline.
On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote:
> From: Jason Gunthorpe
>
> Of the 13 users of mmu_notifiers, 8 of them use only
> invalidate_range_start/end() and immedia
On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote:
> From: Jason Gunthorpe
>
> Remove the interval tree in the driver and rely on the tree maintained by
> the mmu_notifier for delivering mmu_notifier invalidation callbacks.
>
> For some reason amdgpu has a very complicated arrangement where it tries
On 2019-10-30 9:52 a.m., Christian König wrote:
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>> The issue is that PT BOs are not freed when unmapping VA,
>> which causes accumulated VRAM usage to become huge in some
>> memory stress tests, such as the kfd big buffer stress test.
>> Function amdgpu_vm_bo_update
NAK. This won't work for several reasons.
The mmap_offset is used as offset parameter in the mmap system call. If
you check the man page of mmap, you'll see that "offset must be a
multiple of the page size". Therefore the PAGE_SHIFT is necessary.
In the case of doorbell offsets, the offset is p
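To illustrate the page-alignment requirement quoted from the mmap man page, here is a minimal sketch. It is not the KFD code: the helper names are hypothetical and a 4 KiB page size is assumed. It only shows why shifting a handle by the page shift always yields an offset that mmap will accept, while an unshifted value generally does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define EXAMPLE_PAGE_SHIFT 12
#define EXAMPLE_PAGE_SIZE  (1ULL << EXAMPLE_PAGE_SHIFT)

/* Hypothetical helper: encode an opaque handle as an mmap offset. */
static uint64_t example_handle_to_offset(uint64_t handle)
{
    return handle << EXAMPLE_PAGE_SHIFT;
}

/* mmap requires the offset to be a multiple of the page size. */
static bool example_offset_is_valid(uint64_t offset)
{
    return (offset & (EXAMPLE_PAGE_SIZE - 1)) == 0;
}
```

Any handle passed through example_handle_to_offset() is page aligned by construction, which is the property the shift is there to guarantee.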
On 2019-11-01 4:48 p.m., Zhao, Yong wrote:
> The new code is much cleaner and results in better readability.
>
> Change-Id: I0c1f7cca7e24ddb7b4ffe1cb0fa71943828ae373
> Signed-off-by: Yong Zhao
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 13 +++--
> drivers/gpu/drm/amd/amdkfd/kfd_
On 2019-11-05 5:03 p.m., Huang, JinHuiEric wrote:
> Using unified VBIOS causes a performance drop in the SR-IOV environment.
> The fix is switching to another register instead.
>
> Signed-off-by: Eric Huang
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 18 +++---
> 1 file changed, 15 inser
Signed-off-by: Yong Zhao
Signed-off-by: Felix Kuehling
Acked-by: Oded Gabbay
Signed-off-by: Oded Gabbay
Regards,
Felix
>
> Regards,
> Yong
> ----
> *From:* Kuehling, Felix
> *Sent:*
On 2019-11-05 5:26 p.m., Huang, JinHuiEric wrote:
> Using unified VBIOS causes a performance drop in the SR-IOV environment.
> The fix is switching to another register instead.
>
> Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 19
On 2019-10-30 20:17, Zhao, Yong wrote:
> release_mem won't be used at all on GFX9 and GFX10, so delete it.
Hawaii was GFXv7. So we're not using the release_mem packet on GFXv8
either. Why arbitrarily limit this change to GFXv9 and 10?
Regards,
Felix
>
> Change-Id: I13787a8a29b83e7516c582a740
On 2019-10-30 20:17, Zhao, Yong wrote:
> This is cleaner.
>
> Change-Id: I8cdecad387d8c547a088c6050f77385ee1135be1
> Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
> ---
> .../gpu/drm/amd/amdkfd/kfd_kernel_queue_v9.c | 19 +++
> 1 file changed, 7 insertions(+), 12 deleti
On 2019-10-30 20:17, Zhao, Yong wrote:
> The kernel queue functions for v9 and v10 are the same except
> pm_map_process_v*, which differ slightly, so they should be reused.
> This eliminates the need to reapply several patches which were
> applied on v9 but not on v10, such as bigger GWS an
On 2019-11-05 18:18, Zhao, Yong wrote:
> The new code uses straightforward bit shifts and thus has better
> readability.
You're missing the MMAP-related code for mmio remapping. In
kfd_ioctl_alloc_memory_of_gpu:
/* MMIO is mapped through kfd device
* Generate a kfd mmap offse
<mailto:amd-gfx-boun...@lists.freedesktop.org>
On Behalf Of Zhao, Yong
Sent: Thursday, November 7, 2019 11:57 AM
To: Kuehling, Felix <mailto:felix.kuehl...@amd.com>;
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH 2/3] drm/amdkfd: only ke
On 2019-11-07 12:33, Zhao, Yong wrote:
> The new code uses straightforward bit shifts and thus has better readability.
>
> Change-Id: I0c1f7cca7e24ddb7b4ffe1cb0fa71943828ae373
> Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 17 +++
x Deucher <mailto:alexdeuc...@gmail.com>
Sent: Thursday, November 7, 2019 1:32 PM
To: Kuehling, Felix <mailto:felix.kuehl...@amd.com>
Cc: Zhao, Yong <mailto:yong.z...@amd.com>; Russell, Kent
<mailto:kent.russ...@amd.com>;
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@list
eel terribly strongly about.
With that said, the change is
Reviewed-by: Felix Kuehling
<mailto:felix.kuehl...@amd.com>
Regards,
Felix
Regards,
Yong
From: Kuehling, Felix <mailto:felix.kuehl...@amd.com>
Sent: Thursday, November 7, 2019 2
On 2019-11-07 13:32, Alex Deucher wrote:
> On Thu, Nov 7, 2019 at 12:47 PM Kuehling, Felix
> wrote:
>> No, please let's not add a new nomenclature for PM4 packet versions. GFX
>> versions are agreed on between hardware, firmware, and software and it's
>> generally
Are you sure that setting the SQ_SHADER_TBA_HI__TRAP_EN bit on GFXv9 is
completely harmless? If the field is not defined, maybe setting the bit
makes the address invalid. It's probably worth running that through a
PSDB, which would cover Vega10, Vega20 and Arcturus.
If it actually works, the pa
: Kuehling, Felix ; Yang, Philip
; amd-gfx@lists.freedesktop.org; Jerome Glisse
Subject: Re: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu
notifier v4
Am 14.09.2018 um 22:21 schrieb Felix Kuehling:
> On 2018-09-14 01:52 PM, Christian König wrote:
>> Am 14.09.2018 um 19:47 schri
eeds to update the userptr addresses. If
the page tables are still being updated, it will block there even without
holding the amdgpu_mn_read_lock.
Regards,
Felix
From: Koenig, Christian
Sent: Thursday, September 27, 2018 3:00 AM
To: Kuehling, Felix
Cc: Yang, Philip ; amd-gfx@lists.freedesktop.
e.
I don’t see why this requires holding the read-lock until invalidate_range_end.
amdgpu_ttm_tt_affect_userptr gets called while the mn read-lock is held in
invalidate_range_start notifier.
Regards,
Felix
From: Koenig, Christian
Sent: Thursday, September 27, 2018 5:27 AM
To: Kuehling, Felix
Cc:
bles just in the moment
> between the check of amdgpu_ttm_tt_userptr_needs_pages() and adding the fence
> to the reservation object.
I’m not planning to change that. I don’t think there is any need to change it.
Regards,
Felix
From: Koenig, Christian
Sent: Thursday, September 27, 2018 7
hole argument is that you don’t need to hold the read lock until the
invalidate_range_end. Just read_lock and read_unlock in the
invalidate_range_start function.
Regards,
Felix
From: Koenig, Christian
Sent: Thursday, September 27, 2018 9:22 AM
To: Kuehling, Felix
Cc: Yang, Philip ; amd-gf
er 27, 2018 9:59 AM
To: Kuehling, Felix
Cc: Yang, Philip ; amd-gfx@lists.freedesktop.org; Jerome
Glisse
Subject: RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu
notifier v4
Yeah I understand that, but again that won't work.
In this case you can end up accessing pages which
I think the answer is here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216
Regards,
Felix
From: Koenig, Christian
Sent: Thursday, September 27, 2018 10:30 AM
To: Kuehling, Felix
Cc: j.gli...@gmail.com; Yang, Philip ;
amd-gfx
Hi Alex,
If it's not too late, I'd like to get this into 4.19. Sorry I missed this fix
earlier.
Regards,
Felix
____
From: Kuehling, Felix
Sent: Tuesday, October 2, 2018 6:41:12 PM
To: amd-gfx@lists.freedesktop.org
Cc: oded.gab...@gmail.com; Kuehl
On 2018-10-18 6:03 p.m., Deucher, Alexander wrote:
>
> Series is:
>
> Reviewed-by: Alex Deucher
>
Reviewed-by: Felix Kuehling
as well.
>
> *From:* amd-gfx on behalf of
> Lin, Amber
> *Sent:* Thursday, October 18, 2018
On 2018-10-18 5:59 p.m., wrote:
>
> Please include a patch description on 2 and 3, with that fixed, series is:
>
> Reviewed-by: Alex Deucher
>
Reviewed-by: Felix Kuehling
>
> *From:* Zhao, Yong
> *Sent:* Thursday, October
[+Christian]
Should the buffer funcs also use the paging ring? I think that would be
important for being able to clear page tables or migrating a BO while
handling a page fault.
Regards,
Felix
On 2018-10-19 3:13 p.m., Yang, Philip wrote:
> For sdma v4, there is bug caused by
> commit d4e869b6b
On 2018-10-19 11:15 a.m., Lin, Amber wrote:
> Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage.
>
> v2: fix indentation
>
> Signed-off-by: Amber Lin
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 18 +-
> drivers/gpu/
The BIOS signature check does not guarantee integrity of the BIOS image
either way. As I understand it, the signature is just a magic number.
It's not a cryptographic signature. The check is just a sanity check.
Therefore this change doesn't add any meaningful protection against the
scenario you de
Patch 1 is Reviewed-by: Felix Kuehling
Patch 2: I'm not sure we need the "lock" parameter and the invalidation
engine parameter. If we're serious about consolidating TLB invalidation
between amdgpu and KFD, I think we should use the same invalidation
engine and the same lock. Then you also don't
It occurred to me that the flush_type is a hardware-specific value, but
you're using it in a hardware-abstracted interface. If the meaning of
the flush type values changes in future HW-generations, we'll need to
define an abstract enum that gets translated to the respective HW values
in the HW-spec
The series is Reviewed-by: Felix Kuehling
On 2018-10-23 1:00 p.m., Zhao, Yong wrote:
>
> How about those two patches?
>
>
> Yong
>
>
> *From:* Zhao, Yong
> *Sent:* Monday, October 22, 2018 2:33:26 PM
> *To:* amd-gfx@lists.f
On 2018-10-25 10:38 a.m., Christian König wrote:
> Make sure we don't try to go down further after the leave walk already
> ended. This fixes a crash with a new VM test.
>
> Signed-off-by: Christian König
Reviewed-by: Felix Kuehling
Regards,
Felix
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_v
On 2018-10-25 2:27 p.m., Alex Deucher wrote:
> On Mon, Oct 22, 2018 at 6:25 PM Alex Deucher wrote:
>> Use the appropriate mmhub and gfxhub headers rather than adding
>> them to the gmc9 header.
>>
>> Signed-off-by: Alex Deucher
> Ping?
Reviewed-by: Felix Kuehling
>
> Alex
>
>> ---
>> drivers
On 2018-11-02 9:48 a.m., Christian König wrote:
> Vega10 has multiple interrupt rings,
I don't think I've seen your code that implements multiple interrupt
rings. So it's a bit hard to comment. As I understand it, the only way
this could happen is, if the two interrupt rings are handled by
differe
On 2018-11-04 2:20 p.m., Christian König wrote:
> Am 02.11.18 um 19:59 schrieb Kuehling, Felix:
>> On 2018-11-02 9:48 a.m., Christian König wrote:
>>> Vega10 has multiple interrupt rings,
>> I don't think I've seen your code that implements multiple interru
These are some recent patches that are easy to upstream (part 1). For
part 2 (hopefully still this month) I'll need to advance the merging
of KFD into amdgpu a little further to avoid upstreaming duplicated
data structures that no longer need to be duplicated.
Eric Huang (1):
drm/amdkfd: change
From: Yong Zhao
This will make reading code much easier. This fixes a few spots missed in a
previous commit with the same title.
Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 ++--
1 f
The adev parameter in amdgpu_sync_fence and amdgpu_sync_resv is only
needed for updating sync->last_vm_update. This breaks if different
adevs are passed to calls for the same sync object.
Always pass NULL for calls from KFD because sync objects used for
KFD don't belong to any particular device, a
From: Harish Kasiviswanathan
PD or PT might have to be moved during validation and this move has to be
completed before updating it. If page table updates are done using SDMA
then this serializing is done by SDMA command submission.
And if PD/PT updates are done by CPU, then explicit waiting for
From: Harish Kasiviswanathan
Instead of waiting for each KFD BO after validation just wait for the
last BO moving fence.
Signed-off-by: Harish Kasiviswanathan
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 -
1
From: Gang Ba
Add Vega12 and Polaris12 device info and device IDs to KFD.
Signed-off-by: Gang Ba
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 3 +-
drivers/gpu/dr
From: Yong Zhao
This makes the debug message get printed even when there is an early return.
Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
d
This change prepares for adding SG BOs that will be used for mapping
doorbells into GPUVM address space.
This type of BO would be mistaken for an invalid userptr BO. Improve
that check to test that it's actually a userptr BO so that SG BOs that
are still in the CPU domain can be validated and mapp
From: Yong Zhao
This is a known gfx9 HW issue, and this change can perfectly work around
the issue.
Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 25 ++---
1 file changed, 22 inserti
From: Eric Huang
This improves the system limit by:
1. replacing userptrlimit with a total memory limit that
counts TTM memory usage and userptr usage.
2. counting acc size for all BOs.
Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/am
[+Philip]
On 2018-11-07 12:25 a.m., Zhang, Jerry(Junwei) wrote:
> On 11/7/18 1:15 PM, Trigger Huang wrote:
>> Currently, SDMA page queue is not used under SR-IOV VF, and this
>> queue will
>> cause ring test failure in amdgpu module reload case. So just disable
>> it.
>>
>> Signed-off-by: Trigger
On 2018-11-12 12:09 p.m., Christian König wrote:
> We accidentally set the huge flag on the parent instead of the children.
> This caused some VM faults under memory pressure.
Reviewed-by: Felix Kuehling
I got a bit confused when re-reading this code. Maybe part of it is that
cursor.entry is not
This change is not suitable for amd-staging-drm-next. PCIe P2P was not enabled
on amd-staging-drm-next because it's not reliable yet. This change enables it
even in situations that are not safe (including small BAR systems).
Why are you porting this change to amd-staging-drm-next? Does anyone de
You changed the doorbell routing in NBIO. I think that won't work for SR-IOV,
because it's not controlled by the guest OS there. We may need to disable
paging queue doorbell on Vega10 and Vega12 with SRIOV. For Vega20 we plan to
change the doorbell layout before it goes to production (Oak starte
, Kent
Sent: Thursday, November 15, 2018 1:04 PM
To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun
Subject: Re: [PATCH] drm/amdgpu : Use XGMI mapping when devices on the same
hive v2
It was merged to 4.19 on Sept 21. It got missed on the 4.20 rebase.
Kent
KENT RUSSELL
Sr
Sorry, something is still missing here. The new variable vram_base_offset isn't
used anywhere. We have some other changes in amd-kfd-staging to use that
vram_base_offset that are probably missing on amd-staging-drm-next. This change
won't have any effect as is.
Regards,
Felix
-Original M
Apologies. We already have a fix for this on our internal amd-kfd-staging
branch, but it's missing from amd-staging-drm-next. I'll cherry-pick our fix to
amd-staging-drm-next and nominate it for drm-fixes.
Regards,
Felix
-Original Message-
From: amd-gfx On Behalf Of Joerg Roedel
Sent
On Vega20 and other pre-production GPUs, powerplay is not enabled yet.
Check for NULL pointers before calling pp_funcs function pointers.
Also affects Kaveri.
CC: Joerg Roedel
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +--
1 file changed, 5 insertions
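The NULL check described in this commit message follows a common pattern, sketched below. This is only an illustration with hypothetical struct and function names, not the amdgpu_amdkfd.c change itself: both the function table and the individual callback are tested before the call, and a fallback value is returned when either is missing.

```c
#include <stddef.h>

/* Hypothetical function table, loosely modelled on a pp_funcs-style struct. */
struct example_pp_funcs {
    int (*get_sclk)(void *handle);
};

struct example_dev {
    const struct example_pp_funcs *pp_funcs; /* may be NULL if powerplay is off */
    void *pp_handle;
};

/* Stand-in callback used only to exercise the sketch. */
static int example_fixed_sclk(void *handle)
{
    (void)handle;
    return 1000;
}

/* Returns the reported clock, or 0 if the table or the callback is missing. */
static int example_get_sclk(const struct example_dev *dev)
{
    if (!dev->pp_funcs || !dev->pp_funcs->get_sclk)
        return 0;
    return dev->pp_funcs->get_sclk(dev->pp_handle);
}
```

Returning 0 as the fallback is an arbitrary choice for the sketch; a real caller would pick whatever value is safe for its consumers.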
Looks good to me. Reviewed-by: Felix Kuehling
I hope Alex or Christian can also review this in case I'm missing
something about how doorbells are used in amdgpu.
Regards,
Felix
On 2018-11-16 2:08 p.m., Yang, Philip wrote:
> Because increase SDMA_DOORBELL_RANGE to add new SDMA doorbell for pag
OK with
this I'll go ahead and push this upstream as well.
Thanks,
Felix
On 2018-11-05 8:40 p.m., Kuehling, Felix wrote:
> The adev parameter in amdgpu_sync_fence and amdgpu_sync_resv is only
> needed for updating sync->last_vm_update. This breaks if different
> adevs are pas
On 2018-11-16 3:30 p.m., Alex Deucher wrote:
> Looks like a copy paste typo.
>
> Signed-off-by: Alex Deucher
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
On 2018-11-16 4:35 p.m., Alex Deucher wrote:
>> + ring->doorbell_index += 0x400;
> I don't quite understand how this works. Why don't we have to adjust
> the doorbell range registers in the nbio code?
NBIO only looks at the lower 12 bits of the doorbell address. So adding
0
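A small sketch of the masking argument, under an assumption that is not stated in the mail: that one doorbell index corresponds to one 32-bit dword (4 bytes) of doorbell address space. The names are hypothetical. If that assumption holds, an index offset of 0x400 is 0x1000 bytes, which leaves the lower 12 address bits untouched.

```c
#include <stdint.h>

/* Lower 12 bits of the doorbell address, as described in the mail. */
#define EXAMPLE_NBIO_ADDR_MASK 0xFFFu

/* Assumption: one doorbell index is one 32-bit dword, i.e. 4 bytes. */
static uint32_t example_doorbell_byte_addr(uint32_t index)
{
    return index * 4u;
}

/* The part of the address that a 12-bit comparison would see. */
static uint32_t example_nbio_view(uint32_t index)
{
    return example_doorbell_byte_addr(index) & EXAMPLE_NBIO_ADDR_MASK;
}
```

Under this assumption, example_nbio_view(i) and example_nbio_view(i + 0x400) are equal for any index i.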
lex Deucher
>
>
> *From:* amd-gfx on behalf of
> Kuehling, Felix
> *Sent:* Thursday, November 15, 2018 4:56:51 PM
> *To:* amd-gfx@lists.freedesktop.org
> *Cc:* Kuehling, Felix; Joerg Roedel
> *Subject:* [PATCH] drm/amdgpu: Fix oops when
> p
Hi Christian,
On 2018-11-19 6:24 a.m., Christian König wrote:
> Am 15.11.18 um 20:10 schrieb Yang, Philip:
>> paging queues doorbell index use existing assignment
>> sDMA_HI_PRI_ENGINE0/1
>> index, and increase SDMA_DOORBELL_RANGE size from 2 dwords to 4
>> dwords to
>> enable the new doorbell ind
top_dev->gpu is NULL for CPUs. Avoid dereferencing it if NULL.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
in
This round adds support for more ROCm memory manager features:
* VRAM limit checking to avoid overcommitment
* DMABuf import for graphics interoperability
* Support for mapping doorbells into GPUVM address space
Felix Kuehling (4):
drm/amdgpu: Add KFD VRAM limit checking
drm/amdkfd: Add NULL-p
This is used for interoperability between ROCm compute and graphics
APIs. It allows importing graphics driver BOs into the ROCm SVM
address space for zero-copy GPU access.
The API is split into two steps (query and import) to allow user mode
to manage the virtual address space allocation for the i
We don't want KFD processes evicting each other over VRAM usage.
Therefore prevent overcommitting VRAM among KFD applications with
a per-GPU limit. Also leave enough room for page tables on top
of the application memory usage.
Signed-off-by: Felix Kuehling
Reviewed-by: Eric Huang
---
drivers/gp
This allows user mode to map doorbell pages into GPUVM address space.
That way GPUs can submit to user mode queues (self-dispatch).
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 59 ++--
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |
On 2018-11-22 12:03 p.m., Liu, Shaoyun wrote:
> Driver shouldn't try to access any GFX registers until RLC is idle.
> During the test, it took 12 seconds for RLC to clear the BUSY bit
> in the RLC_GPM_STAT register, which is unacceptable for the driver.
> As per RLC engineer, it would take RLC Ucode less t
On 2018-10-22 1:23 p.m., Arun KS wrote:
> Remove managed_page_count_lock spinlock and instead use atomic
> variables.
>
> Suggested-by: Michal Hocko
> Suggested-by: Vlastimil Babka
> Signed-off-by: Arun KS
Acked-by: Felix Kuehling
Regards,
Felix
>
> ---
> As discussed here,
> https://patch
On 2018-11-22 1:22 p.m., Liu, Shaoyun wrote:
> Driver shouldn't try to access any GFX registers until RLC is idle.
> During the test, it took 12 seconds for RLC to clear the BUSY bit
> in the RLC_GPM_STAT register, which is unacceptable for the driver.
> As per RLC engineer, it would take RLC Ucode less th
Don't bounce back to the root level for fragment processing, because
huge pages are not supported at that level. This is unlikely to happen
with the default VM size on Vega, but can be exposed by limiting the
VM size with the amdgpu.vm_size module parameter.
Signed-off-by: Felix Kuehling
---
dri
Avoid potential integer overflows with left shift in huge-page mapping
code by casting the operand to uint64_t first.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amd
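The overflow this commit message refers to can be reproduced in isolation. The sketch below is not taken from amdgpu_vm.c; the function names are made up, and unsigned 32-bit operands are used so the narrow case is well defined in C. It shows that widening the result after the shift is too late, while casting the operand first keeps the high bits.

```c
#include <stdint.h>

/* Shift is evaluated in 32 bits; bits pushed past bit 31 are lost
 * before the result is widened to 64 bits on return. */
static uint64_t example_shift_narrow(uint32_t pages, unsigned int shift)
{
    return pages << shift;
}

/* Operand is widened first, so the shift is evaluated in 64 bits. */
static uint64_t example_shift_wide(uint32_t pages, unsigned int shift)
{
    return (uint64_t)pages << shift;
}
```

With pages = 0x100000 and shift = 12, the narrow version yields 0 on platforms where int is 32 bits, and the wide version yields 0x100000000.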
Won't this break VM fault handling in KFD? I don't see a way with the current
code that you can leave some VM faults for KFD to process. If we could consider
VM faults with VMIDs 8-15 as not handled in amdgpu and leave them for KFD to
process, then this could work.
As far as I can tell, the onl
_CP_BAD_OPCODE_ERROR (183)
GFX_9_0__SRCID__SQ_INTERRUPT_ID (239)
239 is used for signaling events from shaders and can be very frequent.
Triggering an error message on those interrupts would be bad.
Regards,
Felix
>
> Regards,
> Christian.
>
> Am 30.11.18 um 17:31 schrieb Kuehling,
Shaoyun, FYI
Acked-by: Felix Kuehling
On 2018-12-03 3:28 p.m., Deucher, Alexander wrote:
>
> Acked-by: Alex Deucher
>
>
> *From:* amd-gfx on behalf of
> Andrey Grodzovsky
> *Sent:* Monday, December 3, 2018 3:03:41 PM
>
On 2018-11-28 4:14 a.m., Joonas Lahtinen wrote:
> Quoting Ho, Kenny (2018-11-27 17:41:17)
>> On Tue, Nov 27, 2018 at 4:46 AM Joonas Lahtinen
>> wrote:
>>> I think a more abstract property "% of GPU (processing power)" might
>>> be a more universal approach. One can then implement that through
>>
Ping. Any comments, R-b, A-b?
On 2018-11-20 10:07 p.m., Kuehling, Felix wrote:
> This round adds support for more ROCm memory manager features:
> * VRAM limit checking to avoid overcommitment
> * DMABuf import for graphics interoperability
> * Support for mapping doorbells into G
See comments inline. I didn't review the amdgpu_cs and amdgpu_gem parts
as I don't know them very well.
On 2018-12-03 3:19 p.m., Yang, Philip wrote:
> Use HMM helper function hmm_vma_fault() to get physical pages backing
> userptr and start CPU page table update track of those pages. Then use
> hm
Depending on the interrupt ring, the IRQ dispatch and processing
functions will run in interrupt context or in a worker thread.
Is there a way for the processing functions to find out which context
it's running in? That may influence decisions whether to process
interrupts in the same thread or sc
On 2018-12-05 4:15 a.m., Christian König wrote:
> This finally enables processing of ring 1 & 2.
>
> Signed-off-by: Christian König
> ---
> drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 68 --
> 1 file changed, 63 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/
Patches 1-3 are Reviewed-by: Felix Kuehling
I applied all 10 patches and tested them with kfdtest on Fiji and
Vega10. It seems to not break anything obvious.
I think I found a problem in patch 9 and have a question about patch 8
regarding the context in which interrupt processing functions would
On 2018-12-06 6:32 a.m., Rex Zhu wrote:
> used to manage the reserved vm space.
>
> Signed-off-by: Rex Zhu
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 8 ++--
> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 4 +++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +-
> 3 files changed, 1
This change seems to be breaking the build for me. I'm getting errors like this:
CC [M] drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o
In file included from ../include/trace/events/tlb.h:9:0,
from ../arch/x86/include/asm/mmu_context.h:10,
from ../i
On 2018-12-07 9:46 a.m., Wentland, Harry wrote:
> On 2018-12-07 9:41 a.m., Wentland, Harry wrote:
>> On 2018-12-07 12:40 a.m., Kuehling, Felix wrote:
>>> This change seems to be breaking the build for me. I'm getting errors like
>>> this:
>>>
>>>
Can you add them to amdkfd/kfd_device.c as well while you're at it?
Thanks,
Felix
On 2018-12-07 4:03 p.m., Alex Deucher wrote:
> New vega ids.
>
> Signed-off-by: Alex Deucher
> Cc: sta...@vger.kernel.org
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++
> 1 file changed, 6 insertions(+