On 2023-01-16 17:04, Errabolu, Ramesh wrote:
[AMD Official Use Only - General]
A minor comment, unrelated to the patch. The comments are inline.
Regards,
Ramesh
-Original Message-
From: amd-gfx On Behalf Of Felix
Kuehling
Sent: Thursday, January 12, 2023 7:02 AM
To: amd-gfx
On 2023-01-16 06:42, Christian König wrote:
[SNIP]
When the BO is imported into the same GPU, you get a reference to
the same BO, so the imported BO has the same mmap_offset as the
original BO.
When the BO is imported into a different GPU, it is a new BO with a
new mmap_offset.
That wo
On 2023-01-15 11:43, Christian König wrote:
On 14.01.23 00:15, Felix Kuehling wrote:
On 2023-01-13 18:00, Chen, Xiaogang wrote:
On 1/13/2023 4:26 PM, Felix Kuehling wrote:
On 2023-01-12 17:41, Chen, Xiaogang wrote:
On 1/11/2023 7:31 PM, Felix Kuehling wrote:
Use proper
On 2023-01-13 18:00, Chen, Xiaogang wrote:
On 1/13/2023 4:26 PM, Felix Kuehling wrote:
On 2023-01-12 17:41, Chen, Xiaogang wrote:
On 1/11/2023 7:31 PM, Felix Kuehling wrote:
Use proper amdgpu_gem_prime_import function to handle all kinds of
imports. Remember the dmabuf reference to enable
On 2023-01-12 17:41, Chen, Xiaogang wrote:
On 1/11/2023 7:31 PM, Felix Kuehling wrote:
Use proper amdgpu_gem_prime_import function to handle all kinds of
imports. Remember the dmabuf reference to enable proper multi-GPU
attachment to multiple VMs without erroneously re-exporting the
underlying
create queues.
Add helper macro KFD_SUPPORT_XNACK_PER_PROCESS to remove duplicate code
and add new ASICs support in future.
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
---
.../amd/amdkfd/kfd_device_queue_manager_v9.c | 27 +--
drivers/gpu/drm/amd/amdkfd/kfd_priv.h
This is needed to correctly handle BOs imported into the GEM API, which
would otherwise get added twice to the same VM.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 28 +++
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a
comments, remove TODOs that are no longer applicable
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 28 +++
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd
Instead of attaching the eviction fence when a KFD BO is first mapped,
attach it when it is allocated or imported. This is in preparation to allow
KFD BOs to be mapped using the render node API.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 63
Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by
the caller. This will be useful for handling extra BO VA mappings in
KFD VMs that are managed through the render node API.
Signed-off-by: Felix Kuehling
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu
user mode change (Thunk API and kfdtest) is here:
https://github.com/fxkamd/ROCT-Thunk-Interface/commits/fxkamd/dmabuf
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 +
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 45 +++
drivers/gpu/drm/amd
single render node FD and GPUVM address
space.
The DMABuf export API will also be used later for upstream IPC and RDMA
implementations.
Felix Kuehling (6):
drm/amdgpu: Generalize KFD dmabuf import
drm/amdkfd: Implement DMA buf fd export from KFD
drm/amdkfd: Improve amdgpu_vm_handle_moved
drm
Use proper amdgpu_gem_prime_import function to handle all kinds of
imports. Remember the dmabuf reference to enable proper multi-GPU
attachment to multiple VMs without erroneously re-exporting the
underlying BO multiple times.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu
trigger
WARN_ONCE(adev && adev->kfd.vram_used < 0, "..."), to help debug the
accounting issue with warning and backtrace.
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +-
drivers/gpu/drm/amd/amdgpu/am
On 2023-01-10 15:44, Philip Yang wrote:
On 2023-01-10 13:33, Felix Kuehling wrote:
On 2023-01-10 12:11, Philip Yang wrote:
Use page aligned size to reserve memory usage because page aligned TTM
BO size is used to unreserve memory usage, otherwise a non-page-aligned
size causes memory
On 2023-01-10 12:11, Philip Yang wrote:
Use page aligned size to reserve memory usage because page aligned TTM
BO size is used to unreserve memory usage, otherwise a non-page-aligned
size leaves the memory usage accounting unbalanced.
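The imbalance described above can be illustrated with a small user-space sketch. It is not the driver code: the sketch_ names, the 4 KiB page size and the use of assert() in place of the kernel's WARN_ONCE() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed 4 KiB pages */

/* Round size up to the next page boundary, like the kernel's PAGE_ALIGN(). */
static uint64_t sketch_page_align(uint64_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Signed on purpose: an unbalanced unreserve shows up as a negative value. */
static int64_t sketch_used;

/* Reserve with the page aligned size, not the size the caller asked for. */
void sketch_reserve(uint64_t requested_size)
{
	sketch_used += (int64_t)sketch_page_align(requested_size);
}

/* Unreserve with the BO size, which is already page aligned. */
void sketch_unreserve(uint64_t bo_size)
{
	sketch_used -= (int64_t)bo_size;
	/* Stand-in for the WARN_ONCE on a negative counter. */
	assert(sketch_used >= 0);
}

int64_t sketch_vram_used(void)
{
	return sketch_used;
}
```

Reserving 5000 bytes accounts for 8192 bytes, and unreserving the 8192-byte BO brings the counter back to zero. Had the reserve side added the raw 5000 bytes instead, the counter would end at -3192, which an unsigned counter could not represent.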
Change vram_used definition type to int64_t to be able to trigge
On 2023-01-05 14:28, Eric Huang wrote:
The pointer bo->kfd_bo is NULL for a queue's write pointer BO
when creating a queue on mGPU. Avoiding use of the pointer
fixes the error.
Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd
On 2023-01-10 10:19, Jason Gunthorpe wrote:
On Tue, Jan 10, 2023 at 10:05:44AM -0500, Felix Kuehling wrote:
On 2023-01-10 08:45, Christian König wrote:
And I'm like 99% sure that Kabini/Wani should be identical to that.
Kabini is not supported by KFD. There should be no cal
On 2023-01-10 08:45, Christian König wrote:
And I'm like 99% sure that Kabini/Wani should be identical to that.
Kabini is not supported by KFD. There should be no calls to
amd_iommu_... functions on Kabini, at least not from kfd_iommu.c. And
I'm not aware of any other callers in amdgpu.ko
On 2023-01-09 19:01, Philip Yang wrote:
Use page aligned size to reserve VRAM usage because page aligned TTM BO
size is used to unreserve VRAM usage, otherwise the vram_used
accounting becomes unbalanced.
Change vram_used definition type to int64_t to be able to trigger
WARN_ONCE(adev && adev
On 2023-01-07 15:09, Deepak R Varma wrote:
On Fri, Dec 23, 2022 at 02:45:00AM +0530, Deepak R Varma wrote:
Use the resource_size() function instead of an open-coded computation
of the resource size. It makes the code more readable.
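For readers unfamiliar with the helper, here is a user-space sketch of what it computes. The kernel's resource_size() returns end - start + 1 for a struct resource whose end address is inclusive; the sketch_ types and names below are stand-ins, not the kernel definitions.

```c
#include <stdint.h>

typedef uint64_t sketch_size_t;

/* Stand-in for struct resource: 'end' is the last valid address (inclusive). */
struct sketch_resource {
	sketch_size_t start;
	sketch_size_t end;
};

/* Same arithmetic as the kernel helper: the +1 accounts for the inclusive end. */
static inline sketch_size_t sketch_resource_size(const struct sketch_resource *res)
{
	return res->end - res->start + 1;
}
```

A region spanning 0x1000 to 0x1fff therefore has size 0x1000. Calling the helper avoids repeating the end - start + 1 expression, and the easy-to-forget +1, at every call site.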
Issue identified using resource_size.cocci coccinelle semantic pa
On 2023-01-09 15:23, Felix Kuehling wrote:
On 2023-01-09 15:18, Philip Yang wrote:
On 2023-01-09 14:27, Eric Huang wrote:
There will be data corruption on vram allocated by svm
if initialization is not being done. Adding sync is to
resolve this issue.
Signed-off-by: Eric Huang
On 2023-01-09 15:18, Philip Yang wrote:
On 2023-01-09 14:27, Eric Huang wrote:
There will be data corruption on vram allocated by svm
if initialization is not being done. Adding sync is to
resolve this issue.
Signed-off-by: Eric Huang
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 7 +++
done. It is being done as a result of
setting AMDGPU_GEM_CREATE_VRAM_CLEARED. The problem is that the
initialization is not complete yet, so it can corrupt data written by
the application unless we wait for it to finish first.
Other than that the patch is
Reviewed-by: Felix Kuehling
On 2023-01-04 04:23, Shashank Sharma wrote:
On 04/01/2023 10:17, Christian König wrote:
On 04.01.23 10:13, Shashank Sharma wrote:
On 04/01/2023 10:10, Christian König wrote:
On 04.01.23 07:21, Yadav, Arvind wrote:
On 1/4/2023 12:07 AM, Felix Kuehling wrote:
On 2023-01-03
On 2022-12-27 12:12, Alex Deucher wrote:
On Tue, Dec 27, 2022 at 12:10 PM Alex Deucher wrote:
A number of of the gfx8 cases where the same. Clean them
up.
typos here fixed up locally.
Alex
Signed-off-by: Alex Deucher
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd
goal, hence the flag for AQL vs PM4.
Alex
Regards
Shaoyun.liu
-Original Message-
From: amd-gfx On Behalf Of
Felix Kuehling
Sent: Tuesday, January 3, 2023 1:30 PM
To: Sharma, Shashank ;
amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian
; Yadav, Arvind ;
Paneer S
On 2023-01-03 04:36, Shashank Sharma wrote:
/*MQD struct for usermode Queue*/
+struct amdgpu_usermode_queue_mqd
This is specific to GC 11. Every IP and version will have its own MQD
format. That should live in the IP specific code, not the generic
code. We already have the generic MQD par
On 2022-12-23 14:36, Shashank Sharma wrote:
From: Alex Deucher
This patch introduces new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.
This UAPI
On 2023-01-03 11:05, Waiman Long wrote:
On 1/3/23 10:39, Felix Kuehling wrote:
The regression point doesn't make sense. The kernel config doesn't
enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU
could have caused this regression.
I agree. It is likely a pr
The regression point doesn't make sense. The kernel config doesn't
enable CONFIG_DRM_AMDGPU, so there is no way that a change in AMDGPU
could have caused this regression.
Regards,
Felix
On 2022-12-29 01:26, syzbot wrote:
syzbot has found a reproducer for the following issue on:
HEAD c
off-by: Philip Yang
Reviewed-by: Felix Kuehling
I'm still curious what caused the acquire_vm failure in the first place.
Regards,
Felix
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 4 ++--
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 18 ++
drivers/gpu/drm
00400 RSI: 0400 RDI: 7f32831ae000
[ 84.727944] RBP: 7fffb06c4750 R08: 7fffb06c4548 R09: 55e7570ad230
[ 84.735809] R10: 55e757088010 R11: 0246 R12: 55e75453cefa
[ 84.743688] R13: R14: 0021 R15:
[ 84.7
König
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 27 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 58 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 6 ++-
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +-
4 files changed, 77
This points to a mutex to serialize with MMU notifiers during page table
updates. For graphics contexts, the notifier lock is per adev. For compute
contexts the lock is per process.
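A rough sketch of the idea, under stated assumptions: the VM carries a pointer to whichever lock applies, so page table update code can take it without knowing the context type. All sketch_ names are hypothetical, and the placeholder lock struct stands in for a real mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for a mutex; only its identity matters in this sketch. */
struct sketch_lock { int held; };

struct sketch_device  { struct sketch_lock notifier_lock; }; /* per adev */
struct sketch_process { struct sketch_lock notifier_lock; }; /* per process */

struct sketch_vm {
	/* Lock that serializes page table updates with MMU notifiers. */
	struct sketch_lock *notifier_lock;
};

/*
 * Graphics VMs (proc == NULL) point at the per-device lock,
 * compute VMs point at the lock of the owning process.
 */
void sketch_vm_init(struct sketch_vm *vm, struct sketch_device *dev,
		    struct sketch_process *proc)
{
	assert(dev != NULL);
	vm->notifier_lock = proc ? &proc->notifier_lock : &dev->notifier_lock;
}
```

With the pointer stored in the VM, the update path only dereferences vm->notifier_lock and the per-device versus per-process choice is made once, at init time.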
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 3 +++
drivers/gpu/drm/amd
t_rcu+0xd7/0x130
[ +0.004205] softirqs last disabled at (59649): []
irq_exit_rcu+0xd7/0x130
[ +0.004203] ---[ end trace ]---
Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer
links")
Signed-off-by: Mukul Joshi
Reviewe
6/0x270 [drm]
drm_close_helper.isra.14+0x60/0x70 [drm]
drm_release+0x6e/0xf0 [drm]
__fput+0xcc/0x280
fput+0xe/0x20
task_work_run+0x96/0xc0
do_exit+0x3d0/0xc10
Suggested-by: Felix Kuehling
I don't think I suggested this fix. I didn't realize that the problem
only a
run+0x96/0xc0
do_exit+0x3d0/0xc10
Suggested-by: Felix Kuehling
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_k
: Mukul Joshi
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 2 +
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 51 ---
drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c| 2 +-
drivers/gpu/drm/amd/amdgpu/vega20_ih.c| 46
to evict first.
Signed-off-by: Christian König
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 16 +---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-
2 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd
On 2022-12-10 09:12, Christian König wrote:
On 10.12.22 07:15, Felix Kuehling wrote:
On 2022-11-25 05:21, Christian König wrote:
We already fall back to a dummy BO with no backing store when we
allocate GDS, GWS and OA resources and to GTT when we allocate VRAM.
Drop all those
On 2022-11-25 05:21, Christian König wrote:
We already fall back to a dummy BO with no backing store when we
allocate GDS, GWS and OA resources and to GTT when we allocate VRAM.
Drop all those workarounds and generalize this for GTT as well. This
fixes ENOMEM issues with runaway applications which
On 2022-12-08 12:39, Christian König wrote:
On 08.12.22 17:28, Felix Kuehling wrote:
On 2022-12-08 10:44, Christian König wrote:
On 08.12.22 16:19, Felix Kuehling wrote:
On 2022-12-08 07:32, Christian König wrote:
Hi Felix,
digging through the code I think I know now how
On 2022-12-08 10:44, Christian König wrote:
On 08.12.22 16:19, Felix Kuehling wrote:
On 2022-12-08 07:32, Christian König wrote:
Hi Felix,
digging through the code I think I know now how we can solve this.
The lock which needs to protect the validity of the pages is the
vm
holding vram_mgr->lock.
Regards,
Felix
Or am I missing something here?
Regards,
Christian.
On 06.12.22 16:57, Christian König wrote:
On 06.12.22 16:14, Felix Kuehling wrote:
On 2022-12-06 03:20, Christian König wrote:
Hi Felix,
to be honest I think the whole approach yo
h from drm-next, not amd-staging-drm-next) Other
than that, the patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gmc_v9
Felix
Invalidating the mappings and eventually scheduling that they are
re-created is a separate step which should come independent of this if
I'm not completely mistaken.
Regards,
Christian.
On 06.12.22 01:04, Felix Kuehling wrote:
We fixed a similar issue with Philip's patc
Add a per-process MMU notifier lock for processing notifiers from
userptrs. Use that lock to properly synchronize page table updates with
MMU notifiers.
v2: rebased
Signed-off-by: Felix Kuehling
Reviewed-by: Xiaogang Chen (v1)
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 13 +-
.../gpu
We fixed a similar issue with Philip's patch "drm/amdgpu: Drop eviction
lock when allocating PT BO", but there was another one hiding underneath
that (see the log below). The problem is that we're still allocating
page tables while holding the prange->lock in the kfd_svm code, which is
also he
On 2022-12-01 10:38, Peter Maucher wrote:
The amdgpu kernel module has supported RDNA for a while,
mention that in the module description.
Signed-off-by: Peter Maucher
---
Documentation/gpu/amdgpu/index.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/gpu
On 2022-12-01 10:38, Peter Maucher wrote:
GART and GTT are two abbreviations that should be mentioned in the
glossary.
Signed-off-by: Peter Maucher
---
Documentation/gpu/amdgpu/amdgpu-glossary.rst | 6 ++
1 file changed, 6 insertions(+)
diff --git a/Documentation/gpu/amdgpu/amdgpu-gloss
On 2022-10-31 12:23, Jonathan Kim wrote:
Add missing debug trap registers references and initialize all debug
registers on boot by clearing the hardware exception overrides and the
wave allocation ID index.
For debug devices that only support single process debugging, enable
trap temporary set
On 2022-11-22 18:38, Felix Kuehling wrote:
On 2022-10-31 12:23, Jonathan Kim wrote:
Add missing debug trap registers references and initialize all debug
registers on boot by clearing the hardware exception overrides and the
wave allocation ID index.
For debug devices that only support
On 2022-10-31 12:23, Jonathan Kim wrote:
Introduce the required KGD debug calls that will execute hardware debug
mode setting.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/include/kgd_kfd_interface.h | 34 +++
1 file changed, 34
On 2022-10-31 12:23, Jonathan Kim wrote:
Allow the debugger to set wave behaviour to either operate normally,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 18 ++
On 2022-10-31 12:23, Jonathan Kim wrote:
Bump the minor version to declare debugging capability is now
available.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 -
include/uapi/linux/kfd_ioctl.h | 3 ++-
2 files
y_size)
+{
+ struct kfd_dbg_device_info_entry device_info = {0};
Use memset. With that fixed, the patch is
Reviewed-by: Felix Kuehling
+ uint32_t tmp_entry_size = *entry_size, tmp_num_devices;
+ int i, r = 0;
+
+ if (!(target && user_info && number_of_de
On 2022-10-31 12:23, Jonathan Kim wrote:
Allow the debugger to get a snapshot of a specified number of queues
containing various queue property information that is copied to the
debugger.
Since the debugger doesn't know how many queues exist at any given time,
allow the debugger to pass the re
the queue exception status.
The debugger has the option of clearing the target exception on query.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 120 +++
drivers/g
FIFO statement.
Other than that, this patch is
Reviewed-by: Felix Kuehling
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
On 2022-10-31 12:23, Jonathan Kim wrote:
Allow the debugger to set single memory and single ALU operations.
Some exceptions are imprecise (memory violations, address watch) in the
sense that a trap occurs only when the exception interrupt occurs and
not at the non-halting faulty instruction.
On 2022-10-31 12:23, Jonathan Kim wrote:
Shader read, write and atomic memory operations can be alerted to the
debugger as an address watch exception.
Allow the debugger to pass in a watch point to a particular memory
address per device.
Note that there exists only 4 watch points per devices
On 2022-10-31 12:23, Jonathan Kim wrote:
In order to inspect waves from the saved context at any point during a
debug session, the debugger must be able to preempt queues to trigger
context save by suspending them.
On queue suspend, the KFD will copy the context save header information
so that
to
support unique EPERM on PTRACE failure.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 47 ++
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 55
s work queue is
not about freeing ttm_resources but about freeing the BOs. But it
affects freeing of ghost_objs that are holding the ttm_resources being
freed.
If those assumptions all make sense, patches 1-3 are
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.
Fixes: d4ec4bdc0bd5 ("drm/amdkfd: Allow access for mmapping KFD BOs")
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu
Reviewed-by: Felix Kuehling
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 4
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16
3 files changed, 22 insertions(+)
diff --git a/drivers/gpu
On 2022-11-25 02:39, Dan Carpenter wrote:
We re-arranged this code recently so "ret" is always zero at this point.
Signed-off-by: Dan Carpenter
Reviewed-by: Felix Kuehling
I'm applying your patch to amd-staging-drm-next. Thank you!
Felix
---
drivers/gpu/drm/amd/amdkfd
On 2022-10-31 12:23, Jonathan Kim wrote:
The debugger subscribes to notification for requested exceptions on attach.
Allow the debugger to change its subscription later on.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3
On 2022-11-24 09:58, Kim, Jonathan wrote:
[AMD Official Use Only - General]
-Original Message-
From: Kuehling, Felix
Sent: November 22, 2022 6:59 PM
To: Kim, Jonathan ; amd-
g...@lists.freedesktop.org
Subject: Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode
enable and disa
Cornwall
Reviewed-by: Felix Kuehling
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 4
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16
3 files changed, 22 insertions(+)
diff --git a/drivers
on APUs
Signed-off-by: Jay Cornwall
Reviewed-by: Felix Kuehling
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 4
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16
3 files changed, 22 insertions
On 2022-10-31 12:23, Jonathan Kim wrote:
This operation coordinates the debugger with the target HSA runtime
process.
The main motive for this coordination is due to CP performance overhead
I wouldn't call that the main motivation. The main motivation for
synchronizing runtime enable with t
On 2022-10-31 12:23, Jonathan Kim wrote:
The HWS schedule allows a grace period for wave completion prior to
preemption but the debugger requires good performance since it preempts
on every HW debug mode setting transaction request.
For good performance, allow immediate preemption by setting the
On 2022-10-31 12:23, Jonathan Kim wrote:
Implement the per-device calls to enable or disable HW debug mode for
GFX9 prior to GFX9.4.1.
GFX9.4.1 and onward will require their own enable/disable sequence as
follow on patches.
When hardware debug mode setting is requested, waves will inherit
the
On 2022-10-31 12:23, Jonathan Kim wrote:
On GFX9.4.1, the implicit wait count instruction on s_barrier is
disabled by default in the driver during normal operation for
performance requirements.
There is a hardware bug in GFX9.4.1 where if the implicit wait count
instruction after an s_barrier
On 2022-10-31 12:23, Jonathan Kim wrote:
The ROCm debugger will attach to a process to debug by PTRACE and will
expect the KFD to prepare a process for the target PID, whether the
target PID has opened the KFD device or not.
This patch is to explicitly handle this requirement. Further HW mode
On 2022-10-31 12:23, Jonathan Kim wrote:
Add missing debug trap registers references and initialize all debug
registers on boot by clearing the hardware exception overrides and the
wave allocation ID index.
For debug devices that only support single process debugging, enable
trap temporary set
athan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 88 +--
drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 6 ++
include/uapi/linux/kfd_sysfs.h| 15
3 files changed, 104 insertions(+), 5 deletions(-)
diff --git a/dr
pshot semantics to match queue snapshot semantics
This looks really good. I have 3 more nit-picks inline. Other than that,
this patch is
Reviewed-by: Felix Kuehling
Do we have a debugger branch that uses the API yet? We should make this
public in order to complete this upstream code review.
S
[ 185.439463] amdgpu_ttm_tt_get_user_pages+0xc2/0x190 [amdgpu]
[ 185.439603] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x49f/0x7a0 [amdgpu]
[ 185.439774] kfd_ioctl_alloc_memory_of_gpu+0xfb/0x410 [amdgpu]
Signed-off-by: James Zhu
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu
On 2022-11-21 00:13, Ma Jun wrote:
From: Felix Kuehling
Move the topology-locked part of kfd_topology_add_device into a separate
function to simplify error handling and release the topology lock
consistently.
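The pattern described here can be sketched in user space as follows. This is only an illustration: the counter stands in for the real topology_lock, the sketch_ names and the two failure conditions are hypothetical, and the actual patch body is not reproduced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the topology lock: nonzero means held. */
static int sketch_lock_held;

static void sketch_lock(void)   { sketch_lock_held++; }
static void sketch_unlock(void) { sketch_lock_held--; }

/*
 * Everything that needs the lock lives here. The helper can return
 * early on any error because it never touches the lock itself.
 */
static int sketch_add_device_locked(bool parse_ok, bool gpu_found)
{
	if (!parse_ok)
		return -EINVAL;
	if (!gpu_found)
		return -ENODEV;
	return 0;
}

/* The wrapper is the only place that takes and drops the lock. */
int sketch_add_device(bool parse_ok, bool gpu_found)
{
	int res;

	sketch_lock();
	res = sketch_add_device_locked(parse_ok, gpu_found);
	sketch_unlock();

	/* Every path, success or error, leaves the lock released. */
	assert(sketch_lock_held == 0);
	return res;
}
```

Because the unlock sits on the single exit path of the wrapper, an error branch added to the helper later cannot leak the lock, which is the kind of bug that otherwise needs an extra unlock before each goto.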
Reported-by: Dan Carpenter
Signed-off-by: Felix Kuehling
Signed-off-by: Ma Jun
Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by
the caller. This will be useful for handling extra BO VA mappings in
KFD VMs that are managed through the render node API.
Signed-off-by: Felix Kuehling
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu
Instead of attaching the eviction fence when a KFD BO is first mapped,
attach it when it is allocated or imported. This is in preparation to allow
KFD BOs to be mapped using the render node API.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 63
This is needed to correctly handle BOs imported into the GEM API, which
would otherwise get added twice to the same VM.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 28 +++
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a
user mode change (Thunk API and kfdtest) is here:
https://github.com/fxkamd/ROCT-Thunk-Interface/commits/fxkamd/dmabuf
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 +
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 45 +++
drivers/gpu/drm/amd
comments, remove TODOs that are no longer applicable
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 28 +++
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd
Use proper amdgpu_gem_prime_import function to handle all kinds of
imports. Remember the dmabuf reference to enable proper multi-GPU
attachment to multiple VMs without erroneously re-exporting the
underlying BO multiple times.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu
implementation.
Felix Kuehling (6):
drm/amdgpu: Generalize KFD dmabuf import
drm/amdkfd: Implement DMA buf fd export from KFD
drm/amdkfd: Improve amdgpu_vm_handle_moved
drm/amdgpu: Attach eviction fence on alloc
drm/amdgpu: update mappings not managed by KFD
drm/amdgpu: Do bo_va ref counting for
-
From: amd-gfx On Behalf Of Felix
Kuehling
Sent: Wednesday, November 2, 2022 9:00 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: Add notifier lock for KFD userptrs
Caution: This message originated from an External Source. Use proper caution
when opening attachments, cli
On 2022-11-17 16:38, James Zhu wrote:
When applications try to allocate large system memory (more than 128GB),
"stall cpu" is reported.
For such large system memory, walk_page_range usually takes more than 20s.
The warning message can be removed when splitting hmm range into smaller
ones which is
d_destroy_crat_image(crat_image);
+
return res;
}
On 11/17/2022 4:49 AM, Felix Kuehling wrote:
On 2022-11-16 03:04, Ma Jun wrote:
Release the topology_lock in error case
Signed-off-by: Ma Jun
Reported-by: Dan Carpenter
Dan, did you change your email address, is this one cor
On 2022-10-12 15:07, Jonathan Kim wrote:
FW fix available to enable cooperative launch for GFX10.3.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers
On 2022-11-16 11:54, Ramesh Errabolu wrote:
Allow user to know number of compute units (CU) that are in use at any
given moment. Enable access to the method kgd_gfx_v9_get_cu_occupancy
that computes CU occupancy.
Signed-off-by: Ramesh Errabolu
Reviewed-by: Felix Kuehling
On 2022-11-16 11:40, Alex Deucher wrote:
All of the IP specific versions are the same now, so
we can just use a common function.
Signed-off-by: Alex Deucher
The series is
Acked-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 26 +++
drivers
res = -ENODEV;
+ up_write(&topology_lock);
goto err;
}
From ceb79972cdd490de181a6895836e40bf4e93c631 Mon Sep 17 00:00:00 2001
From: Felix Kuehling
Date: Wed, 16 Nov 2022 15:38:44 -0500
Subject: [PATCH] drm/amdkf
mutex
protected bo list for now.
v2: make sure range is set to NULL in case of an error
Signed-off-by: Christian König
Reviewed-by: Alex Deucher
Reviewed-by: Felix Kuehling
CC: sta...@vger.kernel.org
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 +++--
drivers/gpu/drm/amd/a
On 2022-11-15 08:37, Christian König wrote:
On 10.11.22 22:55, Felix Kuehling wrote:
On 2022-11-10 08:00, Christian König wrote:
The basic problem here is that it's not allowed to page fault while
holding the reservation lock.
So it can happen that multiple processes t
. So removing
vram_pin_size will resolve it.
Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu
On 2022-11-10 08:00, Christian König wrote:
Clean that up a bit, no functional change.
Signed-off-by: Christian König
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -
.../gpu/drm/amd