to amdgpu_vm_handle_fault and then to gmc interrupt handler
to show vm fault message.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7
On 2024-05-02 08:42, James Zhu wrote:
On 2024-05-01 18:56, Philip Yang wrote:
On systems with khugepaged enabled and user
cases with THP buffer, the
hmm_range_fault may take > 15 seconds to return -EB
On 2024-05-02 00:09, Chen, Xiaogang
wrote:
On 5/1/2024 5:56 PM, Philip Yang wrote:
urn EBUSY, then userspace libdrm and Thunk will call
ioctl again.
Change EAGAIN to debug message as this is not an error.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 -
drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 12 +++-
drivers/gpu/drm/
-by: Ramesh Errabolu
Reviewed-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 386875e6eb96..dcb1d5d3f860 100644
TTM_PL_FLAG_CONTIGUOUS flag, and ask
VRAM buddy allocator to get contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4
include/uapi/linux/kfd_ioctl.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
Bump the kfd ioctl minor version to declare the contiguous VRAM
allocation flag support.
Signed-off-by: Philip Yang
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
If the BO pages pinned for RDMA are not contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate contiguous
VRAM space, and then move it from system memory back to VRAM.
v6: user context should use interruptible call (Felix)
Signed-off-by: Philip Yang
GPU prefix to the macro
name.
v6: use shorter flag name, use interruptible wait ctx, drop patch 5/6 (Felix)
Philip Yang (5):
drm/amdgpu: Support contiguous VRAM allocation
drm/amdgpu: Handle sg size limit for contiguous allocation
drm/amdgpu: Evict BOs from same process for contiguous alloca
from the same process, this will evict the user queues
first, and restore the queues later after contiguous VRAM allocation.
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git
VRAM memory. To work around the sg table segment
size limit, allocate multiple segments if contiguous size is bigger than
AMDGPU_MAX_SG_SEGMENT_SIZE.
Signed-off-by: Philip Yang
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 12 ++--
1 file changed, 6
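The segment-splitting idea in the patch description above can be sketched in plain C. This is only an illustration under stated assumptions, not the driver code: `example_split_block`, `example_count_segments`, `struct example_segment` and `EXAMPLE_MAX_SG_SEGMENT_SIZE` are hypothetical names, and the 2 GiB cap is taken from the "MAX_SG_SEGMENT_SIZE 2GB" wording elsewhere in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap for illustration: 2 GiB per scatter-gather segment. */
#define EXAMPLE_MAX_SG_SEGMENT_SIZE (2ULL << 30)

/* Hypothetical stand-in for one scatter-gather entry. */
struct example_segment {
    uint64_t start; /* byte offset of the segment */
    uint64_t size;  /* segment length in bytes, never above the cap */
};

/* Number of segments needed to cover `size` bytes with a cap of `max_seg`. */
static uint64_t example_count_segments(uint64_t size, uint64_t max_seg)
{
    return (size + max_seg - 1) / max_seg;
}

/*
 * Split one contiguous block [start, start + size) into segments of at
 * most `max_seg` bytes. Writes up to `out_len` entries and returns the
 * number of entries written.
 */
static size_t example_split_block(uint64_t start, uint64_t size,
                                  uint64_t max_seg,
                                  struct example_segment *out, size_t out_len)
{
    size_t n = 0;

    while (size > 0 && n < out_len) {
        uint64_t len = size < max_seg ? size : max_seg;

        out[n].start = start;
        out[n].size = len;
        start += len;
        size -= len;
        n++;
    }
    return n;
}
```

With these assumptions, a contiguous 5 GiB block is described by three segments of 2 GiB, 2 GiB and 1 GiB, so the memory stays physically contiguous while no single segment length exceeds the cap.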
On 2024-04-23 18:17, Felix Kuehling
wrote:
On 2024-04-23 11:28, Philip Yang wrote:
RDMA device with limited scatter-gather
ability requires contiguous VRAM
buffer allocation for RDMA peer direct support
On 2024-04-23 18:15, Felix Kuehling
wrote:
On
2024-04-23 11:28, Philip Yang wrote:
If the BO pages pinned for RDMA are not
contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate
Bump the kfd ioctl minor version to declare the contiguous VRAM
allocation flag support.
Signed-off-by: Philip Yang
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
If the BO pages pinned for RDMA are not contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate contiguous
VRAM space, and then move it from system memory back to VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16
memory. To work around the sg table segment
size limit, allocate multiple segments if contiguous size is bigger than
MAX_SG_SEGMENT_SIZE.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git
from the same process, this will evict the user queues
first, and restore the queues later after contiguous VRAM allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
time to 2 seconds, long enough for RDMA
pin BO to alloc the contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
TTM_PL_FLAG_CONTIGUOUS flag, and ask
VRAM buddy allocator to get contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4
include/uapi/linux/kfd_ioctl.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
GPU prefix to the macro
name.
Philip Yang (6):
drm/amdgpu: Support contiguous VRAM allocation
drm/amdgpu: Handle sg size limit for contiguous allocation
drm/amdgpu: Evict BOs from same process for contiguous allocation
drm/amdkfd: Evict BO itself for contiguous allocation
drm/amdkfd: Incre
On 2024-04-23 09:32, Christian König
wrote:
Am
23.04.24 um 15:04 schrieb Philip Yang:
To test RDMA using dummy driver on the
system without NIC/RDMA
device, the get/put dma pages pass in null device pointer, skip
TTM_PL_FLAG_CONTIGUOUS flag, and ask
VRAM buddy allocator to get contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4
include/uapi/linux/kfd_ioctl.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
from the same process, this will evict the user queues
first, and restore the queues later after contiguous VRAM allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
Bump the kfd ioctl minor version to declare the contiguous VRAM
allocation flag support.
Signed-off-by: Philip Yang
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
If the BO pages pinned for RDMA are not contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate contiguous
VRAM space, and then move it from system memory back to VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16
To test RDMA using dummy driver on the system without NIC/RDMA
device, the get/put dma pages pass in null device pointer, skip the
dma map/unmap resource and sg table to avoid null pointer access.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 33
time to 2 seconds, long enough for RDMA
pin BO to alloc the contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
memory. To work around the sg table segment
size limit, allocate multiple segments if contiguous size is bigger than
MAX_SG_SEGMENT_SIZE.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git
w GEM flag
v3: add patch 2 to handle sg segment size limit (Christian)
v4: remove the buddy block size limit from vram mgr because sg table creation
already
remove the limit, and resource uses u64 to handle block start, size
(Christian)
Philip Yang (7):
drm/amdgpu: Support contiguous VRAM
On 2024-04-22 10:56, Christian König
wrote:
Am
22.04.24 um 15:57 schrieb Philip Yang:
To test RDMA using dummy driver on the
system without NIC/RDMA
device, the get/put dma pages pass in null device pointer, skip
On 2024-04-22 10:40, Christian König
wrote:
Am
22.04.24 um 15:57 schrieb Philip Yang:
Define macro MAX_SG_SEGMENT_SIZE 2GB,
because struct scatterlist length
is unsigned int, and some users of it cast to a signed int, so
To test RDMA using dummy driver on the system without NIC/RDMA
device, the get/put dma pages pass in null device pointer, skip the
dma map/unmap resource and sg table to avoid null pointer access.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 33
w GEM flag
v3: add patch 2 to handle sg segment size limit (Christian)
Philip Yang (7):
drm/amdgpu: Support contiguous VRAM allocation
drm/amdgpu: Handle sg size limit for contiguous allocation
drm/amdgpu: Evict BOs from same process for contiguous allocation
drm/amdkfd: Evict BO itself for
time to 2 seconds, long enough for RDMA
pin BO to alloc the contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
Bump the kfd ioctl minor version to declare the contiguous VRAM
allocation flag support.
Signed-off-by: Philip Yang
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
from the same process, this will evict the user queues
first, and restore the queues later after contiguous VRAM allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
memory. To work around the sg table segment
size limit, allocate multiple segments if contiguous size is bigger than
MAX_SG_SEGMENT_SIZE.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 17 -
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git
If the BO pages pinned for RDMA are not contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate contiguous
VRAM space, and then move it from system memory back to VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16
TTM_PL_FLAG_CONTIGUOUS flag, and ask
VRAM buddy allocator to get contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4
include/uapi/linux/kfd_ioctl.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
-by: Felix Kuehling
Reviewed-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index aafdf064651f..58c1fe542
On 2024-04-18 10:37, Christian König
wrote:
Am 18.04.24 um 15:57 schrieb Philip Yang:
RDMA device with limited scatter-gather
ability requires contiguous VRAM
buffer allocation for RDMA peer direct support
("drm/amdkfd: Run restore_workers on freezable WQs")
Signed-off-by: Felix Kuehling
Reviewed-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_p
If the BO pages pinned for RDMA are not contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate contiguous
VRAM space, and then move it from system memory back to VRAM.
Signed-off-by: Philip Yang
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c| 17
TTM_PL_FLAG_CONTIGUOUS flag, and ask
VRAM buddy allocator to get contiguous VRAM.
Remove the 2GB max memory block size limit for contiguous allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9
To test RDMA using dummy driver on the system without NIC/RDMA
device, the get/put dma pages pass in null device pointer, skip the
dma map/unmap resource to avoid null pointer access.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 33 +++-
1 file
Bump the kfd ioctl minor version to declare the contiguous VRAM
allocation flag support.
Signed-off-by: Philip Yang
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
time to 2 seconds, long enough for RDMA
pin BO to alloc the contiguous VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
from the same process, this will evict the user queues
first, and restore the queues later after contiguous VRAM allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
w GEM flag
Philip Yang (6):
drm/amdgpu: Support contiguous VRAM allocation
drm/amdgpu: Evict BOs from same process for contiguous allocation
drm/amdkfd: Evict BO itself for contiguous allocation
drm/amdkfd: Increase KFD bo restore wait time
drm/amdgpu: Skip dma map resource for null RDMA de
On 2024-04-17 10:32, Paneer Selvam,
Arunpravin wrote:
Hi
Christian,
On 4/17/2024 6:57 PM, Paneer Selvam, Arunpravin wrote:
Hi Christian,
On 4/17/2024 12:19 PM, Christian König wrote:
On 2024-04-15 08:02, Christian König
wrote:
Am
12.04.24 um 22:12 schrieb Philip Yang:
RDMA device with limited scatter-gather
capability requires physical
address contiguous VRAM buffer for RDMA peer direct access
On 2024-04-16 02:50, Paneer Selvam,
Arunpravin wrote:
On 4/16/2024 3:32 AM, Philip Yang wrote:
On 2024-04-14 10:57, Arunpravin Paneer Selvam wrote:
Now we have two flags for contiguous
On 2024-04-14 10:57, Arunpravin Paneer
Selvam wrote:
Now we have two flags for contiguous VRAM buffer allocation.
If the application requests AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's
To test RDMA using dummy driver on the system without NIC/RDMA
device, the get dma pages pass in null device pointer, skip the
dma map resource to avoid null device pointer access.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 33 +++-
1 file
KFD BOs from the same process, this will evict the user
queues first, and restore the queues later after contiguous VRAM
allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd
Bump the kfd ioctl minor version to declare the contiguous VRAM
allocation flag support.
Signed-off-by: Philip Yang
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
wait time to 2 seconds, long enough for RDMA
pin BO to finish the contiguous VRAM allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd
If the BO pages pinned for RDMA are not contiguous on VRAM, evict it to
system memory first to free the VRAM space, then allocate contiguous
VRAM and then move it from system memory back to VRAM.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 15
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS flag, and then vram_mgr will set
TTM_PL_FLAG_CONTIGUOUS flag to ask VRAM buddy allocator to get
contiguous VRAM.
Remove the 2GB max memory block size limit for contiguous allocation.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 7 +++
drivers/gpu/drm
This patch series implements a new KFD memory alloc flag for best effort contiguous
VRAM allocation, to support peer direct access RDMA device with limited
scatter-gather
dma capability.
Philip Yang (6):
drm/amdgpu: Support contiguous VRAM allocation
drm/amdgpu: Evict BOs from same process
]
kfd_process_wq_release+0x273/0x3c0 [amdgpu]
process_scheduled_works+0x2a7/0x500
worker_thread+0x186/0x340
Fixes: 220ecde84bc8 ("drm/amdgpu: implement TLB flush fence")
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 ++-
1 file changed, 2
On 2024-03-26 11:01, Felix Kuehling
wrote:
On
2024-03-26 10:53, Philip Yang wrote:
On 2024-03-25 14:45, Felix Kuehling wrote:
On 2024-03-22 15:57, Zhigang Luo wrote:
it will cause
On 2024-03-25 14:45, Felix Kuehling
wrote:
On
2024-03-22 15:57, Zhigang Luo wrote:
it will cause page fault after device
recovered if there is a process running.
Signed-off-by: Zhigang Luo
drm-next.
Reviewed-by: Philip Yang
---
PS: When I try to compile this file, there is an error:
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c:28:10: fatal error: amdgpu_sync.h:
No such file or directory.
Maybe there are some steps I missed, or does this place need to be corrected?
Otherwise amdgpu_ttm_backend_unbind will not clear the gart page table
and leave valid mapping entry to the stale system page.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
b
On 2024-03-06 09:41, Shashank Sharma
wrote:
From: Christian König
The problem is that when (for example) 4k pages are replaced
with a single 2M page we need to wait for change to be flushed
out by invalidating the TLB before the PT can be freed.
Solve
On 2024-03-01 06:07, Shashank Sharma
wrote:
The idea behind this patch is to delay the freeing of PT entry objects
until the TLB flush is done.
This patch:
- Adds a tlb_flush_waitlist which will keep the objects that need to be
freed after tlb_flush
-
On 2024-03-01 06:07, Shashank Sharma
wrote:
From: Christian König
The problem is that when (for example) 4k pages are replaced
with a single 2M page we need to wait for change to be flushed
out by invalidating the TLB before the PT can be freed.
Solve
On 2024-02-23 08:42, Shashank Sharma
wrote:
This patch:
- adds a new list in amdgpu_vm to hold the VM PT entries being freed
- waits for the TLB flush using the vm->tlb_flush_fence
- actually frees the PT BOs
V2: rebase
V3: Do not attach the tlb_fence to
On 2024-02-23 11:58, Philip Yang wrote:
On 2024-02-23 08:42, Shashank Sharma
wrote:
From: Christian König
The problem is that when (for example) 4k pages are replaced
with a single 2M page we need to wait
On 2024-02-23 08:42, Shashank Sharma
wrote:
From: Christian König
The problem is that when (for example) 4k pages are replaced
with a single 2M page we need to wait for change to be flushed
out by invalidating the TLB before the PT can be freed.
Solve
On 2024-02-21 21:01, Lang Yu wrote:
This is useful to prevent copy-on-write semantics
from changing the physical location of a page if
the parent writes to it after a fork().
Signed-off-by: Lang Yu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
On 2024-02-16 15:16, Felix Kuehling
wrote:
On 2024-02-15 10:18, Philip Yang wrote:
Document how to use SMI system management
interface to receive SVM
events.
Define SVM events message
On 2024-02-15 12:54, Chen, Xiaogang
wrote:
On 2/15/2024 9:18 AM, Philip Yang wrote:
To track the migrate end-event in case of a migration failure, always
output migrate end event, with the failure result added to the existing
migrate end event string.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c| 16
drivers/gpu/drm/amd/amdkfd
-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 51 +++---
include/uapi/linux/kfd_ioctl.h | 77 -
2 files changed, 102 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
b/drivers/gpu/drm/amd/amdkfd
loc and free.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
index a3d609655ce3..ef3ef03e50ab 100
ree the PTB bo.
With this change, the vm->pt_freed list and work is not needed. Add
WARN_ON(unlocked) in amdgpu_vm_pt_free_dfs to catch if unmap to free the
PTB.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 4 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 4 ---
d
On 2024-01-29 11:30, Christian König
wrote:
Am
29.01.24 um 17:25 schrieb Philip Yang:
On 2024-01-29 05:06, Christian König
wrote:
Am 26.01.24 um 20:47 schrieb Philip
Yang:
This is to work
On 2024-01-29 05:06, Christian König
wrote:
Am
26.01.24 um 20:47 schrieb Philip Yang:
This is to work around a bug in function
drm_prime_pages_to_sg if length
of nr_pages >= 4GB, by doing the same check for max_segm
GB GTT memory mapping for mGPUs with IOMMU isolation mode.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 50 ++---
1 file changed, 34 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
b/drivers/gpu/drm/amd/
On 2024-01-26 10:35, Christian König
wrote:
Am
26.01.24 um 16:17 schrieb Philip Yang:
On 2024-01-26 09:59, Christian König
wrote:
Am 26.01.24 um 15:38 schrieb Philip
Yang:
svm range support
On 2024-01-26 09:59, Christian König
wrote:
Am
26.01.24 um 15:38 schrieb Philip Yang:
svm range support partial migration and
mapping update, for size 4MB
virtual address 4MB alignment and physical address continuous
to the stale mapping.
Limit the maximum fragment size to granularity size, 2MB by default,
with the mapping and unmapping based on granularity size, to solve this
issue.
The change is only for SVM map/unmap range, no change for gfx and legacy
API path.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd
ble once, have to wait sdma update fence and then flush tlb.
No change if using CPU update GPU page table for large bar because no vm
update fence.
Remove wait parameter in svm_range_validate_and_map because it is always
called with true now.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/
Calculate range start, last address aligned to the range granularity
size. This removes the duplicate code, and the helper function will be
used in the future patch to handle map, unmap to GPU based on range
granularity. No functional change.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd
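The alignment described in the patch above can be sketched in plain C. This is a minimal sketch under assumptions, not the helper added by the patch: `example_align_to_granularity` and `struct example_range` are hypothetical names, addresses are treated as page numbers, and the granularity is assumed to be a log2 page count (9 would give the 2MB default mentioned in this series with 4 KiB pages).

```c
#include <stdint.h>

/* Hypothetical pair of inclusive page numbers. */
struct example_range {
    uint64_t start;
    uint64_t last;
};

/*
 * Return the granularity-aligned block containing page `addr`, clamped
 * to the owning range [range_start, range_last]. `granularity` is an
 * assumed log2 of the block size in pages.
 */
static struct example_range
example_align_to_granularity(uint64_t addr, unsigned int granularity,
                             uint64_t range_start, uint64_t range_last)
{
    uint64_t size = 1ULL << granularity; /* block size in pages */
    struct example_range r;

    r.start = addr & ~(size - 1);   /* round down to a block boundary */
    r.last = r.start + size - 1;    /* last page of that block */

    /* Never report pages outside the owning range. */
    if (r.start < range_start)
        r.start = range_start;
    if (r.last > range_last)
        r.last = range_last;
    return r;
}
```

Under these assumptions, page 0x1234 with granularity 9 inside the range 0x1000..0x1fff yields the block 0x1200..0x13ff, and a range that does not start or end on a block boundary has its first and last blocks clamped to the range limits.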
When changing the svm range granularity, update the svm range
bitmap_map based on new range granularity.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 49 +++-
1 file changed, 48 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd
,
split range may get incorrect bitmap_map for the remaining ranges.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 42 +++-
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
b/drivers/gpu/drm/amd
Remove prange validate_timestamp which is not accurate for multiple
GPUs.
Use the bitmap_map flag to skip the retry fault from different pages of
the same granularity range if the granularity range is already mapped
on the specific GPU.
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
retry fault recover.
svm_range_partial_mapped is false only if no part of the range is mapped
on any GPU.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 258 ++-
drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 7 +-
2 files changed, 219 insertions(
Add the helper function to get all GPUs bitmap that need to access the svm
range. This helper will be used in the following patch to check if
prange is mapped to all gpus.
Refactor svm_range_validate_and_map to use the helper function, no
functional change.
Signed-off-by: Philip Yang
Reviewed
s 0 if old->vram_pages - new->vram_pages == 0.
new range takes svm_bo ref only if vram_pages not equal to 0.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 8 +
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 42 ++--
drivers/gpu/drm/a
Partial migration to system memory should use migrate.addr, not
prange->start as virtual address to allocate system memory page.
Fixes: 18eb61bd5a6a ("drm/amdkfd: Use partial migrations/mapping for GPU/CPU
page faults in SVM")
Signed-off-by: Philip Yang
---
drivers/gpu/d
On 2024-01-11 12:37, Chen, Xiaogang
wrote:
On 1/11/2024 10:54 AM, Felix Kuehling wrote:
On 2024-01-10 17:01, Philip Yang wrote:
While svm range partial migrating to
system memory, clear
On 2024-01-11 11:54, Felix Kuehling
wrote:
On 2024-01-10 17:01, Philip Yang wrote:
While svm range partial migrating to
system memory, clear dma_addr vram
domain flag, otherwise the future split will get incorrect
s 0 if old->vram_pages - new->vram_pages == 0.
new range takes svm_bo ref only if vram_pages not equal to 0.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 20 +++-
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 24 ++--
2 files ch
On 2024-01-10 11:30, Felix Kuehling
wrote:
On 2024-01-09 15:05, Philip Yang wrote:
After svm range partial migrating to
system memory, unmap to cleanup the
corresponding dma_addr vram domain flag, otherwise
On 2024-01-09 17:29, Chen, Xiaogang
wrote:
On 1/9/2024 2:05 PM, Philip Yang wrote:
After svm range partial migrating to
system memory, unmap to cleanup the
corresponding dma_addr vram domain flag, otherwise
ges is 0.
old range actual_loc is 0 if old->vram_pages - new->vram_pages == 0.
new range takes svm_bo ref only if vram_pages not equal to 0.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 3 ++
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 35 +++-
On 2024-01-08 18:17, Chen, Xiaogang
wrote:
With a
nitpick below, this patch is
Reviewed-by: Xiaogang Chen
On 1/8/2024 4:36 PM, Philip Yang wrote:
After range splitting, set new range and
old range