Verify the parameters of
amdgpu_vm_bo_(map/replace_map/clearing_mappings) in one common place.
Reported-by: Vlad Stolyarov
Suggested-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 72 --
1 file changed, 46 insertions(+), 26
Verify the parameters of
amdgpu_vm_bo_(map/replace_map/clearing_mappings) in one common place.
Reported-by: Vlad Stolyarov
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 63 --
1 file changed, 39 insertions(+), 24 deletions(-)
diff --git
Ensure there is no address overlap.
Reported-by: Vlad Stolyarov
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8af3f0fd3073
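
The overlap check named in this commit message can be pictured as an interval intersection test. The following is only a minimal sketch, assuming mappings are tracked as inclusive [start, last] ranges; `struct va_range` and `va_range_overlaps` are hypothetical names for illustration and are not taken from amdgpu_vm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive address range [start, last]. */
struct va_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Two inclusive ranges overlap if and only if each one starts
 * no later than the other one ends.
 */
static bool va_range_overlaps(const struct va_range *a,
			      const struct va_range *b)
{
	/* A range whose end precedes its start is a caller bug. */
	assert(a->start <= a->last && b->start <= b->last);
	return a->start <= b->last && b->start <= a->last;
}
```

Because the ranges are inclusive, two ranges that share only one boundary address still count as overlapping, which is why the comparison uses `<=` rather than `<`.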
As amdgpu_device_reset_sriov does, kfd suspend should be called at the
beginning to make sure the kfd BO is idle. Otherwise the extra
amdgpu_device_evict_resources fails or amdgpu_virt_request_full_gpu
times out.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
1
that should still
work fine in sriov full access mode.
Fixes: 47ea20762bb7 ("drm/amdgpu: Add an extra evict_resource call during
device_suspend.")
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git
extra evict_resource call during
device_suspend.")
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5c
access job->entity //memory overwritten
As long as we cannot guarantee the entity is alive in this case, let's
revert it for now.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/scheduler/sched_main.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/dr
.
Signed-off-by: xinhui pan
---
change from v5:
reworked
---
drivers/gpu/drm/drm_buddy.c | 161 ++--
1 file changed, 154 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 11bb59399471..6c795e1b3247 100644
+1 or 0+1+2+0 or 0+2+1+0.
Without this patch, eviction is the final step to clean up memory.
Now there is a chance to delay the eviction and thereby reduce the total
eviction count.
Signed-off-by: xinhui pan
---
change from v4:
Fix offset check by using <= instead of <
Change patch descr
this patch aims to implement the remaining cases.
Add a new member, leaf_link, which links all leaf blocks in ascending
order. Now we can find more than 2 sub-order blocks more easily.
Say, order 4 can be combined with corresponding order 4, 2+2, 1+2+1,
0+1+2+0, 0+2+1+0.
Signed-off-by: xinhui pan
this patch aims to implement the LR+RL case.
Signed-off-by: xinhui pan
---
change from v2:
search continuous block in nearby root if needed
change from v1:
implement top-down continuous allocation
---
drivers/gpu/drm/drm_buddy.c | 78 +
1 file changed, 71
Blocks are not guaranteed to be in ascending order.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 21
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
b/drivers/gpu/drm/amd/amdgpu
this patch aims to implement the LR+RL case.
Signed-off-by: xinhui pan
---
change from v1:
implement top-down continuous allocation
---
drivers/gpu/drm/drm_buddy.c | 66 +
1 file changed, 59 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm
this patch aims to implement the LR+RL case.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/drm_buddy.c | 56 -
1 file changed, 49 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 11bb59399471
That has been done in BO release notify.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 -
1 file changed, 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 0f9811d02f61
The queue is freed when create_queue_cpsch fails,
so let's clean up the queue; otherwise various list and memory issues
happen.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11 +--
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git
Fence is accessed by dma_resv_add_fence() now.
Use amdgpu_amdkfd_remove_eviction_fence instead.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
We need to get the new fence when we replace the old one.
Fixes: 047a1b877ed48 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: xinhui pan
---
drivers/dma-buf/dma-resv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-resv.c b/d
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 +
2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index
ttm_device_delayed_workqueue reschedules itself if there is a pending
BO to be destroyed, so a single flush + cancel_sync is not enough. We
still see the "lru_list not empty" warning.
Fix it by waiting for all BOs to be destroyed.
Acked-by: Guchun Chen
Signed-off-by: xinhui pan
---
drivers/gpu/drm
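
The reasoning in this commit message (a self-rescheduling worker cannot be drained by one flush) can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct toy_device`, `toy_delayed_work` and `toy_drain` are hypothetical names, and the per-pass batch limit is invented for illustration. It does not reproduce the TTM code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: 'pending' counts BOs still waiting to be destroyed. */
struct toy_device {
	unsigned int pending;
	unsigned int flushes;
};

/*
 * One pass of the delayed work: destroys at most 'batch' BOs and
 * returns true if it would need to reschedule itself.
 */
static bool toy_delayed_work(struct toy_device *dev, unsigned int batch)
{
	unsigned int n = dev->pending < batch ? dev->pending : batch;

	dev->pending -= n;
	dev->flushes++;
	return dev->pending != 0;
}

/* Keep flushing until nothing is left, instead of flushing exactly once. */
static void toy_drain(struct toy_device *dev, unsigned int batch)
{
	assert(batch > 0); /* a zero batch would never make progress */
	while (toy_delayed_work(dev, batch))
		;
}
```

With 10 pending BOs and a batch of 4, a single pass leaves 6 BOs behind, which corresponds to the "list not empty" situation described above; the loop needs three passes to reach zero.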
ttm_device_delayed_workqueue reschedules itself if there is a pending
BO to be destroyed, so a single flush + cancel_sync is not enough. We
still see the "lru_list not empty" warning.
Fix it by waiting for all BOs to be destroyed.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu
do_group_exit+0x50/0xc0
__x64_sys_exit_group+0x18/0x20
do_syscall_64+0x38/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Suggested-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 22 +++---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 +
2
do_group_exit+0x50/0xc0
__x64_sys_exit_group+0x18/0x20
do_syscall_64+0x38/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers
ain first then alloc memory from specific
domain.
Suggested-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 11 +++
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/a
smem node to BO to make bo->resource valid.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 --
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index
amdgpu_amdkfd_gpuvm_free_memory_of_gpu drops the dmabuf reference taken in
amdgpu_gem_prime_export.
amdgpu_bo_destroy drops the dmabuf reference taken in
amdgpu_gem_prime_import.
So remove this extra dma_buf_put to avoid a double free.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu
After we move a BO to a new memory region, we should put it on
the new memory manager's lru list regardless of whether we unlock the resv.
Cc: sta...@vger.kernel.org
Reviewed-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
1 file changed, 2 insertions(+)
diff
A BO might sit on the wrong lru list, as there is a short window between
moving the memory and updating the lru list.
Let's skip eviction if we hit such a mismatch.
Suggested-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_bo.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion
After we move a BO to a new memory region, we should put it on
the new memory manager's lru list regardless of whether we unlock the resv.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm
-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 66bb8a53bb20..9a547bb38cda 100644
--- a/drivers/gpu/drm/amd/amdgpu
Now we use the same BO for the create/destroy msg, so destroy will wait for the
fence returned from create to be signaled. The default timeout value in
destroy is 10ms, which is too short.
Let's wait on both fences with the specific timeout.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu
-by: xinhui pan
---
change from v1:
add enter/exit in more gmc_set_pte_pde callers
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 11 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 11 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 28 +---
3 files changed, 36
1.09% [drm] [k] drm_dev_exit
So move drm_dev_enter/exit out of the gmc code and let the callers do it instead.
The callers are gart_unbind, gart_map, vm_cpu_update (already held by its
caller) and gmc_init_pdb0 (not needed).
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 11
alloc extra msg from direct IB pool.
Reviewed-by: Christian König
Signed-off-by: xinhui pan
---
change from v1:
msg is aligned to gpu page boundary
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 97 +++--
1 file changed, 44 insertions(+), 53 deletions(-)
diff --git
alloc extra msg from direct IB pool.
Signed-off-by: xinhui pan
---
change from v1:
msg is allocated separately.
msg is aligned to gpu page boundary
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 27 -
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/drivers
move BO allocation in sw_init.
Signed-off-by: xinhui pan
---
change from v3:
drop the bo resv lock in ib test.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 102
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 1 +
drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 11 +--
drivers
Let TTM know if a vendor sets a new ttm manager out of bounds by adding
a BUILD_BUG_ON.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_range_manager.c | 8
include/drm/ttm/ttm_device.h| 3 +++
include/drm/ttm/ttm_range_manager.h | 18 --
3 files
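
The idea of catching an out-of-bounds manager index at build time can be sketched in plain C11 with `static_assert`, which plays a role similar to the kernel's BUILD_BUG_ON. This is an illustrative sketch only: `MAX_MEM_TYPES`, `DRIVER_PRIV_TYPE`, `mem_type_table` and `set_mem_type` are hypothetical names and values, not the TTM definitions.

```c
#include <assert.h>

/* Hypothetical size of the per-device manager table. */
#define MAX_MEM_TYPES 8

/* Hypothetical driver-private memory type index. */
#define DRIVER_PRIV_TYPE 5

/* Compilation fails here if the driver picks an index outside the table. */
static_assert(DRIVER_PRIV_TYPE < MAX_MEM_TYPES,
	      "memory type index out of bounds");

static int mem_type_table[MAX_MEM_TYPES];

/* Store a value in the slot whose index was bounds-checked at build time. */
static void set_mem_type(int value)
{
	mem_type_table[DRIVER_PRIV_TYPE] = value;
}
```

The check costs nothing at run time; changing `DRIVER_PRIV_TYPE` to 8 or more in this sketch turns the mistake into a compile error instead of a silent out-of-bounds write.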
Park the ring scheduler thread before accessing the ring.
And unpark it only when we finish accessing the ring.
The right sequence should be like below.
lock ring
park ring thread
direct access ring
[unlock ring, do something, lock ring]
unpark ring thread
unlock ring
Signed-off-by: xinhui pan
This is used for direct IB submission to ring.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
b/drivers/gpu/drm/amd/amdgpu
Let TTM know if a vendor sets a new ttm manager out of bounds by adding
a BUILD_BUG_ON.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_range_manager.c | 2 ++
include/drm/ttm/ttm_device.h| 3 +++
include/drm/ttm/ttm_range_manager.h | 10 ++
3 files changed, 15
Direct IB submission should be exclusive. So use write lock.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
b/drivers/gpu/drm/amd/amdgpu
Direct IB submission should be exclusive. So use write lock.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
b/drivers/gpu/drm/amd/amdgpu
alloc extra msg from direct IB pool.
Signed-off-by: xinhui pan
---
change from v1:
let addr align up to gpu page boundary.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 27 -
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
alloc extra msg from direct IB pool.
Reviewed-by: Christian König
Signed-off-by: xinhui pan
---
change from v1:
let addr align up to gpu page boundary.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 97 +++--
1 file changed, 44 insertions(+), 53 deletions(-)
diff --git
move BO allocation in sw_init.
Signed-off-by: xinhui pan
---
change from v2:
use reservation trylock for direct IB test.
change from v1:
only use pre-allocated BO for direct IB submission.
and take its reservation lock to avoid any potential race.
better safe than sorry.
---
drivers/gpu/drm/amd
move BO allocation in sw_init.
Signed-off-by: xinhui pan
---
change from v1:
only use pre-allocated BO for direct IB submission.
and take its reservation lock to avoid any potential race.
better safe than sorry.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 103 +---
drivers
alloc extra msg from direct IB pool.
Signed-off-by: xinhui pan
---
change from v1:
let addr align up to gpu page boundary.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 27 -
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
alloc extra msg from direct IB pool.
Reviewed-by: Christian König
Signed-off-by: xinhui pan
---
change from v1:
let addr align up to gpu page boundary.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 97 +++--
1 file changed, 44 insertions(+), 53 deletions(-)
diff --git
alloc extra msg from direct IB pool.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 99 +++--
1 file changed, 45 insertions(+), 54 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index
alloc extra msg from direct IB pool.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 18 +++---
1 file changed, 3 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index
move BO allocation in sw_init.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 75 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 1 +
drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 8 +--
drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 8 +--
4 files
Direct IB pool is used for vce/vcn IB extra msg too. Increase its size
to AMDGPU_IB_POOL_SIZE.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
b/drivers/gpu
Let vce/uvd/vcn use it to avoid memory allocation during IB test.
This is useful when memory is nearly used up and no BO can be
evicted/swapped out.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 51 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 9 +-
drivers
Two dedicated VRAM and GTT BOs for IB test.
Signed-off-by: xinhui pan
---
change from v1
check the existence of uvd and clean the code
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
Let vce/uvd/vcn use it to avoid memory allocation during IB test.
This is useful when memory is nearly used up and no BO can be
evicted/swapped out.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 51 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 9 +-
drivers
Two dedicated VRAM and GTT BOs for IB test.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 54 ++
3 files changed, 64 insertions(+)
diff
There is one dedicated IB pool for IB test. So let's use it for uvd msg
too.
For some older HW, use one reserved BO at specific range.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 173 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 2 +
2 files
There is one dedicated IB pool for IB test. So let's use it for uvd msg
too.
For some older HW, use one reserved BO at specific range.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 174 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 2 +
2 files
There is one dedicated IB pool for IB test. So let's use it for uvd msg
too.
For some older HW, use one reserved BO at specific range.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 174 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h | 2 +
2 files
There is one dedicated IB pool for IB test. So let's use it for uvd msg
too.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 223 +---
1 file changed, 126 insertions(+), 97 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
b/drivers
There is one dedicated IB pool for IB test. So let's use it for uvd msg
too.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 223 +---
1 file changed, 126 insertions(+), 97 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
b/drivers
There is one dedicated IB pool for IB test. So let's use it for uvd msg
too.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 223 +---
1 file changed, 126 insertions(+), 97 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
b/drivers
wapout) will
be stuck as we actually did not free any BO memory. This usually happens
when the fence is not signaled for a long time.
Signed-off-by: xinhui pan
Reviewed-by: Christian König
---
drivers/gpu/drm/ttm/ttm_bo.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/g
wapout) will
be stuck as we actually did not free any BO memory. This usually happens
when the fence is not signaled for a long time.
Signed-off-by: xinhui pan
Reviewed-by: Christian König
---
drivers/gpu/drm/ttm/ttm_bo.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/g
As with vce/vcn, visible VRAM is OK for the ib test.
Commit a11d9ff3ebe0 ("drm/amdgpu: use GTT for
uvd_get_create/destory_msg") says VRAM is not mapped correctly on its
author's platform, which is likely arm64.
So let's change back to using VRAM on x86_64 platforms.
Signed-off-by:
dgpu_bo_unmap_and_free(bo1,
bo1_va_handle, bo1_mc,
sdma_write_length);
CU_ASSERT_EQUAL(r, 0);
--
*** BLURB HERE ***
xinhui pan (2):
drm/ttm: Fix a deadlock if the target BO is not idle during swap
drm/amdpgu: Use VRAM domain in UVD IB test
drivers/gpu/drm/amd/amd
]
destroy_queue_cpsch+0x20c/0x330 [amdgpu]
pqm_destroy_queue+0x1a3/0x390 [amdgpu]
kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
Signed-off-by: xinhui pan
---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 15 ++-
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 +++-
.../drm/amd
To avoid any list corruption.
Signed-off-by: xinhui pan
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 22 ++-
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b/drivers/gpu/drm/amd/amdkfd
[amdgpu]
Signed-off-by: xinhui pan
---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 13 +
drivers/gpu/drm/amd/amdkfd/kfd_process.c| 4 +++-
.../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 ++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git
To avoid any list corruption.
Signed-off-by: xinhui pan
---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c| 12
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b/drivers/gpu/drm/amd/amdkfd
TTM does page counting on userptr BOs, which is actually not
needed. To avoid that, let's set TTM_PAGE_FLAG_SG after tt_create and
before tt_populate.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff
=7
pid=2511
kfree+0x322/0x340
free_mqd_hiq_sdma+0x20/0x60 [amdgpu]
destroy_queue_cpsch+0x20c/0x330 [amdgpu]
pqm_destroy_queue+0x1a3/0x390 [amdgpu]
kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 ++
drivers
-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_tt.c | 32 +---
1 file changed, 13 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index a1a25410ec74..4fa0a8cd71c0 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu
We call free_mqd without holding the dqm lock, which causes a double free of
mqd_mem_obj. Fix it by using a tmp pointer.
We need to walk through the queues_list with the dqm lock held. Otherwise we hit
list corruption.
Signed-off-by: xinhui pan
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 17
ss_zero))
dma_resv_unlock(resv)
unlock lru_lock
To fix it simply, let's acquire lru_lock before resv trylock to avoid
the race above.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 ++
1 file changed, 2 ins
Swapping a ttm object which has no backend pages makes no sense.
Suggested-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
ctually this is not a bug if trylock fails. So use dma_resv_lock
instead.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd
n BO A // memory overwritten
To fix this issue, we can set TTM_PAGE_FLAG_SG when we create the userptr BO
ttm. Then the swapout function will not swap it.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_device.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 510e3e001dab..a9772fcc8f9c 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm
ion too. But I hit page fault mostly.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdg
0 [amdgpu]
[ 1236.046912] kfd_ioctl+0x463/0x690 [amdgpu]
[ 1236.051632] ? kfd_dev_is_large_bar+0xf0/0xf0 [amdgpu]
[ 1236.057360] __x64_sys_ioctl+0x91/0xc0
[ 1236.061457] do_syscall_64+0x38/0x90
[ 1236.065383] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1236.070920] RIP: 0033:0x7f5013dbe50b
Signed-off-b
ttm->num_pages is uint32, so shifting it by PAGE_SHIFT directly overflows.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/radeon/radeon_ttm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
b/drivers/gpu/drm/radeon/radeon_tt
ttm->num_pages is uint32, so shifting it by PAGE_SHIFT directly overflows.
Fix: 230c079fd (drm/ttm: make num_pages uint32_t)
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amd
ttm->num_pages is uint32, so shifting it by PAGE_SHIFT directly overflows.
Fix: 230c079fd (drm/ttm: make num_pages uint32_t)
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amd
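
The num_pages overflow described in these commit messages comes from C's arithmetic rules: a shift on a 32-bit operand is evaluated in 32 bits, and the result wraps before it is widened. A minimal sketch, assuming a 32-bit `unsigned int` and a page shift of 12; `TOY_PAGE_SHIFT` and both function names are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page shift (4 KiB pages) for illustration. */
#define TOY_PAGE_SHIFT 12

/* The sketch assumes unsigned int is 32 bits wide. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Buggy form: the shift happens in 32 bits and wraps before widening. */
static uint64_t size_bytes_wrapped(uint32_t num_pages)
{
	return num_pages << TOY_PAGE_SHIFT;
}

/* Fixed form: widen to 64 bits first, then shift. */
static uint64_t size_bytes_widened(uint32_t num_pages)
{
	return (uint64_t)num_pages << TOY_PAGE_SHIFT;
}
```

For 0x100000 pages (4 GiB with 4 KiB pages) the first function returns 0, while the second returns 0x100000000.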
To make the size 4-byte aligned, use &~0x3ULL instead of &3ULL.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm
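
The difference between the two masks in this commit message is easy to see in isolation: `& ~0x3ULL` clears the low two bits (rounding down to a multiple of 4), whereas `& 0x3ULL` keeps only those two bits. A small sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* The two masks are complementary: no bit is set in both. */
static_assert((~0x3ULL & 0x3ULL) == 0, "masks must not share bits");

/* Round size down to a multiple of 4 by clearing the low two bits. */
static uint64_t align_down_4(uint64_t size)
{
	return size & ~0x3ULL;
}

/* The mistaken mask: keeps only the low two bits, i.e. size modulo 4. */
static uint64_t low_two_bits(uint64_t size)
{
	return size & 0x3ULL;
}
```

For a size of 1027, `align_down_4` yields 1024, while `low_two_bits` yields 3, which is why the wrong mask produces a nearly empty size instead of an aligned one.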
drm_gem_object_put() should be paired with drm_gem_object_lookup().
All gem objs are saved in fb->base.obj[]. We need to put the old obj first
before assigning a new one.
The VRAM leak can be triggered by running the command below
$ service gdm restart
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amd
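
The pairing rule in this commit message (drop the reference held by a slot before storing a new object in it) can be modelled with a plain counter. This is a toy sketch, not the DRM GEM API: `struct toy_obj`, `toy_lookup`, `toy_put` and `toy_slot_assign` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted object standing in for a gem object. */
struct toy_obj {
	int refcount;
};

/* Taking a reference, as a lookup would. */
static struct toy_obj *toy_lookup(struct toy_obj *obj)
{
	obj->refcount++;
	return obj;
}

/* Dropping a reference; NULL is tolerated so empty slots need no special case. */
static void toy_put(struct toy_obj *obj)
{
	if (!obj)
		return;
	assert(obj->refcount > 0); /* a put without a matching lookup is a bug */
	obj->refcount--;
}

/* Put the old occupant first, then store a new reference in the slot. */
static void toy_slot_assign(struct toy_obj **slot, struct toy_obj *new_obj)
{
	toy_put(*slot);
	*slot = toy_lookup(new_obj);
}
```

If the `toy_put(*slot)` call is left out, the old object's count never returns to its starting value, which is the toy equivalent of the leak described above.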
A BO is added to the swap list if it is validated into the system domain.
If the BO is validated again into a non-system domain, say, the VRAM domain, it
should no longer be in the swap list.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
1 file changed, 2 insertions(+)
diff --git
Free the memory on failure.
Also no need to re-alloc memory on retry.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/ttm/ttm_bo.c | 9 ++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index e38102282fd5
Flag TTM_PL_FLAG_CONTIGUOUS is only valid for VRAM domain. So fix the
false positive by checking memory type too.
Suggested-by: Felix Kuehling
Acked-by: Christian König
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
1 file changed, 2 insertions(+), 1
Size is page count here.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 4a85f8cedd77..11dd3d9eac15 100644
during vm init and bo
moving.
But it looks like we forgot to reserve the immediate shared fence slot
during vm fault.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm
-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37221b99ca96..9e0116c7f8d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b
ves.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37221b99ca96..77689cecd189 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
++
979.354934] CR2: 94dfc4bc
[ 979.358566] ---[ end trace 5b622843e4242519 ]---
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 104 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 18 +---
2 files changed, 80 insertions(+), 42 deletions(-)
diff
A RAS error can occur during gpu recovery. We cannot add its list head
to two lists at the same time.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 18 +++---
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b
The delayed delete list is per device and might be huge. In
a heavy workload test, the list might never be empty. That will
trigger RCU stall warnings or softlockups in non-preemptible kernels.
Let's break out of the loops in that case.
Signed-off-by: xinhui pan
---
drivers
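
One common way to keep a long cleanup loop from monopolising a CPU is to give each pass a budget and stop when it is used up, so the caller can yield and come back later. The sketch below only illustrates that general pattern; `toy_process_batch` and its budget parameter are hypothetical and are not taken from the patch.

```c
#include <assert.h>

/*
 * Process at most 'budget' entries of a list that currently holds
 * 'pending' entries, and return how many entries remain afterwards.
 */
static unsigned int toy_process_batch(unsigned int pending,
				      unsigned int budget)
{
	unsigned int done = 0;

	assert(budget > 0); /* a zero budget would never make progress */
	while (pending > 0) {
		if (done == budget)
			break; /* give up the CPU; the caller reschedules */
		pending--;
		done++;
	}
	return pending;
}
```

A caller would invoke this repeatedly, yielding between calls, until it returns 0.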
The delayed delete list is per device and might be huge. In
a heavy workload test, the list might never be empty. That will
trigger RCU stall warnings or softlockups in non-preemptible kernels.
Let's schedule out if possible in that case.
Signed-off-by: xinhui pan
We have three ib pools: the normal, VM and direct pools.
Any jobs which schedule IBs without depending on the gpu scheduler should
use the DIRECT pool.
Any jobs that schedule direct VM update IBs should use the VM pool.
Any other jobs use the NORMAL pool.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd
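
The three rules in this commit message amount to a small decision function. The sketch below is an assumption-laden illustration: the enum, the struct fields and `toy_pick_pool` are hypothetical names, and checking the VM case before the DIRECT case is an assumed ordering that the message itself does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool identifiers mirroring the three pools named above. */
enum toy_ib_pool {
	TOY_IB_POOL_NORMAL,
	TOY_IB_POOL_VM,
	TOY_IB_POOL_DIRECT,
};

/* Hypothetical job description used only for this sketch. */
struct toy_job {
	bool direct_vm_update;   /* job schedules a direct VM update IB */
	bool bypasses_scheduler; /* IB does not depend on the gpu scheduler */
};

/* Pick a pool; the VM-before-DIRECT ordering is an assumption. */
static enum toy_ib_pool toy_pick_pool(const struct toy_job *job)
{
	assert(job != NULL);
	if (job->direct_vm_update)
		return TOY_IB_POOL_VM;
	if (job->bypasses_scheduler)
		return TOY_IB_POOL_DIRECT;
	return TOY_IB_POOL_NORMAL;
}
```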
use corresponding ib pool for each job
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c| 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 5 +++--
drivers/gpu
We have three ib pools: the normal, VM and direct pools.
Any jobs which schedule IBs without depending on the gpu scheduler should
use the DIRECT pool.
Any jobs that schedule direct VM update IBs should use the VM pool.
Any other jobs use the NORMAL pool.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd