Reviewed-by: James Zhu
Thanks & Best Regards!
James
On 2021-09-17 7:30 a.m., Andrey Grodzovsky wrote:
Add more guards to MMIO access post device
unbind/unplug
Bug: https://bugs.archlinux.org/task/72092
for PCIe error recovery to avoid
accessing registers. This allows the PM resume sequence to complete
successfully and PCI remove to finish.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
Add more guards to MMIO access post device
unbind/unplug
Bug: https://bugs.archlinux.org/task/72092
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 8 ++--
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 17 +++--
2 files changed
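For reference, the usual DRM guard for MMIO access after unplug is the drm_dev_enter()/drm_dev_exit() pair (the "enter/exit" referred to later in these threads); a minimal sketch of wrapping a register write with it, where the helper and register names are illustrative rather than the actual vcn_v2_x hunks:

#include <drm/drm_drv.h>

/* Skip the MMIO access entirely once the device has been unplugged:
 * drm_dev_enter() returns false after drm_dev_unplug() has been called. */
static void example_stop_block(struct amdgpu_device *adev)
{
        int idx;

        if (!drm_dev_enter(adev_to_drm(adev), &idx))
                return; /* device is gone, don't touch registers */

        WREG32(mmEXAMPLE_CNTL, 0);      /* hypothetical register */

        drm_dev_exit(idx);
}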
On 2021-09-16 11:51 a.m., Lazar, Lijo wrote:
On 9/16/2021 9:15 PM, Andrey Grodzovsky wrote:
On 2021-09-16 4:20 a.m., Lazar, Lijo wrote:
A minor comment below.
On 9/16/2021 1:11 AM, Andrey Grodzovsky wrote:
Crash:
BUG: unable to handle page fault for address: 10e1
RIP: 0010
On 2021-09-16 4:20 a.m., Lazar, Lijo wrote:
A minor comment below.
On 9/16/2021 1:11 AM, Andrey Grodzovsky wrote:
Crash:
BUG: unable to handle page fault for address: 10e1
RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu]
Call Trace:
pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu
I fixed 2 regressions in the latest code, applied your patch on top and
passed the libdrm tests
on Vega 10. You can pick up those 2 patches and try too if you have time.
In any case -
Reviewed-and-tested-by: Andrey Grodzovsky
Andrey
On 2021-09-15 2:37 a.m., xinhui pan wrote:
We hit soft hang while
Why:
DC core is being released from DM before the hpd_rx wq destruction
code, which still references it, runs.
How: Move hpd_rx destruction before DC core destruction.
Signed-off-by: Andrey Grodzovsky
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 24 +--
1 file changed, 12 insertions
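As a rough sketch of the ordering fix described above (dc_destroy() is the real DC teardown call; the workqueue field name and the fini wrapper are assumptions about amdgpu_dm.c):

/* Sketch only: destroy the hpd_rx work while dm->dc is still valid, because
 * the work items dereference the DC core; only then release the DC core. */
static void example_dm_fini(struct amdgpu_display_manager *dm)
{
        if (dm->hpd_rx_offload_wq) {            /* field name assumed */
                destroy_workqueue(dm->hpd_rx_offload_wq);
                dm->hpd_rx_offload_wq = NULL;
        }

        dc_destroy(&dm->dc);                    /* DC core released last */
}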
ee6679aaa61c drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 24 ---
drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 24 ---
drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c | 24 ---
drivers
Pushed
Andrey
On 2021-09-15 7:45 a.m., Christian König wrote:
Yes, I think so as well. Andrey can you push this?
Christian.
Am 15.09.21 um 00:59 schrieb Grodzovsky, Andrey:
AFAIK this one is independent.
Christian, can you confirm ?
Andrey
On 2021-09-15 9:57 a.m., Christian König wrote:
Am 15.09.21 um 15:52 schrieb Andrey Grodzovsky:
On 2021-09-15 2:42 a.m., Pan, Xinhui wrote:
[AMD Official Use Only]
Andrey
I hit panic with this plug/unplug test without this patch.
Can you please tell which ASIC you are using and which
On 2021-09-15 2:42 a.m., Pan, Xinhui wrote:
[AMD Official Use Only]
Andrey
I hit panic with this plug/unplug test without this patch.
Can you please tell which ASIC you are using and which kernel branch and
what is the tip commit ?
But as we add enter/exit in all its callers, maybe it
I think you missed 'reply all' so bringing back to public
On 2021-09-14 11:40 p.m., Pan, Xinhui wrote:
[AMD Official Use Only]
perf says it is the lock addl $0x0,-0x4(%rsp).
Details are below. The contention is huge, maybe.
Yes - that makes sense to me too as long as the lock here is some
On 2021-09-14 9:42 p.m., xinhui pan wrote:
We hit a soft hang while doing a memory pressure test on one NUMA system.
After a quick look, this is because KFD invalidates/validates userptr memory
frequently with the process_info lock held.
perf top says below,
75.81% [kernel] [k] __srcu_read_unlock
Please add a V2 tag in the description explaining the delta from V1.
Other than that it looks good to me.
Andrey
On 2021-09-12 7:48 p.m., xinhui pan wrote:
Direct IB submission should be exclusive. So use write lock.
Signed-off-by: xinhui pan
---
On 2021-09-02 10:28 a.m., Daniel Vetter wrote:
On Tue, Aug 31, 2021 at 02:24:52PM -0400, Andrey Grodzovsky wrote:
On 2021-08-31 9:11 a.m., Daniel Vetter wrote:
On Thu, Aug 26, 2021 at 11:04:14AM +0200, Daniel Vetter wrote:
On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote
On 2021-09-01 12:40 a.m., Jingwen Chen wrote:
On Wed Sep 01, 2021 at 12:28:59AM -0400, Andrey Grodzovsky wrote:
On 2021-09-01 12:25 a.m., Jingwen Chen wrote:
On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote:
I will answer everything here -
On 2021-08-31 9:58 p.m., Liu, Monk
On 2021-09-01 12:25 a.m., Jingwen Chen wrote:
On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote:
I will answer everything here -
On 2021-08-31 9:58 p.m., Liu, Monk wrote:
[AMD Official Use Only]
In the previous discussion, you guys stated that we should
I will answer everything here -
On 2021-08-31 9:58 p.m., Liu, Monk wrote:
[AMD Official Use Only]
In the previous discussion, you guys stated that we should drop the
“kthread_should_park” in cleanup_job.
@@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
{
On 2021-08-31 12:01 p.m., Luben Tuikov wrote:
On 2021-08-31 11:23, Andrey Grodzovsky wrote:
On 2021-08-31 10:38 a.m., Daniel Vetter wrote:
On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:
On 2021-08-31 10:03 a.m., Daniel Vetter wrote:
On Tue, Aug 31, 2021 at 09:53:36AM
On 2021-08-31 9:11 a.m., Daniel Vetter wrote:
On Thu, Aug 26, 2021 at 11:04:14AM +0200, Daniel Vetter wrote:
On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote:
On 2021-08-19 5:30 a.m., Daniel Vetter wrote:
On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote
On 2021-08-31 10:38 a.m., Daniel Vetter wrote:
On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:
On 2021-08-31 10:03 a.m., Daniel Vetter wrote:
On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:
It says patch [2/2] but I can't find patch 1
On 2021-08-31 6
On 2021-08-31 10:03 a.m., Daniel Vetter wrote:
On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:
It says patch [2/2] but I can't find patch 1
On 2021-08-31 6:35 a.m., Monk Liu wrote:
tested-by: jingwen chen
Signed-off-by: Monk Liu
Signed-off-by: jingwen chen
It says patch [2/2] but I can't find patch 1
On 2021-08-31 6:35 a.m., Monk Liu wrote:
tested-by: jingwen chen
Signed-off-by: Monk Liu
Signed-off-by: jingwen chen
---
drivers/gpu/drm/scheduler/sched_main.c | 24
1 file changed, 4 insertions(+), 20 deletions(-)
On 2021-08-30 11:24 p.m., Pan, Xinhui wrote:
[AMD Official Use Only]
Unreserve the root BO before returning, otherwise the next allocation deadlocks.
Signed-off-by: xinhui pan
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 +--
1 file changed, 5 insertions(+), 6
empty before suspend.
v2: Call drm_sched_resubmit_job before drm_sched_start to
restart jobs from the pending list.
Suggested-by: Andrey Grodzovsky
Suggested-by: Christian König
Signed-off-by: Guchun Chen
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8
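For context, a minimal sketch of the resubmit-before-start ordering being discussed (these are the existing DRM scheduler entry points; the surrounding reset path is simplified):

#include <drm/gpu_scheduler.h>

/* Simplified recovery ordering: re-push the jobs still on the pending list
 * to the hardware ring first, then unpark the scheduler and re-arm the
 * timeout so the restarted jobs are actually being timed. */
static void example_restart_scheduler(struct drm_gpu_scheduler *sched)
{
        drm_sched_resubmit_jobs(sched);
        drm_sched_start(sched, true);
}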
On 2021-08-30 1:05 p.m., Christian König wrote:
Am 30.08.21 um 19:02 schrieb Andrey Grodzovsky:
On 2021-08-30 12:51 p.m., Christian König wrote:
Am 30.08.21 um 16:16 schrieb Andrey Grodzovsky:
On 2021-08-30 4:58 a.m., Christian König wrote:
Am 27.08.21 um 22:39 schrieb Andrey
On 2021-08-30 12:51 p.m., Christian König wrote:
Am 30.08.21 um 16:16 schrieb Andrey Grodzovsky:
On 2021-08-30 4:58 a.m., Christian König wrote:
Am 27.08.21 um 22:39 schrieb Andrey Grodzovsky:
This list will be used to capture all non-VRAM BOs not
on the LRU, so when the device is hot unplugged we
On 2021-08-30 4:58 a.m., Christian König wrote:
Am 27.08.21 um 22:39 schrieb Andrey Grodzovsky:
This list will be used to capture all non-VRAM BOs not
on the LRU, so when the device is hot unplugged we can iterate
the list and unmap DMA mappings before the device is removed.
v2: Rename function
Handle all DMA IOMMU group related dependencies before the
group is removed, so we don't try to access it after free.
v2:
Move the actual handling function to TTM
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
1 file changed, 2 insertions(+)
diff --git
Switch to ttm_tt_unpopulate
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/ttm/ttm_device.c | 47
include/drm/ttm/ttm_device.h | 1 +
2 files changed, 48 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index
To support libdrm tests.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6400259a7c4b..c2fdf67ff551 100644
This list will be used to capture all non-VRAM BOs not
on the LRU, so when the device is hot unplugged we can iterate
the list and unmap DMA mappings before the device is removed.
v2: Rename function to ttm_bo_move_to_pinned
v3: Move the pinned list to ttm device
Signed-off-by: Andrey Grodzovsky
Suggested
IOMMU handling to TTM layer.
v3:
Move pinned list to ttm device and a few others.
Andrey Grodzovsky (4):
drm/ttm: Create pinned list
drm/ttm: Clear all DMA mappings on demand
drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case
drm/amdgpu: Add a UAPI flag for hot plug/unplug
drivers/gpu
I don't think it will start/stop twice because
amdgpu_fence_driver_hw_fini/init is not called during reset.
I am worried about calling drm_sched_start without calling
drm_sched_resubmit_job first, since that is
the place where the jobs are actually restarted. Also calling
drm_sched_start with
better than starting the timer when pushing the job to the
ring buffer, because that is completely off.
Christian.
Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
As I mentioned to Monk before - what about cases such as in this test
-
https://gitlab.freedesktop.org/mesa/drm/-/commit
into the
ring buffer, but rather when it starts processing.
Starting processing is a bit swampily defined, but just starting the
timer when the previous job completes should be fine enough.
Christian.
Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
The TS represents the point in time when the job
Ping
Andrey
On 2021-08-26 1:27 p.m., Andrey Grodzovsky wrote:
A bunch of fixes to enable passing the hotplug tests I previously added
here[1] with the latest code.
Once accepted I will enable the tests on the libdrm side.
[1] - https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/172
v2:
Dropping VCE
6.08.21 um 22:14 schrieb Andrey Grodzovsky:
Attached a quick patch for per-job TTL calculation to make the next timer
expiration more precise. It's on top of the patch in this thread. Let me
know if this makes sense.
Andrey
On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
On 2021-08-26 12:55 a.m.
: [PATCH] drm/sched: fix the bug of time out calculation(v3)
Attached a quick patch for per-job TTL calculation to make the next timer
expiration more precise. It's on top of the patch in this thread. Let me know if this
makes sense.
Andrey
On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
On 2021-08
Attached a quick patch for per-job TTL calculation to make the next timer
expiration more precise. It's on top of the patch in this thread. Let me
know if this makes sense.
Andrey
On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
On 2021-08-26 12:55 a.m., Monk Liu wrote:
issue:
in cleanup_job
Used by drivers supporting hot unplug to handle all
DMA IOMMU group related dependencies before the group
is removed during device removal, so we don't try to access
it after free when the last device pointer from user space
is dropped.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/ttm
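A rough sketch of what such a TTM-level helper could look like; ttm_tt_unpopulate() is the existing call, while the helper name, the device-level list and the locking (elided here) are assumptions about this series:

/* Hypothetical helper, locking elided: walk the BOs captured on a device
 * level list and tear down their DMA mappings before the IOMMU group goes
 * away together with the device. */
static void example_ttm_clear_dma_mappings(struct ttm_device *bdev)
{
        struct ttm_buffer_object *bo;

        list_for_each_entry(bo, &bdev->pinned, lru) {   /* list name assumed */
                if (bo->ttm)
                        ttm_tt_unpopulate(bo->bdev, bo->ttm);
        }
}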
To support libdrm tests.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6400259a7c4b..c2fdf67ff551 100644
IOMMU handling to TTM layer.
Andrey Grodzovsky (4):
drm/ttm: Create pinned list
drm/ttm: Clear all DMA mappings on demand
drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case
drm/amdgpu: Add a UAPI flag for hot plug/unplug
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +
drivers/gpu/drm
Handle all DMA IOMMU group related dependencies before the
group is removed, so we don't try to access it after free.
v2:
Move the actual handling function to TTM
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
1 file changed, 2 insertions(+)
diff --git
assigned to them.
Signed-off-by: Andrey Grodzovsky
Suggested-by: Christian König
---
drivers/gpu/drm/ttm/ttm_bo.c | 30 ++
drivers/gpu/drm/ttm/ttm_resource.c | 1 +
include/drm/ttm/ttm_resource.h | 1 +
3 files changed, 28 insertions(+), 4 deletions(-)
diff
On 2021-08-26 12:55 a.m., Monk Liu wrote:
issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer
even though its corresponding job is still running.
fix:
do not cancel the timer in cleanup_job, instead do the cancelling
only when the heading job is signaled, and if there is a
Ping
Andrey
On 2021-08-25 11:36 a.m., Andrey Grodzovsky wrote:
On 2021-08-25 2:43 a.m., Christian König wrote:
Am 24.08.21 um 23:01 schrieb Andrey Grodzovsky:
Handle all DMA IOMMU group related dependencies before the
group is removed, so we don't try to access it after free.
Signed-off
On 2021-08-26 12:55 a.m., Liu, Monk wrote:
[AMD Official Use Only]
But for the timer-pending case (the common case) your mod_delayed_work will effectively
do exactly the same if you don't use per-job TTLs - you mod it to the
sched->timeout value, which resets the pending timer to count from 0 again.
On 2021-08-25 10:31 p.m., Liu, Monk wrote:
[AMD Official Use Only]
Hi Andrey
I'm not quite sure if I read you correctly.
It seems to me you can only do it for an empty pending list, otherwise you risk
cancelling a legit new timer that was started by the next job, or not restarting
the timer at all.
On 2021-08-25 8:11 a.m., Christian König wrote:
No, this would break that logic here.
See drm_sched_start_timeout() can be called multiple times, this is
intentional and very important!
The logic in queue_delayed_work() makes sure that the timer is only
started once and then never again.
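To make the difference concrete: queue_delayed_work() does nothing if the work is already pending, while mod_delayed_work() re-arms a pending timer from now; a simplified sketch (not the actual drm_sched_start_timeout() body, field names as in the scheduler of that era):

/* Safe to call from multiple places: only the first caller actually arms
 * the timer, later calls are no-ops while the work is still pending. */
static void example_start_timeout(struct drm_gpu_scheduler *sched)
{
        if (sched->timeout != MAX_SCHEDULE_TIMEOUT)
                queue_delayed_work(sched->timeout_wq, &sched->work_tdr,
                                   sched->timeout);
}

/* By contrast, this restarts the countdown from "now" even if the work was
 * already queued - i.e. an unconditional call resets a pending timer back
 * to the full sched->timeout. */
static void example_rearm_timeout(struct drm_gpu_scheduler *sched)
{
        mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
}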
On 2021-08-25 2:43 a.m., Christian König wrote:
Am 24.08.21 um 23:01 schrieb Andrey Grodzovsky:
Handle all DMA IOMMU group related dependencies before the
group is removed, so we don't try to access it after free.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
here too.
https://lists.freedesktop.org/archives/amd-gfx/2021-August/067972.html
https://lists.freedesktop.org/archives/amd-gfx/2021-August/067967.html
BR
Evan
-Original Message-
From: amd-gfx On Behalf Of
Andrey Grodzovsky
Sent: Wednesday, August 25, 2021 5:01 AM
To: dri-de
To support libdrm tests.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6400259a7c4b..c2fdf67ff551 100644
Handle all DMA IOMMU group related dependencies before the
group is removed, so we don't try to access it after free.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 50 ++
drivers/gpu/drm/amd
Attempts to powergate after the device is removed lead to a crash.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 1 -
drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 4
drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 2 ++
4
This list will be used to capture all non-VRAM BOs not
on the LRU, so when the device is hot unplugged we can iterate
the list and unmap DMA mappings before the device is removed.
Signed-off-by: Andrey Grodzovsky
Suggested-by: Christian König
---
drivers/gpu/drm/ttm/ttm_bo.c | 24
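As an illustration of the idea (the helper follows the ttm_bo_move_to_pinned rename mentioned earlier in this series; the device-level list name and locking rules are assumptions):

/* Instead of simply dropping a pinned BO from the LRU, park it on a device
 * level "pinned" list so it can still be found and unmapped at hot-unplug
 * time. Caller is assumed to hold the LRU lock. */
static void ttm_bo_move_to_pinned(struct ttm_buffer_object *bo)
{
        list_move_tail(&bo->lru, &bo->bdev->pinned);    /* list name assumed */
}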
A bunch of fixes to enable passing the hotplug tests I previously added
here[1] with the latest code.
Once accepted I will enable the tests on the libdrm side.
[1] - https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/172
Andrey Grodzovsky (4):
drm/amdgpu: Move flush VCE idle_work during HW fini
drm
On 2021-08-24 5:51 a.m., Monk Liu wrote:
the original logic is wrong in that the timeout will not be retriggered
after the previous job signaled, and that leads to the scenario where all
jobs in the same scheduler share the same timeout timer from the very
beginning job in this scheduler, which is
On 2021-08-24 10:46 a.m., Andrey Grodzovsky wrote:
On 2021-08-24 5:51 a.m., Monk Liu wrote:
the original logic is wrong in that the timeout will not be retriggered
after the previous job signaled, and that leads to the scenario where all
jobs in the same scheduler share the same timeout timer
hursday, August 19, 2021 5:31 PM
To: Grodzovsky, Andrey
Cc: Daniel Vetter ; Alex Deucher
; Chen, JingWen ; Maling
list - DRI developers ; amd-gfx list
; Liu, Monk ; Koenig,
Christian
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
On Wed, Aug 18, 2021 at
in s3 test (v2)
Please go ahead. Thanks!
Alex
On Thu, Aug 19, 2021 at 8:05 AM Mike Lothian
wrote:
Hi
Do I need to open a new bug report for this?
Cheers
Mike
On Wed, 18 Aug 2021 at 06:26, Andrey Grodzovsky
wrote:
On 2021-08-02 1:16 a.m., Guchun Chen wrote
sky, Andrey
Cc: Daniel Vetter ; Alex Deucher ; Chen, JingWen
; Maling list - DRI developers ; amd-gfx list
; Liu, Monk ; Koenig, Christian
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrot
Maling list - DRI developers ; amd-gfx list
; Liu, Monk ; Koenig, Christian
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
On Wed, Aug 18
On 2021-08-19 5:30 a.m., Daniel Vetter wrote:
On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
On Wed, Aug 18
On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
On 2021-08-18 10:02 a.m., Alex Deucher wrote:
+ dri-devel
On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
On 2021-08-18 10:02 a.m., Alex Deucher wrote:
+ dri-devel
Since scheduler is a shared component, please add dri-devel on all
scheduler patches.
On Wed, Aug 18, 2021 at 7:21 AM
On 2021-08-18 10:02 a.m., Alex Deucher wrote:
+ dri-devel
Since scheduler is a shared component, please add dri-devel on all
scheduler patches.
On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen wrote:
[Why]
for a bailing job, this commit will delete it from the pending list, thus the
bailing job will
On 2021-08-02 1:16 a.m., Guchun Chen wrote:
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
scheduler in s3 test, otherwise, fence related failure will arrive
after resume. To fix this and for a better clean up, move drm_sched_fini
from fence_hw_fini to fence_sw_fini, as
Looks reasonable to me.
Reviewed-by: Andrey Grodzovsky
Andrey
On 2021-08-17 5:50 a.m., YuBiao Wang wrote:
[Why]
In some cases when we unload the driver, a warning call trace
will show up in vram_mgr_fini which claims that the LRU is not empty, caused
by the TTM BOs inside the delayed-delete queue.
[How]
We
On 2021-08-17 12:28 a.m., Jingwen Chen wrote:
[Why]
for a bailing job, this commit will delete it from the pending list, thus the
bailing job will never have a chance to be resubmitted even in advanced
tdr mode.
[How]
after embedding the hw_fence into amdgpu_job is done, the race condition that
this commit
Reviewed-by: Andrey Grodzovsky
Andrey
On 2021-08-09 11:22 p.m., Jingwen Chen wrote:
From: Jack Zhang
Why: Previously the hw fence was allocated separately from the job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take the fence to manage both the job's
and the fence's lifetime
On 2021-08-05 4:31 a.m., Jingwen Chen wrote:
[Why]
After embedding the hw_fence into amdgpu_job, we need to add tdr support
for this feature.
[How]
1. Add a resubmit_flag for resubmit jobs.
2. Clear job fence from RCU and force complete vm flush fences in
pre_asic_reset
3. skip dma_fence_get for
On 2021-08-05 4:31 a.m., Jingwen Chen wrote:
From: Jack Zhang
Why: Previously the hw fence was allocated separately from the job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take the fence to manage both the job's
and the fence's lifetime, and simplify the design of
On 2021-07-22 8:20 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 01:50:09PM -0400, Andrey Grodzovsky wrote:
On 2021-07-22 1:27 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 01:17:13PM -0400, Andrey Grodzovsky wrote:
On 2021-07-22 12:47 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021
On 2021-07-22 1:27 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 01:17:13PM -0400, Andrey Grodzovsky wrote:
On 2021-07-22 12:47 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
Am 22.07.21 um 16:45 schrieb Andrey Grodzovsky:
On 2021-07-22 6:45 a.m
On 2021-07-22 12:47 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
Am 22.07.21 um 16:45 schrieb Andrey Grodzovsky:
On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:
On 2021-07-20 11:13
On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:
On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
[Why]
After embedding the hw_fence into amdgpu_job, we need to add tdr support
for this feature.
[How]
1. Add a resubmit_flag for resubmit
On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
[Why]
After embedding the hw_fence into amdgpu_job, we need to add tdr support
for this feature.
[How]
1. Add a resubmit_flag for resubmit jobs.
2. Clear job fence from RCU and force complete vm flush fences in
pre_asic_reset
3. skip dma_fence_get for
Update callback signature and update implementation.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 6 ++--
drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 41 -
2 files changed, 16 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm
Add USBC PD FW implementation here to be used with relevant ASICs.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 66 ++
1 file changed, 66 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
b/drivers/gpu/drm/amd/amdgpu
The system memory-based implementation for updating the
USBC PD is deprecated in favor of switching
to the LFB-based implementation for all the ASICs.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 33 ++---
1 file changed, 13 insertions(+), 20 deletions
With the CPU domain added to the placement you can
now have 3 placements at once.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu
From: Lang Yu
Change mgr->available into mgr->used (invert the value).
It makes more sense to do it this way since we don't need the spinlock any
more to double-check the handling.
v3 (chk): separated from the TEMPORARY FLAG change.
Signed-off-by: Lang Yu
Signed-off-by: Christian König
---
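A sketch of the inverted bookkeeping (the manager struct and helper here are hypothetical; the point is that tracking "used" lets the check and the update be one atomic operation, with no spinlock):

/* Hypothetical GTT manager bookkeeping: track consumption directly. */
struct example_gtt_mgr {
        atomic64_t used;        /* pages currently consumed */
        u64 size;               /* total manager size in pages */
};

static bool example_gtt_mgr_reserve(struct example_gtt_mgr *mgr, u64 num_pages)
{
        if (atomic64_add_return(num_pages, &mgr->used) > mgr->size) {
                atomic64_sub(num_pages, &mgr->used);    /* roll back on overflow */
                return false;
        }
        return true;
}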
From: Lang Yu
Currently, we have a limited GTT memory size and need a bounce buffer
when doing buffer migration between the VRAM and SYSTEM domains.
The problem is that under GTT memory pressure we can't do buffer migration
between the VRAM and SYSTEM domains. But in some cases we really need that.
Problem:
Under memory pressure, when the GTT domain is almost full, a multihop assert
will come up when trying to evict an LRU BO from VRAM to SYSTEM.
Fix:
Don't assert on the multihop error in the evict code but rather do a retry,
as we do in ttm_bo_move_buffer
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu
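For context, TTM reports this case as -EMULTIHOP together with an intermediate "hop" placement; a simplified sketch of the retry loop described above (the two helpers stand in for the real TTM move/bounce functions):

static int example_move_to(struct ttm_buffer_object *bo,
                           struct ttm_operation_ctx *ctx,
                           struct ttm_place *hop);              /* stand-in for the TTM move */
static int example_bounce_through(struct ttm_buffer_object *bo,
                                  struct ttm_operation_ctx *ctx,
                                  struct ttm_place *hop);       /* bounce via the hop placement */

/* On -EMULTIHOP the eviction cannot be done in one step: bounce through the
 * intermediate placement and retry instead of treating it as a fatal error. */
static int example_evict_with_retry(struct ttm_buffer_object *bo,
                                    struct ttm_operation_ctx *ctx)
{
        struct ttm_place hop = {};
        int ret;

        do {
                ret = example_move_to(bo, ctx, &hop);
                if (ret != -EMULTIHOP)
                        break;
                ret = example_bounce_through(bo, ctx, &hop);
        } while (ret == 0);

        return ret;
}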
From: Christian König
When we run out of GTT we should still be able to evict VRAM->SYSTEM
with a bounce buffer.
drm/amdgpu: always allow evicting to SYSTEM domain
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 --
1 file changed, 4 insertions(+), 2
From: Lang Yu
Sometimes drivers need to use bounce buffers to evict BOs. While those reside
in some domain, they are not necessarily suitable for CS.
Add a flag so that drivers can note that a bounce buffer needs to be
reallocated during validation.
v2: add detailed comments
v3 (chk): merge
] libdrm tests for hot-unplug feature
Please open a gitlab MR for these.
Alex
On Tue, Jun 1, 2021 at 4:17 PM Andrey Grodzovsky
wrote:
Adding some tests to accompany the recently added hot-unplug
feature. For now the test suite is disabled until the feature
propagates from drm-misc-next to drm-next
On 2021-06-03 10:53 p.m., Alex Deucher wrote:
On Thu, Jun 3, 2021 at 9:37 PM Dave Airlie wrote:
On Fri, 4 Jun 2021 at 07:20, Alex Deucher wrote:
Please open a gitlab MR for these.
I'd really prefer these tests all get migrated out of here into igt. I
don't think libdrm_amdgpu really
Ping
Andrey
On 2021-06-02 10:20 a.m., Andrey Grodzovsky wrote:
On 2021-06-02 3:59 a.m., Daniel Vetter wrote:
On Tue, Jun 1, 2021 at 10:17 PM Andrey Grodzovsky
wrote:
Adding some tests to accompany the recently added hot-unplug
feature. For now the test suite is disabled until the feature
It calls drmNodeIsDRM, which is a private function itself, so if I
implement it in the amdgpu part
I still need to expose drmNodeIsDRM. Note that this function is
basically a subset of drmGetDeviceNameFromFd2
Andrey
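For reference, a small userspace example of drmGetDeviceNameFromFd2() (an existing libdrm call; the surrounding helper is illustrative):

#include <stdio.h>
#include <stdlib.h>
#include <xf86drm.h>

/* Print the /dev/dri/... node backing an already-open DRM fd. */
static void print_drm_node_path(int fd)
{
        char *name = drmGetDeviceNameFromFd2(fd);

        if (!name) {
                fprintf(stderr, "fd does not belong to a DRM device\n");
                return;
        }
        printf("DRM node: %s\n", name); /* e.g. a card or renderD node */
        free(name);                     /* the returned string is malloc'ed */
}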
On 2021-06-02 5:16 a.m., Simon Ser wrote:
Do we really need to make this a public
On 2021-06-02 3:59 a.m., Daniel Vetter wrote:
On Tue, Jun 1, 2021 at 10:17 PM Andrey Grodzovsky
wrote:
Adding some tests to accompany the recently added hot-unplug
feature. For now the test suite is disabled until the feature
propagates from drm-misc-next to drm-next.
Andrey Grodzovsky (7
Used to access device controls
Signed-off-by: Andrey Grodzovsky
---
xf86drm.c | 23 +++
xf86drm.h | 1 +
2 files changed, 24 insertions(+)
diff --git a/xf86drm.c b/xf86drm.c
index edfeb347..a5ecd323 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -4361,6 +4361,29 @@ drm_public
Add just the test suite skeleton.
Signed-off-by: Andrey Grodzovsky
---
tests/amdgpu/amdgpu_test.c | 11
tests/amdgpu/amdgpu_test.h | 23 +++
tests/amdgpu/hotunplug_tests.c | 116 +
tests/amdgpu/meson.build | 1 +
4 files changed, 151
Expose close device and add open device which preserves
test index.
Signed-off-by: Andrey Grodzovsky
---
tests/amdgpu/amdgpu_test.c | 31 ---
tests/amdgpu/amdgpu_test.h | 3 +++
2 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/tests/amdgpu
Disconnect device while BO is exported.
Signed-off-by: Andrey Grodzovsky
---
tests/amdgpu/hotunplug_tests.c | 46 --
1 file changed, 44 insertions(+), 2 deletions(-)
diff --git a/tests/amdgpu/hotunplug_tests.c b/tests/amdgpu/hotunplug_tests.c
index 6e133a07
Add plug/unplug device and open/close device file
infrastructure.
Add basic test - unplug device while device file still
open. Close device file afterwards and replug the device.
Signed-off-by: Andrey Grodzovsky
---
tests/amdgpu/hotunplug_tests.c | 135 +
1 file
Struct access after free
Signed-off-by: Andrey Grodzovsky
---
tests/amdgpu/basic_tests.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index 8e7c4916..8b7fd0f6 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests
Adding some tests to accompany the recently added hot-unplug
feature. For now the test suite is disabled until the feature
propagates from drm-misc-next to drm-next.
Andrey Grodzovsky (7):
tests/amdgpu: Fix valgrind warning
xf86drm: Add function to retrieve char device path
test/amdgpu: Add
Same as the simple test but while doing CS
Signed-off-by: Andrey Grodzovsky
---
tests/amdgpu/hotunplug_tests.c | 128 -
1 file changed, 126 insertions(+), 2 deletions(-)
diff --git a/tests/amdgpu/hotunplug_tests.c b/tests/amdgpu/hotunplug_tests.c
index c2bc1cf2