Re: [PATCH 00/13] shadow page table support

Christian König Mon, 25 Jul 2016 03:32:00 -0700

First of all, patches #10 and #11 look like bug fixes to existing code to me, so we should fix those problems before working on anything else.


Patch #10 is Reviewed-by: Christian König <[email protected]>


Patch #11:

     list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
         struct amd_sched_fence *s_fence = s_job->s_fence;
-        struct fence *fence = sched->ops->run_job(s_job);
+        struct fence *fence;

+        spin_unlock(&sched->job_list_lock);
+        fence = sched->ops->run_job(s_job);
         atomic_inc(&sched->hw_rq_count);
         if (fence) {
             s_fence->parent = fence_get(fence);

@@ -451,6 +453,7 @@ void amd_sched_job_recovery(struct amd_gpu_scheduler *sched)

             DRM_ERROR("Failed to run job!\n");
             amd_sched_process_job(NULL, &s_fence->cb);
         }
+        spin_lock(&sched->job_list_lock);
     }
     spin_unlock(&sched->job_list_lock);

The problem is that the job might complete while the lock is dropped, which removes it from ring_mirror_list in the middle of the iteration.

Please use list_for_each_entry_safe here and add a comment explaining why the list could be modified in the meantime.
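
For illustration, here is a minimal userspace sketch of why the _safe iteration matters. It is not the scheduler code: struct job, run_job() and recover_jobs() are hypothetical stand-ins, the list helpers are a cut-down imitation of the kernel's struct list_head, and the locking is only indicated in comments. The "safe" loop caches the next entry before the body runs, so the current entry may be unlinked while the lock would be dropped. Note that this only covers removal of the current entry; if the cached next entry could also go away in that window, more care is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down userspace imitation of the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Unlink an entry and make it point at itself, like list_del_init(). */
static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical job type; only the list linkage matters here. */
struct job { int id; struct list_head node; };

/* Hypothetical stand-in for run_job(): the job "completes" immediately
 * and removes itself from the mirror list. */
static void run_job(struct job *j) { list_del_init(&j->node); }

/* Walk the list the way list_for_each_entry_safe does: remember the
 * next entry before the body can unlink the current one. */
static int recover_jobs(struct list_head *mirror)
{
    struct list_head *pos, *tmp;
    int ran = 0;

    for (pos = mirror->next, tmp = pos->next; pos != mirror;
         pos = tmp, tmp = pos->next) {
        struct job *j = container_of(pos, struct job, node);

        /* the real code would drop job_list_lock here ... */
        run_job(j);
        /* ... and take it again here */
        ran++;
    }
    return ran;
}

/* Builds a three-entry list, recovers it, and returns the job count. */
int demo_recovery(void)
{
    struct list_head mirror;
    struct job jobs[3];
    int i, ran;

    list_init(&mirror);
    for (i = 0; i < 3; i++) {
        jobs[i].id = i;
        list_add_tail(&jobs[i].node, &mirror);
    }
    ran = recover_jobs(&mirror);
    assert(mirror.next == &mirror); /* every job unlinked itself */
    return ran;
}
```

With a plain (non-safe) walk, the loop would follow the next pointer of an entry that has just been unlinked, which in this sketch points back at the entry itself.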

With that fixed, the patch is Reviewed-by: Christian König <[email protected]> as well.

The remaining set looks very good to me as well, but I was rather thinking of a more general approach instead of making it VM PD/PT specific.

For example, we also need to back up and restore shaders when a hard GPU reset happens.


So I would suggest the following:

1. We add an optional "shadow" flag so that when a BO in VRAM is allocated we also allocate a shadow BO in GART.

2. We have another "backup" flag that says the BO is backed up from VRAM to GART before the next command submission.

3. We set the shadow flag for VM PD/PT BOs, and every time we modify them we set the backup flag so they get backed up on the next CS.

4. We add an IOCTL to allow setting the backup flag from userspace so that we can trigger another backup even after the first CS.
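
The four points above could be modelled roughly as follows. This is a userspace sketch under stated assumptions, not amdgpu code: BO_FLAG_SHADOW, BO_FLAG_BACKUP, struct bo and all bo_*/cs_* functions are hypothetical names, plain heap buffers stand in for VRAM and GART, and memcpy() stands in for the actual copy engine.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag names, made up for this sketch. */
#define BO_FLAG_SHADOW (1u << 0) /* point 1: BO has a shadow copy in GART */
#define BO_FLAG_BACKUP (1u << 1) /* point 2: back up before the next CS   */

struct bo {
    uint32_t flags;
    size_t size;
    uint8_t *vram;   /* stand-in for the VRAM placement */
    uint8_t *shadow; /* stand-in for the GART shadow, NULL if none */
};

/* Point 1: allocate the shadow together with the BO when requested. */
int bo_create(struct bo *bo, size_t size, uint32_t flags)
{
    bo->flags = flags & BO_FLAG_SHADOW;
    bo->size = size;
    bo->vram = calloc(1, size);
    bo->shadow = (flags & BO_FLAG_SHADOW) ? calloc(1, size) : NULL;
    if (!bo->vram || ((flags & BO_FLAG_SHADOW) && !bo->shadow)) {
        free(bo->vram);
        free(bo->shadow);
        return -1;
    }
    return 0;
}

void bo_destroy(struct bo *bo)
{
    free(bo->vram);
    free(bo->shadow);
}

/* Point 4: what a userspace-triggered request would boil down to. */
void bo_request_backup(struct bo *bo)
{
    if (bo->flags & BO_FLAG_SHADOW)
        bo->flags |= BO_FLAG_BACKUP;
}

/* Point 3: every modification of a shadowed BO marks it for backup. */
void bo_write(struct bo *bo, size_t offset, uint8_t value)
{
    bo->vram[offset] = value;
    bo_request_backup(bo);
}

/* Point 2: copy VRAM to the shadow before the submission runs. */
void cs_submit(struct bo *bo)
{
    if (bo->flags & BO_FLAG_BACKUP) {
        memcpy(bo->shadow, bo->vram, bo->size);
        bo->flags &= ~BO_FLAG_BACKUP;
    }
    /* the actual command submission would follow here */
}

/* After a reset, restore the VRAM content from the shadow. */
void bo_restore_after_reset(struct bo *bo)
{
    if (bo->flags & BO_FLAG_SHADOW)
        memcpy(bo->vram, bo->shadow, bo->size);
}
```

Nothing in this model is specific to page tables, so a shader BO could use the same two flags, which is the point of the more general approach.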


What do you think?

Regards,
Christian.

Am 25.07.2016 um 09:22 schrieb Chunming Zhou:

Since we cannot be sure VRAM is intact after a GPU reset, page table backup
is necessary; a shadow page table is a sensible way to recover the page table
when a GPU reset happens.
We need to allocate a GTT bo as the shadow of the VRAM bo when creating a page
table, and keep the two identical. After a GPU reset, we use SDMA to copy the
GTT bo content to the VRAM bo, and the page table is recovered.

Chunming Zhou (13):
   drm/amdgpu: add pd/pt bo shadow
   drm/amdgpu: update shadow pt bo while update pt
   drm/amdgpu: update pd shadow while updating pd
   drm/amdgpu: implement amdgpu_vm_recover_page_table_from_shadow
   drm/amdgpu: link all vm clients
   drm/amdgpu: add vm_list_lock
   drm/amd: add block entity function
   drm/amdgpu: recover page tables after gpu reset
   drm/amdgpu: add vm recover pt fence
   drm/amd: reset hw count when reset job
   drm/amd: fix deadlock of job_list_lock
   drm/amd: wait neccessary dependency before running job
   drm/amdgpu: fix sched deadoff

  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  17 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  12 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  30 ++++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |   5 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   5 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 161 ++++++++++++++++++++++++--
  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  35 +++++-
  drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |   3 +
  8 files changed, 250 insertions(+), 18 deletions(-)


_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
