Patch #1 looks like a fix to me, please add my rb and push ASAP.

You should separate such fixes from bugger sets, send out individually and maybe ping Alex to review and add them to his fixes branch.

Patch #2:
+               amdgpu_ttm_placement_init(adev, &placement,
+                                         placements, AMDGPU_GEM_DOMAIN_GTT,
+                                         AMDGPU_GEM_CREATE_CPU_GTT_USWC);
+
+               r = amdgpu_bo_create_restricted(adev, size, byte_align, kernel,
+                                               AMDGPU_GEM_DOMAIN_GTT,
+                                               AMDGPU_GEM_CREATE_CPU_GTT_USWC,
+                                               NULL, &placement,
+                                               (*bo_ptr)->tbo.resv,
+                                               &(*bo_ptr)->shadow);
+               if (r)
+                       amdgpu_bo_unref(bo_ptr);

I think we should use the same reservation object as the parent BO here and add something like "(*bo_ptr)->shadow->parent = amdgpu_bo_ref(*bo_ptr);".

This way we can be sure that the shadow isn't freed before the BO.

+       } else
+               (*bo_ptr)->shadow = NULL;
From the coding style when an "if" has {} the else part needs {} as well. But I think BOs are zero allocated anyway, so we can probably completely drop that.

+               if (tbo == NULL)
Well that check is totally nonsense in the originally code as well. We should probably remove it.

Additional to that I would add something to amdgpu_cs_list_validate() to always validate the shadow when the real BO is validated.

Patch #3:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d415805..ae4608c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -875,6 +875,8 @@ struct amdgpu_ring {
  struct amdgpu_vm_pt {
        struct amdgpu_bo_list_entry     entry;
        uint64_t                        addr;
+       struct amdgpu_bo_list_entry     entry_shadow;
+       uint64_t                        addr_shadow;
  };
This becomes unnecessary when we use the same reservation object for the shadow and validate during CS and amdgpu_gem_va_update_vm().

Should reduce the CPU overhead of this significantly, cause otherwise we need to reserve and validate another BO for each page table.

For the remaining patches I would still use the approach of the common backup procedure as we discussed (with inverting the order for the VM BOs). Doubling all the page table updates clearly doesn't sound like a good idea to me.

Regards,
Christian.

Am 02.08.2016 um 09:48 schrieb Chunming Zhou:
Since we cannot make sure VRAM is safe after gpu reset, page table backup
is neccessary, shadow page table is sense way to recovery page talbe when
gpu reset happens.
We need to allocate GTT bo as the shadow of VRAM bo when creating page table,
and make them same. After gpu reset, we will need to use SDMA to copy GTT bo
content to VRAM bo, then page table will be recoveried.

Chunming Zhou (13):
   drm/amdgpu: irq resume should be immediately after gpu resume
   drm/amdgpu: add shadow bo support
   drm/amdgpu: set shadow flag for pd/pt bo
   drm/amdgpu: update shadow pt bo while update pt
   drm/amdgpu: update pd shadow while updating pd
   drm/amdgpu: implement amdgpu_vm_recover_page_table_from_shadow
   drm/amdgpu: link all vm clients
   drm/amdgpu: add vm_list_lock
   drm/amd: add block entity function
   drm/amdgpu: recover page tables after gpu reset
   drm/amd: wait neccessary dependency before running job
   drm/amdgpu: add vm recover pt fence
   drm/amdgpu: add backup condition for shadow page table

  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  15 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   6 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  33 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   5 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  36 ++++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 219 ++++++++++++++++++++++----
  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  30 +++-
  drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |   3 +
  include/uapi/drm/amdgpu_drm.h                 |   2 +
  9 files changed, 316 insertions(+), 33 deletions(-)


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to