On May 10, 2017 at 11:20, zhoucm1 wrote:


On May 10, 2017 at 17:21, Christian König wrote:
On May 10, 2017 at 11:00, zhoucm1 wrote:


On May 10, 2017 at 16:50, Christian König wrote:
On May 10, 2017 at 10:38, zhoucm1 wrote:


On May 10, 2017 at 16:26, Christian König wrote:
On May 10, 2017 at 09:31, Chunming Zhou wrote:
This is an improvement over the previous patch: sched_sync stores the fences whose dependency could be skipped as already scheduled. When the job is executed, we don't need the pipeline sync if all fences in sched_sync are signalled; otherwise we still insert the
pipeline sync.

Change-Id: I26d3a2794272ba94b25753d4bf367326d12f6939
Signed-off-by: Chunming Zhou <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h     | 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 7 ++++++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 ++++-
  3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 787acd7..ef018bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1162,6 +1162,7 @@ struct amdgpu_job {
      struct amdgpu_vm    *vm;
      struct amdgpu_ring    *ring;
      struct amdgpu_sync    sync;
+    struct amdgpu_sync    sched_sync;
      struct amdgpu_ib    *ibs;
      struct fence        *fence; /* the hw fence */
      uint32_t        preamble_status;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 2c6624d..86ad507 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -121,6 +121,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
  {
      struct amdgpu_device *adev = ring->adev;
      struct amdgpu_ib *ib = &ibs[0];
+    struct fence *tmp;
      bool skip_preamble, need_ctx_switch;
      unsigned patch_offset = ~0;
      struct amdgpu_vm *vm;
@@ -167,8 +168,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
          return r;
      }
-    if (ring->funcs->emit_pipeline_sync && job && job->need_pipeline_sync)
+    if (ring->funcs->emit_pipeline_sync && job &&
+        (tmp = amdgpu_sync_get_fence(&job->sched_sync))) {
+        job->need_pipeline_sync = true;
          amdgpu_ring_emit_pipeline_sync(ring);
+        fence_put(tmp);
+    }
      if (vm) {
          amdgpu_ring_insert_nop(ring, extra_nop); /* prevent CE go too fast than DE */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index cfa97ab..fa0c8b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -60,6 +60,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
      (*job)->need_pipeline_sync = false;
        amdgpu_sync_create(&(*job)->sync);
+    amdgpu_sync_create(&(*job)->sched_sync);
        return 0;
  }
@@ -98,6 +99,7 @@ static void amdgpu_job_free_cb(struct amd_sched_job *s_job)
        fence_put(job->fence);
      amdgpu_sync_free(&job->sync);
+    amdgpu_sync_free(&job->sched_sync);
      kfree(job);
  }
  @@ -107,6 +109,7 @@ void amdgpu_job_free(struct amdgpu_job *job)
        fence_put(job->fence);
      amdgpu_sync_free(&job->sync);
+    amdgpu_sync_free(&job->sched_sync);
      kfree(job);
  }
@@ -154,7 +157,7 @@ static struct fence *amdgpu_job_dependency(struct amd_sched_job *sched_job)
      }
      if (amd_sched_dependency_optimized(fence, sched_job->s_entity))
-        job->need_pipeline_sync = true;
+        amdgpu_sync_fence(job->adev, &job->sched_sync, fence);

This can result in an -ENOMEM.
I will handle it.
And in addition to that, we only need to remember the last fence optimized like this, not all of them.

So just keep the last one found here in job->sched_fence instead.
I guess this isn't enough.
The dependencies are not processed in order when this is called, so the last one checked is not always the last scheduled fence. And they could be scheduler fences rather than hw fences; although they are handled by the same hw ring, their scheduler fence contexts are not the same.
So we still need sched_sync here, right?

No, amdgpu_job_dependency is only called again when the returned fence is signaled (or scheduled on the same ring).
Let me give an example:
Assume job->sync has two fences (fenceA and fenceB) that could be optimized as scheduled. fenceA is from entity1 and fenceB is from entity2, both for the gfx engine, but fenceA could be submitted to the hw ring after fenceB.
The order in the job->sync list is: others ----> fenceA ----> fenceB ----> others.
When amdgpu_job_dependency is called, fenceA will be checked first, and then fenceB.

Following your proposal, we would only store fenceB, but fenceA is the later one on the ring, which isn't what we expect.

Ah! Indeed, I didn't realize that the dependent fence could have already been scheduled.

Mhm, how are we going to handle the out-of-memory situation then? Since we are inside a kernel thread, we are not supposed to fail at this point.
Like the failed grab-vmid case, just add a DRM_ERROR. Is that OK?

Not ideal, but should at least work for the moment.

Christian.


Regards,
David Zhou

Regards,
Christian.



Regards,
David Zhou

So when this is called and you find that you need to wait for another fence, the order is guaranteed.

Regards,
Christian.


Regards,
David zhou

Regards,
Christian.

        return fence;
  }
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx