On 04/07/2025 14:59, Philipp Stanner wrote:
On Fri, 2025-07-04 at 14:30 +0100, Tvrtko Ursulin wrote:

On 04/07/2025 13:56, Philipp Stanner wrote:
On Fri, 2025-07-04 at 09:29 -0300, Maíra Canal wrote:
Hi Tvrtko,

On 23/06/25 09:27, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock a first
time to see if there are any jobs, free a single job, and then lock
again to decide whether to re-queue itself if there are more finished
jobs.

Since drm_sched_get_finished_job() already looks at the second job in
the queue, we can simply add the signaled check and have it return the
presence of more jobs to free to the caller. That way the work item
does not have to lock the list again and repeat the signaled check.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
Cc: Christian König <christian.koe...@amd.com>
Cc: Danilo Krummrich <d...@kernel.org>
Cc: Matthew Brost <matthew.br...@intel.com>
Cc: Philipp Stanner <pha...@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_main.c | 39 +++++++++++---------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 1f077782ec12..c6c26aec07b6 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -366,22 +366,6 @@ static void __drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
 		queue_work(sched->submit_wq, &sched->work_free_job);
 }
 
-/**
- * drm_sched_run_free_queue - enqueue free-job work if ready
- * @sched: scheduler instance
- */
-static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
-{
-	struct drm_sched_job *job;
-
-	spin_lock(&sched->job_list_lock);
-	job = list_first_entry_or_null(&sched->pending_list,
-				       struct drm_sched_job, list);
-	if (job && dma_fence_is_signaled(&job->s_fence->finished))
-		__drm_sched_run_free_queue(sched);
-	spin_unlock(&sched->job_list_lock);
-}
-
 /**
  * drm_sched_job_done - complete a job
  * @s_job: pointer to the job which is done
@@ -1102,12 +1086,13 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
  * drm_sched_get_finished_job - fetch the next finished job to be destroyed
  *
  * @sched: scheduler instance
+ * @have_more: are there more finished jobs on the list
  *
  * Returns the next finished job from the pending list (if there is one)
  * ready for it to be destroyed.
  */
 static struct drm_sched_job *
-drm_sched_get_finished_job(struct drm_gpu_scheduler *sched)
+drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
 {
 	struct drm_sched_job *job, *next;
@@ -1115,22 +1100,27 @@ drm_sched_get_finished_job(struct drm_gpu_scheduler *sched)
 	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
-
 	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
 		/* remove job from pending_list */
 		list_del_init(&job->list);
 		/* cancel this job's TO timer */
 		cancel_delayed_work(&sched->work_tdr);
-		/* make the scheduled timestamp more accurate */
+
+		*have_more = false;
 		next = list_first_entry_or_null(&sched->pending_list,
 						typeof(*next), list);
-
 		if (next) {
+			/* make the scheduled timestamp more accurate */
 			if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
 				     &next->s_fence->scheduled.flags))
 				next->s_fence->scheduled.timestamp =
 					dma_fence_timestamp(&job->s_fence->finished);
+
+			if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+				     &next->s_fence->finished.flags))

Shouldn't we use dma_fence_is_signaled() to keep the same check that we
have in drm_sched_run_free_queue()?

There is a paused, ongoing discussion about this function:

https://lore.kernel.org/all/20250522112540.161411-2-pha...@kernel.org/


dma_fence_is_signaled() can have side effects by actually signaling,
instead of just checking.

Not sure if Tvrtko wanted to bypass that behavior here, though.
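
For reference, dma_fence_is_signaled() in include/linux/dma-fence.h is
roughly the following (a paraphrased sketch, not a verbatim copy; the
exact body may differ by kernel version):

static inline bool dma_fence_is_signaled(struct dma_fence *fence)
{
	/* Fast path: the signaled bit has already been set. */
	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
		return true;

	/*
	 * The side effect: if the fence implements ->signaled() and it
	 * reports true, the fence gets signaled right here, from
	 * inside what looks like a pure query.
	 */
	if (fence->ops->signaled && fence->ops->signaled(fence)) {
		dma_fence_signal(fence);
		return true;
	}

	return false;
}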

No, no ulterior motives here. :)

It has been ages since I wrote this, but I have now revisited it and,
AFAICT, it does not matter in this case.

It is a scheduler fence, which does not implement fence->ops->signaled(),
so opportunistic signaling does not come into the picture.
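
For illustration, the finished fence's ops table in
drivers/gpu/drm/scheduler/sched_fence.c looks roughly like this
(paraphrased sketch; the point is that no .signaled callback is set):

static const struct dma_fence_ops drm_sched_fence_ops_finished = {
	.get_driver_name = drm_sched_fence_get_driver_name,
	.get_timeline_name = drm_sched_fence_get_timeline_name,
	/*
	 * No .signaled callback, so for this fence
	 * dma_fence_is_signaled() reduces to the
	 * DMA_FENCE_FLAG_SIGNALED_BIT test and cannot signal anything
	 * as a side effect.
	 */
	.release = drm_sched_fence_release_finished,
};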

I am happy to change it to dma_fence_is_signaled() if that is the
preference.

It's our (scheduler's) fence, so we can be sure dma_fence_is_signaled()
is OK.

Okay, I changed it to dma_fence_is_signaled() locally.

Regards,

Tvrtko

I'd still prefer it if we could get Christian to accept a function with
a superior name, though.

P.



+				*have_more = true;
+
 			/* start TO timer for next job */
 			drm_sched_start_timeout(sched);
 		}
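
For completeness, the hunk that consumes the new have_more output is
not quoted above. A minimal sketch of what the free-job work item could
look like with this change (an assumption based on the surrounding
scheduler code, not the verbatim hunk from the patch):

static void drm_sched_free_job_work(struct work_struct *w)
{
	struct drm_gpu_scheduler *sched =
		container_of(w, struct drm_gpu_scheduler, work_free_job);
	struct drm_sched_job *job;
	bool have_more;

	/*
	 * A single pass under job_list_lock inside the helper pops one
	 * finished job (if any) and reports whether further finished
	 * jobs follow, so the work item no longer re-takes the lock
	 * just to decide about re-queuing itself.
	 */
	job = drm_sched_get_finished_job(sched, &have_more);
	if (job) {
		if (have_more)
			__drm_sched_run_free_queue(sched);
		sched->ops->free_job(job);
	}

	drm_sched_run_job_queue(sched);
}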





