Re: [PATCH] drm/sched: Mandate usage of drm_sched_job_cleanup()

Tvrtko Ursulin Sat, 18 Oct 2025 15:39:20 -0700


On 14/10/2025 12:56, Philipp Stanner wrote:

On Mon, 2025-10-13 at 13:20 +0100, Tvrtko Ursulin wrote:


On 26/09/2025 13:36, Philipp Stanner wrote:

drm_sched_job_cleanup()'s documentation so far uses relatively soft
language, only "recommending" usage of the function. To avoid memory
leaks and, potentiall, other bugs, however, the function has to be used.

Demand usage of the function explicitly.

Signed-off-by: Philipp Stanner <[email protected]>
---
   drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 46119aacb809..0a9df9e61963 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1030,7 +1030,7 @@ EXPORT_SYMBOL(drm_sched_job_has_dependency);
    *
    * Cleans up the resources allocated with drm_sched_job_init().
    *
- * Drivers should call this from their error unwind code if @job is aborted
+ * Drivers need to call this from their error unwind code if @job is aborted


Should is indeed wrong. I think it could be better to go with "shall" or
"must" though. Since those two are more standard language for this.


"need to" is standard UK(?) English for "must to" I think.

But I'm fine with "must"

Yeah will is standard English and IMO understandable. I was justthinking whether it would best to stick to the established "requirementslanguage". For example:


https://www.nasa.gov/reference/appendix-c-how-to-write-a-good-requirement/

https://argondigital.com/blog/product-management/using-the-correct-terms-shall-will-should/

https://reqi.io/articles/shall-vs-should-vs-will-vs-must-whats-the-difference

But I am okay with need, your call.

    * before drm_sched_job_arm() is called.
    *
    * drm_sched_job_arm() is a point of no return since it initializes the 
fences
@@ -1038,7 +1038,7 @@ EXPORT_SYMBOL(drm_sched_job_has_dependency);
    * submit it with drm_sched_entity_push_job() and cannot simply abort it by
    * calling drm_sched_job_cleanup().
    *
- * This function should be called in the &drm_sched_backend_ops.free_job 
callback.
+ * This function must be called in the &drm_sched_backend_ops.free_job 
callback.
    */
   void drm_sched_job_cleanup(struct drm_sched_job *job)
   {


Hm, having read the thread so far the situation seems not so easy to
untangle.

On one hand I see Matt's point that free_job callback is a bit
misleadingly named and that there isn't anything really requiring
drm_sched_job_cleanup() to be called exactly from there. Free_job
semantics seem to be "job is done, *scheduler* is done with it".

Drm_sched_job_cleanup() already says it has to be called after the point
of no return (arm) so it could be left at drivers discretion (de facto
state today) to choose how and when to do it.

What I did not get is what is actually the perceived problem with
letting this stay? (Ie. "should" indicating it is recommended, but not a
must/shall.)


I think the most correct formulation with our current mess would be
"cleanup() must be called earliest in free_job(), or later (if there is
a very good reason for that)."

Question would be how to document that.

I still have that big life time docu patch pending. Let me dig into how
I could combine that..


Ok.

Regards,

Tvrtko

Re: [PATCH] drm/sched: Mandate usage of drm_sched_job_cleanup()

Reply via email to