On 07/05/2025 13:33, Maíra Canal wrote:
Hi Tvrtko,
Thanks for the review!
On 06/05/25 08:32, Tvrtko Ursulin wrote:
On 03/05/2025 21:59, Maíra Canal wrote:
When the DRM scheduler times out, it's possible that the GPU isn't hung;
instead, a job may still be running, and there may be no valid reason to
reset the hardware. This can occur in two situations:
1. The GPU exposes some mechanism that ensures the GPU is still making
   progress. By checking this mechanism, we can safely skip the reset,
   rearm the timeout, and allow the job to continue running until
   completion. This is the case for v3d and Etnaviv.
2. TDR has fired before the IRQ that signals the fence. Consequently,
   the job actually finishes, but it triggers a timeout before signaling
   the completion fence.
These two scenarios are problematic because we remove the job from the
`sched->pending_list` before calling `sched->ops->timedout_job()`. This
means that when the job finally signals completion (e.g. in the IRQ
handler), the scheduler won't call `sched->ops->free_job()`. As a
result, the job and its resources won't be freed, leading to a memory
leak.
To resolve this issue, we create a new `drm_gpu_sched_stat` that allows
a driver to skip the reset. This new status will indicate that the job
should be reinserted into the pending list, and the driver will still
signal its completion.
[...]
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 1a7e377d4cbb4fc12ed93c548b236970217945e8..fe9043b6d43141bee831b5fc16b927202a507d51 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -389,11 +389,13 @@ struct drm_sched_job {
* @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use.
* @DRM_GPU_SCHED_STAT_NOMINAL: Operation succeeded.
* @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available anymore.
+ * @DRM_GPU_SCHED_STAT_RUNNING: GPU is still running, so skip the reset.
s/GPU/job/ ?
*/
enum drm_gpu_sched_stat {
DRM_GPU_SCHED_STAT_NONE,
DRM_GPU_SCHED_STAT_NOMINAL,
DRM_GPU_SCHED_STAT_ENODEV,
+ DRM_GPU_SCHED_STAT_RUNNING,
I am wondering if we could make it more obvious what is the difference
between "nominal" and "running" and from whose point of view should
those statuses be considered.
So far we have "nominal", which means the scheduler/hardware is working
fine but the job may or may not have been cancelled. With "running" we
kind of split it into two sub-statuses, and it would be nice for that to
be intuitively visible from the naming. But I struggle to suggest an
elegant name while preserving "nominal" as is.
I was thinking: how about changing DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET (the hardware is fine, but we reset it)?
Then, when we skip the reset, we would have DRM_GPU_SCHED_STAT_NOMINAL
(which means the hardware is fine and we didn't reset it).
I'm open to other suggestions.
DRM_GPU_SCHED_STAT_RESET sounds like a good name and seems to paint a
consistent story between running - reset - enodev.
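The proposed renaming would roughly look like this (a sketch of the suggestion above, not an applied patch):

```c
/* Proposed naming: each status says what actually happened on timeout. */
enum drm_gpu_sched_stat {
	DRM_GPU_SCHED_STAT_NONE,	/* reserved, do not use */
	DRM_GPU_SCHED_STAT_RESET,	/* was NOMINAL: the hardware was reset */
	DRM_GPU_SCHED_STAT_ENODEV,	/* device is no longer available */
	DRM_GPU_SCHED_STAT_RUNNING,	/* job still running: reset skipped */
};
```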
Thinking out loud here - perhaps that points towards an alternative:
instead of a new status, a new helper to re-insert the single job (like
drm_sched_resubmit_job(sched, job)) might fit better? Although it would
be more churn.
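Such a helper would presumably be little more than a locked list insertion; a purely illustrative sketch (no such function exists in the tree, and the field names are taken from the current scheduler structs):

```c
/* Hypothetical helper, per the suggestion above: put a timed-out job
 * back on the pending list so free_job() still runs once its fence
 * signals. Illustrative only.
 */
void drm_sched_resubmit_job(struct drm_gpu_scheduler *sched,
			    struct drm_sched_job *job)
{
	spin_lock(&sched->job_list_lock);
	list_add(&job->list, &sched->pending_list);
	spin_unlock(&sched->job_list_lock);
}
```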
Although your solution might be more elegant, I'm worried that such a
function could be used improperly by new users (e.g. being called in
contexts other than `timedout_job()`).
We could call it drm_sched_untimedout_job(). </humour>
I'd prefer to have a new status, as it'll be used solely by
`timedout_job()` (making it harder for users to use it inappropriately).
With the addition of Matthew's feedback (calling
`drm_sched_run_free_queue()` after adding the job to the pending list),
I think it makes even more sense to keep it inside the timeout function.
I hope others can chime in and give their opinions about your idea.
Yeah - Philipp - Danilo - what do you prefer? A third enum value or a
new helper?
Regards,
Tvrtko