On Thu, 2025-08-07 at 16:15 +0200, Christian König wrote:
> On 05.08.25 12:22, Philipp Stanner wrote:
> > On Tue, 2025-08-05 at 11:05 +0200, Christian König wrote:
> > > On 24.07.25 17:07, Philipp Stanner wrote:
> > > > > +/**
> > > > > + * DOC: Scheduler Fence Object
> > > > > + *
> > > > > + * The scheduler fence object (&struct drm_sched_fence)
> > > > > + * encapsulates the whole time from pushing the job into the
> > > > > + * scheduler until the hardware has finished processing it. It is
> > > > > + * managed by the scheduler. The implementation provides dma_fence
> > > > > + * interfaces for signaling both scheduling of a command submission
> > > > > + * as well as finishing of processing.
> > > > > + *
> > > > > + * The lifetime of this object also follows normal dma_fence
> > > > > + * refcounting rules.
> > > > > + */
> > > > 
> > > > The relic I'm most unsure about is this docu for the scheduler fence.
> > > > I know that some drivers are accessing the s_fence, but I strongly
> > > > suspect that this is a) unnecessary and b) dangerous.
> > > 
> > > Which s_fence member do you mean? The one in the job? That should be
> > > harmless as far as I can see.
> > 
> > I'm talking about struct drm_sched_fence.
> 
> Yeah, that is necessary for the drivers to know about. We could
> potentially abstract it better, but we can't really hide it completely.
> 
> > > > But the original draft from Christian hinted at that. So, @Christian,
> > > > this would be an opportunity to discuss this matter.
> > > > 
> > > > Otherwise I'd drop this docu section in v2. What users don't know,
> > > > they cannot misuse.
> > > 
> > > I would rather like to keep that to avoid misusing the job as the
> > > object for tracking the submission lifetime.
> > 
> > Why would a driver ever want to access struct drm_sched_fence? The
> > driver knows when it signaled the hardware fence, and it knows when its
> > callbacks run_job() and free_job() were invoked.
> > 
> > I'm open to learn what amdgpu does there and why.
> 
> The simplest use case is performance optimization. You sometimes have
> submissions which ideally run with others at the same time.
> 
> So we have AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES, which basically tries
> to cast a fence to a scheduler fence and then only waits for the
> dependency to be pushed to the HW instead of waiting for it to finish
> (see amdgpu_cs.c).
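
If I understand that correctly, the pattern boils down to something like
the sketch below. (Rough sketch only, not the literal amdgpu_cs.c code;
the helper name is made up. to_drm_sched_fence() is the cast helper the
scheduler exports.)

/* Sketch: prefer the "scheduled" part of a scheduler fence. */
static struct dma_fence *pick_scheduled_fence(struct dma_fence *fence)
{
	struct drm_sched_fence *s_fence = to_drm_sched_fence(fence);

	/* Not a scheduler fence: have to wait for the whole thing. */
	if (!s_fence)
		return dma_fence_get(fence);

	/*
	 * Scheduler fence: only wait until the dependency has been
	 * pushed to the HW, not until it has finished.
	 */
	return dma_fence_get(&s_fence->scheduled);
}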
But the driver recognizes that a certain fence got / gets pushed right
now through backend_ops.run_job(), doesn't it?

> Another example are gang submissions (where I still have the TODO to
> actually fix the code to not crash in an OOM situation).
> 
> Here we have a gang leader and gang members which are guaranteed to run
> together on the HW at the same time.
> 
> This works by adding scheduled dependencies to the gang leader so that
> the scheduler pushes it to the HW only after all gang members have been
> pushed.
> 
> The first gang member pushed now triggers a dependency handling which
> makes sure that no other gang can be pushed until the gang leader has
> been pushed as well.

You mean amdgpu registers callbacks to drm_sched_fence?

> > > > > +/**
> > > > > + * DOC: Error and Timeout handling
> > > > > + *
> > > > > + * Errors are signaled by using dma_fence_set_error() on the
> > > > > + * hardware fence object before signaling it with
> > > > > + * dma_fence_signal(). Errors are then bubbled up from the
> > > > > + * hardware fence to the scheduler fence.
> > > > > + *
> > > > > + * The entity allows querying errors on the last run submission
> > > > > + * using the drm_sched_entity_error() function, which can be used
> > > > > + * to cancel queued submissions in
> > > > > + * &struct drm_sched_backend_ops.run_job as well as to prevent
> > > > > + * pushing further ones into the entity in the driver's
> > > > > + * submission function.
> > > > > + *
> > > > > + * When the hardware fence doesn't signal within a configurable
> > > > > + * amount of time, &struct drm_sched_backend_ops.timedout_job
> > > > > + * gets invoked. The driver should then follow the procedure
> > > > > + * described in that callback's documentation.
> > > > > + *
> > > > > + * (TODO: The timeout handler should probably switch to using the
> > > > > + * hardware fence as parameter instead of the job. Otherwise the
> > > > > + * handling will always race between timing out and signaling the
> > > > > + * fence).
> > > > 
> > > > This TODO can probably be removed, too. The recently merged
> > > > DRM_GPU_SCHED_STAT_NO_HANG has solved this issue.
> > > 
> > > No, it only scratched the surface of the problems here.
> > > 
> > > I'm seriously considering sending an RFC patch to clean up the job
> > > lifetime and implement this change.
> > > 
> > > Not necessarily giving the HW fence as parameter to the timeout
> > > callback, but more generally not letting the scheduler depend on
> > > driver behavior.
> > 
> > That's rather vague. Regarding this TODO, "racing between timing out
> > and signaling the fence" can now be corrected by the driver. Are there
> > more issues? If so, we want to add a new FIXME for them.
> 
> Yeah, good point. We basically worked around all those issues now.
> 
> It's just that I still see that we are missing a general concept. E.g.
> we applied workaround on top of workaround until it didn't crash any
> more, instead of asking: OK, that is the design, does it work? Is it
> valid? etc...

Yes, that seems to have been our destiny for a while now :) :(

What I'm afraid of right now is that with the callbacks vs.
drm_sched_fence we now potentially have several distinct mechanisms for
doing things. The hardware fence is clearly the relevant synchronization
object for telling when a job is completed; yet, we also have
s_fence->finished. Using it (for what?) is even encouraged by the docu:

/**
 * @finished: this fence is what will be signaled by the scheduler
 * when the job is completed.
 *
 * When setting up an out fence for the job, you should use
 * this, since it's available immediately upon
 * drm_sched_job_init(), and the fence returned by the driver
 * from run_job() won't be created until the dependencies have
 * resolved.
 */
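
For reference, what that comment suggests is roughly the following
pattern (just a sketch of mine, assuming a syncobj-based out fence; the
driver_job struct embedding drm_sched_job as ->base and the out_syncobj
parameter are made-up names):

/* Sketch: hand the scheduler fence out to userspace as the out fence. */
static void driver_attach_out_fence(struct driver_job *job,
				    struct drm_syncobj *out_syncobj)
{
	drm_sched_job_arm(&job->base);

	/*
	 * s_fence->finished already exists at this point, while the HW
	 * fence returned by run_job() does not exist yet.
	 */
	drm_syncobj_replace_fence(out_syncobj, &job->base.s_fence->finished);

	drm_sched_entity_push_job(&job->base);
}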

Anyways. I think this is a big topic very suitable for our workshop at
XDC. I also have some ideas about paths forward that I want to present.

P.

> 
> > That said, such an RFC would obviously be great. We can discuss the
> > paragraph above there, if you want.
> 
> I will try to hack something together. Not necessarily complete, but it
> should show the direction.
> 
> Christian.
> 
> > 
> > 
> > Regards
> > P.