sched: Convert drm scheduler to use a work queue rather than kthread

Christian König Wed, 16 Aug 2023 07:59:51 -0700

On 16.08.23 at 14:30, Danilo Krummrich wrote:

On 8/16/23 16:05, Christian König wrote:
On 16.08.23 at 13:30, Danilo Krummrich wrote:
Hi Matt,
On 8/11/23 04:31, Matthew Brost wrote:
In XE, the new Intel GPU driver, a choice has been made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In XE the submission order from multiple drm_sched_entity is not
guaranteed to be the same as the completion order, even if targeting the
same hardware engine. This is because in XE we have a firmware scheduler,
the GuC, which is allowed to reorder, timeslice, and preempt submissions.
If using a shared drm_gpu_scheduler across multiple drm_sched_entity, the
TDR falls apart as the TDR expects submission order == completion order.
Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
problem.

2. In XE submissions are done via programming a ring buffer (circular
buffer). A drm_gpu_scheduler provides a limit on the number of jobs; if
that limit is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control on
the ring for free.
In XE, where does the limitation of MAX_SIZE_PER_JOB come from?
In Nouveau we currently do have such a limitation as well, but it is derived from the RING_SIZE, hence RING_SIZE / MAX_SIZE_PER_JOB would always be 1. However, I think most jobs won't actually utilize the whole ring.
Well that should probably rather be RING_SIZE / MAX_SIZE_PER_JOB = hw_submission_limit (or even hw_submission_limit - 1 when the hw can't distinguish a full from an empty ring buffer).
Not sure if I get you right, let me try to clarify what I was trying to say: I wanted to say that in Nouveau MAX_SIZE_PER_JOB isn't really limited by anything other than the RING_SIZE and hence we'd never allow more than 1 active job.
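
The relationship between ring size, worst-case job size, and the job limit discussed above can be sketched in plain C. This is only an illustration: `ring_job_limit()` and its parameters are hypothetical names, not part of the DRM scheduler API, and the "minus one" case assumes the reading that one job's worth of space stays unused when the hardware cannot tell a full ring from an empty one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the DRM scheduler API.
 * Returns how many worst-case-sized jobs may be in flight at once.
 * Assumption: if the hardware cannot distinguish a full ring from an
 * empty one, one job's worth of space is left unused.
 */
static uint32_t ring_job_limit(uint32_t ring_size, uint32_t max_size_per_job,
                               bool hw_tells_full_from_empty)
{
    /* A job larger than the ring could never be submitted at all. */
    assert(max_size_per_job > 0 && max_size_per_job <= ring_size);

    uint32_t limit = ring_size / max_size_per_job;

    if (!hw_tells_full_from_empty && limit > 0)
        limit -= 1;

    return limit;
}
```

With a 4096-byte ring and a 256-byte worst-case job this gives a limit of 16 (or 15 with the reserved slot). When the worst-case job size equals the ring size, as described for Nouveau, the limit collapses to 1.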

But that lets the hw run dry between submissions. That is usually a pretty horrible idea for performance.

However, it seems to be more efficient to base ring flow control on the actual size of each incoming job rather than the worst case, namely the maximum size of a job.

That doesn't sound like a good idea to me. See, we don't limit the number of submitted jobs based on the ring size, but rather we calculate the ring size based on the number of submitted jobs.

In other words the hw_submission_limit defines the ring size, not the other way around. And you usually want the hw_submission_limit as low as possible for good scheduler granularity and to avoid extra overhead.


Christian.

Otherwise your scheduler might just overwrite the ring buffer by pushing things too fast.
Christian.
Given that, it seems like it would be better to let the scheduler keep track of empty ring "slots" instead, such that the scheduler can decide whether a subsequent job will still fit on the ring and if not re-evaluate once a previous job finished. Of course each submitted job would be required to carry the number of slots it requires on the ring.
What do you think of implementing this as an alternative flow control mechanism? Implementation wise this could be a union with the existing hw_submission_limit.
- Danilo
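
The slot-tracking idea proposed above can be sketched as follows. This is a minimal userspace illustration under stated assumptions: `struct ring_credits`, `struct job`, `ring_try_submit()` and `ring_job_done()` are hypothetical names invented here, not existing DRM scheduler structures or functions, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the DRM scheduler structures. */
struct ring_credits {
    uint32_t total_slots; /* ring capacity, in slots */
    uint32_t used_slots;  /* slots held by in-flight jobs */
};

struct job {
    uint32_t slots; /* number of ring slots this job requires */
};

/* Reserve slots for the job if it fits; otherwise leave state untouched. */
static bool ring_try_submit(struct ring_credits *rc, const struct job *job)
{
    if (job->slots > rc->total_slots - rc->used_slots)
        return false; /* does not fit now; re-evaluate after a completion */

    rc->used_slots += job->slots;
    return true;
}

/* Return the job's slots to the ring once the job has finished. */
static void ring_job_done(struct ring_credits *rc, const struct job *job)
{
    assert(rc->used_slots >= job->slots);
    rc->used_slots -= job->slots;
}
```

Compared with a fixed job count, admission here depends on the actual size of each job, so several small jobs can be in flight where only one worst-case job would fit.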
A problem with this design is that currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.
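
Why a worker scales better than one kthread per scheduler can be illustrated with a toy model. The sketch below is plain userspace C and not the kernel workqueue API: every name in it (`toy_workqueue`, `toy_queue_work()`, `toy_run_one()`, `toy_sched`) is hypothetical. Many schedulers share one queue, and each work invocation handles at most one job and re-queues itself if more remain, rather than looping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WQ_CAPACITY 16

struct work_item;
typedef void (*work_fn)(struct work_item *);

struct work_item {
    work_fn fn;
    bool pending; /* true while the item sits in the queue */
};

/* Toy FIFO shared by all schedulers; stands in for a work queue. */
struct toy_workqueue {
    struct work_item *items[WQ_CAPACITY];
    size_t head, tail; /* number of queued items is tail - head */
};

/* Queue an item unless it is already pending or the queue is full. */
static bool toy_queue_work(struct toy_workqueue *wq, struct work_item *w)
{
    if (w->pending || wq->tail - wq->head == WQ_CAPACITY)
        return false;
    w->pending = true;
    wq->items[wq->tail++ % WQ_CAPACITY] = w;
    return true;
}

/* Run the oldest queued item once; returns false if the queue is empty. */
static bool toy_run_one(struct toy_workqueue *wq)
{
    if (wq->head == wq->tail)
        return false;
    struct work_item *w = wq->items[wq->head++ % WQ_CAPACITY];
    w->pending = false;
    w->fn(w);
    return true;
}

struct toy_sched {
    struct work_item work; /* first member, so the cast below is valid */
    struct toy_workqueue *wq;
    int queued_jobs;
    int submitted_jobs;
};

/* One pass: submit at most one job, then re-queue if work remains. */
static void toy_sched_work(struct work_item *w)
{
    struct toy_sched *s = (struct toy_sched *)w;

    if (s->queued_jobs == 0)
        return;
    s->queued_jobs--;
    s->submitted_jobs++;
    if (s->queued_jobs > 0)
        toy_queue_work(s->wq, &s->work);
}
```

Because each pass returns after one job, two schedulers sharing the queue take turns instead of one monopolising the worker, and an idle scheduler costs only its embedded work item rather than a dedicated thread.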

v2:
   - (Rob Clark) Fix msm build
   - Pass in run work queue
v3:
   - (Boris) don't have loop in worker
v4:
   - (Tvrtko) break out submit ready, stop, start helpers into own patch
Signed-off-by: Matthew Brost <matthew.br...@intel.com>

Re: [PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread
