On Thu, May 15, 2025 at 2:28 AM Philipp Stanner <pha...@mailbox.org> wrote:
>
> Hello,
>
> On Wed, 2025-05-14 at 09:59 -0700, Rob Clark wrote:
> > From: Rob Clark <robdcl...@chromium.org>
> >
> > Similar to the existing credit limit mechanism, but applying to jobs
> > enqueued to the scheduler but not yet run.
> >
> > The use case is to put an upper bound on preallocated, and potentially
> > unneeded, pgtable pages. When this limit is exceeded, pushing new jobs
> > will block until the count drops below the limit.
>
> The commit message doesn't make clear why that's needed within the
> scheduler.
>
> From what I understand from the cover letter, this is a (rare?) Vulkan
> feature. And as important as Vulkan is, it's the drivers that implement
> support for it. I don't see why the scheduler is a blocker.
Maybe not rare, or at least it comes up with a group of deqp-vk tests ;-)

Basically it is a way to throttle userspace to prevent it from OoM'ing
itself.  (I suppose userspace could throttle itself, but it doesn't
really know how much pre-allocation will need to be done for pgtable
updates.)

> All the knowledge about when to stop pushing into the entity is in the
> driver, and the scheduler obtains all the knowledge about that from the
> driver anyways.
>
> So you could do
>
> if (my_vulkan_condition())
> 	drm_sched_entity_push_job();
>
> couldn't you?

It would need to reach in and use the sched's job_scheduled
wait_queue_head_t... if that isn't too ugly, maybe the rest could be
implemented on top of sched.  But it seemed like a reasonable thing for
the scheduler to support directly.

> >
> > Signed-off-by: Rob Clark <robdcl...@chromium.org>
> > ---
> >  drivers/gpu/drm/scheduler/sched_entity.c | 16 ++++++++++++++--
> >  drivers/gpu/drm/scheduler/sched_main.c   |  3 +++
> >  include/drm/gpu_scheduler.h              | 13 ++++++++++++-
> >  3 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index dc0e60d2c14b..c5f688362a34 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -580,11 +580,21 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >   * under common lock for the struct drm_sched_entity that was set up for
> >   * @sched_job in drm_sched_job_init().
> >   */
> > -void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> > +int drm_sched_entity_push_job(struct drm_sched_job *sched_job)
>
> Return code would need to be documented in the docstring, too. If we'd
> go for that solution.
>
> >  {
> >  	struct drm_sched_entity *entity = sched_job->entity;
> > +	struct drm_gpu_scheduler *sched = sched_job->sched;
> >  	bool first;
> >  	ktime_t submit_ts;
> > +	int ret;
> > +
> > +	ret = wait_event_interruptible(
> > +			sched->job_scheduled,
> > +			atomic_read(&sched->enqueue_credit_count) <=
> > +				sched->enqueue_credit_limit);
>
> This very significantly changes the function's semantics. This function
> is used in a great many drivers, and here it would be transformed into
> a function that can block.
>
> From what I see below those credits are to be optional. But even if, it
> needs to be clearly documented when a function can block.

Sure.  The behavior changes only for drivers that use the
enqueue_credit_limit, so other drivers should be unaffected.  I can
improve the docs.  (Maybe push_credit or something else would be a
better name than enqueue_credit?)
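To expand on the "reach in and use job_scheduled" option above, a
minimal, untested sketch of the driver-side variant would be something
like the below.  The my_*() names, struct my_gpu, MY_PREALLOC_LIMIT and
the prealloc_pages counter are all made up for illustration; only
sched->job_scheduled and wait_event_interruptible() come from the
scheduler:

    /* Hypothetical driver-side throttle, called right before
     * drm_sched_entity_push_job():
     */
    static int my_throttle_push(struct my_gpu *gpu, struct drm_sched_job *job)
    {
            struct drm_gpu_scheduler *sched = job->sched;
            u32 pages = my_job_prealloc_pages(job);   /* made-up helper */
            int ret;

            /* Block until enough earlier jobs have reached run_job(): */
            ret = wait_event_interruptible(sched->job_scheduled,
                            atomic_read(&gpu->prealloc_pages) <=
                            MY_PREALLOC_LIMIT);
            if (ret)
                    return ret;

            atomic_add(pages, &gpu->prealloc_pages);

            /* the driver's run_job() would do the matching atomic_sub() */
            return 0;
    }

So it is doable, but every driver wanting this would duplicate that
dance and poke at scheduler internals, which is why it seemed nicer to
have the scheduler support it directly.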
>
> > +	if (ret)
> > +		return ret;
> > +	atomic_add(sched_job->enqueue_credits, &sched->enqueue_credit_count);
> >
> >  	trace_drm_sched_job(sched_job, entity);
> >  	atomic_inc(entity->rq->sched->score);
> > @@ -609,7 +619,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> >  		spin_unlock(&entity->lock);
> >
> >  		DRM_ERROR("Trying to push to a killed entity\n");
> > -		return;
> > +		return -EINVAL;
> >  	}
> >
> >  	rq = entity->rq;
> > @@ -626,5 +636,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> >
> >  		drm_sched_wakeup(sched);
> >  	}
> > +
> > +	return 0;
> >  }
> >  EXPORT_SYMBOL(drm_sched_entity_push_job);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 9412bffa8c74..1102cca69cb4 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1217,6 +1217,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
> >
> >  	trace_drm_run_job(sched_job, entity);
> >  	fence = sched->ops->run_job(sched_job);
> > +	atomic_sub(sched_job->enqueue_credits, &sched->enqueue_credit_count);
> >  	complete_all(&entity->entity_idle);
> >  	drm_sched_fence_scheduled(s_fence, fence);
> >
> > @@ -1253,6 +1254,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> >
> >  	sched->ops = args->ops;
> >  	sched->credit_limit = args->credit_limit;
> > +	sched->enqueue_credit_limit = args->enqueue_credit_limit;
> >  	sched->name = args->name;
> >  	sched->timeout = args->timeout;
> >  	sched->hang_limit = args->hang_limit;
> > @@ -1308,6 +1310,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> >  	INIT_LIST_HEAD(&sched->pending_list);
> >  	spin_lock_init(&sched->job_list_lock);
> >  	atomic_set(&sched->credit_count, 0);
> > +	atomic_set(&sched->enqueue_credit_count, 0);
> >  	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> >  	INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
> >  	INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index da64232c989d..d830ffe083f1 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -329,6 +329,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
> >   * @s_fence: contains the fences for the scheduling of job.
> >   * @finish_cb: the callback for the finished fence.
> >   * @credits: the number of credits this job contributes to the scheduler
> > + * @enqueue_credits: the number of enqueue credits this job contributes
> >   * @work: Helper to reschedule job kill to different context.
> >   * @id: a unique id assigned to each job scheduled on the scheduler.
> >   * @karma: increment on every hang caused by this job. If this exceeds the hang
> > @@ -366,6 +367,7 @@ struct drm_sched_job {
> >
> >  	enum drm_sched_priority		s_priority;
> >  	u32				credits;
> > +	u32				enqueue_credits;
>
> What's the policy of setting this?
>
> drm_sched_job_init() and drm_sched_job_arm() are responsible for
> initializing jobs.

It should be set before drm_sched_entity_push_job().  I wouldn't really
expect drivers to know the value at drm_sched_job_init() time.  But they
would by the time drm_sched_entity_push_job() is called.
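Ie. the flow I'd expect on the driver side is roughly the following
hand-wavy sketch.  The submit/queue structures and the my_*() helper are
made up; only the drm_sched_* calls and the enqueue_credits field are
from this series:

    ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue);
    if (ret)
            return ret;

    /* Worst-case pgtable pre-allocation happens after job_init(), so
     * only now do we know how many pages this job pins down:
     */
    npages = my_vm_bind_prealloc_pgtables(submit);   /* made-up helper */

    /* Set just before pushing, not at job_init() time: */
    submit->base.enqueue_credits = npages;

    drm_sched_job_arm(&submit->base);
    ret = drm_sched_entity_push_job(&submit->base);  /* may block or fail
                                                      * if interrupted */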
> >  	/** @last_dependency: tracks @dependencies as they signal */
> >  	unsigned int			last_dependency;
> >  	atomic_t			karma;
> > @@ -485,6 +487,10 @@ struct drm_sched_backend_ops {
> >   * @ops: backend operations provided by the driver.
> >   * @credit_limit: the credit limit of this scheduler
> >   * @credit_count: the current credit count of this scheduler
> > + * @enqueue_credit_limit: the credit limit of jobs pushed to scheduler and not
> > + *                        yet run
> > + * @enqueue_credit_count: the current credit count of jobs pushed to scheduler
> > + *                        but not yet run
> >   * @timeout: the time after which a job is removed from the scheduler.
> >   * @name: name of the ring for which this scheduler is being used.
> >   * @num_rqs: Number of run-queues. This is at most DRM_SCHED_PRIORITY_COUNT,
> > @@ -518,6 +524,8 @@ struct drm_gpu_scheduler {
> >  	const struct drm_sched_backend_ops	*ops;
> >  	u32				credit_limit;
> >  	atomic_t			credit_count;
> > +	u32				enqueue_credit_limit;
> > +	atomic_t			enqueue_credit_count;
> >  	long				timeout;
> >  	const char			*name;
> >  	u32				num_rqs;
> > @@ -550,6 +558,8 @@ struct drm_gpu_scheduler {
> >   * @num_rqs: Number of run-queues. This may be at most DRM_SCHED_PRIORITY_COUNT,
> >   *	       as there's usually one run-queue per priority, but may be less.
> >   * @credit_limit: the number of credits this scheduler can hold from all jobs
> > + * @enqueue_credit_limit: the number of credits that can be enqueued before
> > + *                        drm_sched_entity_push_job() blocks
>
> Is it optional or not? Can it be deactivated?
>
> It seems to me that it is optional, and so far only used in msm. If
> there are no other parties in need for that mechanism, the right place
> to have this feature probably is msm, which has all the knowledge about
> when to block already.
>

As with the existing credit_limit, it is optional.  Although I think it
would also be useful for other drivers that use drm sched for VM_BIND
queues, for the same reason.  (A rough sketch of what the opt-in could
look like is appended at the bottom of this mail.)

BR,
-R

>
> Regards
> P.
>
> >   * @hang_limit: number of times to allow a job to hang before dropping it.
> >   *		   This mechanism is DEPRECATED. Set it to 0.
> >   * @timeout: timeout value in jiffies for submitted jobs.
> > @@ -564,6 +574,7 @@ struct drm_sched_init_args {
> >  	struct workqueue_struct *timeout_wq;
> >  	u32 num_rqs;
> >  	u32 credit_limit;
> > +	u32 enqueue_credit_limit;
> >  	unsigned int hang_limit;
> >  	long timeout;
> >  	atomic_t *score;
> > @@ -600,7 +611,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >  			struct drm_sched_entity *entity,
> >  			u32 credits, void *owner);
> >  void drm_sched_job_arm(struct drm_sched_job *job);
> > -void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> > +int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> >  int drm_sched_job_add_dependency(struct drm_sched_job *job,
> >  				 struct dma_fence *fence);
> >  int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
>
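(As mentioned above, here is a rough sketch of the init-side opt-in for
a VM_BIND queue.  The my_* names, the queue structure and the limit
value are made up; a driver that leaves enqueue_credit_limit at zero
keeps the current non-blocking push_job behavior:)

    struct drm_sched_init_args args = {
            .ops = &my_vm_bind_sched_ops,        /* made-up ops */
            .num_rqs = 1,
            .credit_limit = U32_MAX,
            .enqueue_credit_limit = 1024,        /* max pre-allocated pgtable
                                                  * pages in flight */
            .timeout = MAX_SCHEDULE_TIMEOUT,
            .name = "vm_bind",
            .dev = my_drm->dev,                  /* made-up device pointer */
    };

    ret = drm_sched_init(&queue->sched, &args);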