+Cc Sima, Dave
On Mon, 2025-09-29 at 16:07 +0200, Danilo Krummrich wrote:
> On Wed Sep 3, 2025 at 5:23 PM CEST, Tvrtko Ursulin wrote:
> > This is another respin of this old work^1 which since v7 is a total rewrite
> > and completely changes how the control is done.
>
> I only got some of the
On Sun, 2025-09-28 at 16:34 +0200, Christian König wrote:
> On 27.09.25 11:01, Philipp Stanner wrote:
> > On Fri, 2025-09-26 at 09:10 -0700, Boqun Feng wrote:
> > > On Thu, Sep 18, 2025 at 02:30:59PM +0200, Philipp Stanner wrote:
> > > > dma_fence is a synchronizatio
On Fri, 2025-09-26 at 09:10 -0700, Boqun Feng wrote:
> On Thu, Sep 18, 2025 at 02:30:59PM +0200, Philipp Stanner wrote:
> > dma_fence is a synchronization mechanism which is needed by virtually
> > all GPU drivers.
> >
> > A dma_fence offers many features, among wh
drm_sched_job_cleanup()'s documentation so far uses relatively soft
language, only "recommending" usage of the function. To avoid memory
leaks and, potentially, other bugs, however, the function has to be used.
Demand usage of the function explicitly.
Signed-off-by: Philipp Stanne
On Thu, 2025-09-25 at 12:52 +0100, Tvrtko Ursulin wrote:
>
> On 24/09/2025 10:11, Philipp Stanner wrote:
> > On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote:
> > > To implement fair scheduling we need a view into the GPU time consumed by
> > > entities. Pr
On Thu, 2025-09-18 at 15:52 +0200, Boqun Feng wrote:
> On Thu, Sep 18, 2025 at 02:30:59PM +0200, Philipp Stanner wrote:
> [...]
> > ---
> > So. ¡Hola!
> >
> > This is a highly WIP RFC. It's obviously at many places not yet
> > conforming very well to Ru
is to do things like general improvements by renaming
variables (see my comments in the previous patch) in a separate
cleanup-patch following this one.
A few comments below
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Bros
eted jobs as soon as possible so the metric is most up to date when
> viewed from the submission side of things.
>
> Signed-off-by: Tvrtko Ursulin
Looks like a good patch to me.
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Brost
> Cc: Philipp Stanner
On Fri, 2025-09-19 at 08:33 +0100, Tvrtko Ursulin wrote:
>
> On 19/09/2025 07:44, Philipp Stanner wrote:
> > A rework of the scheduler unit tests removed the done_list. That list is
> > still mentioned in the mock test header.
> >
> > Remove that relict.
> >
A rework of the scheduler unit tests removed the done_list. That list is
still mentioned in the mock test header.
Remove that relict.
Fixes: 4576de9b7977 ("drm/sched/tests: Implement cancel_job() callback")
Signed-off-by: Philipp Stanner
---
drivers/gpu/drm/scheduler/tests/sched_t
should_ be relatively trivial to implement, though.
Signed-off-by: Philipp Stanner
---
So. ¡Hola!
This is a highly WIP RFC. It's obviously at many places not yet
conforming very well to Rust's standards.
Nevertheless, it has progressed enough that I want to request comments
from the community.
ences more
> robust regarding context lifetime.
>
> On 18.09.25 14:30, Philipp Stanner wrote:
> > dma_fence is a synchronization mechanism which is needed by virtually
> > all GPU drivers.
> >
> > A dma_fence offers many features, among which the most important ones
>
On Tue, 2025-09-02 at 13:12 +0200, Philipp Stanner wrote:
> From: Philipp Stanner
>
> The various objects and their memory lifetime used by the GPU scheduler
> are currently not fully documented.
>
> Add documentation describing the scheduler's objects. Improve the
>
On Mon, 2025-09-15 at 21:23 +0800, Luc Ma wrote:
> The mentioned function has been renamed since commit 180fc134d712
> ("drm/scheduler: Rename cleanup functions v2."), so let it refer to
> the current one.
>
> v2: use proper pattern for function cross-reference
>
> Signed-off-by: Luc Ma
Applied
On Thu, 2025-09-11 at 15:55 +0100, Tvrtko Ursulin wrote:
>
> On 11/09/2025 15:20, Philipp Stanner wrote:
> > On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote:
> > > Move the code dealing with entities entering and exiting run queues to
> > > helpers to l
On Fri, 2025-09-12 at 21:44 +0800, Luc Ma wrote:
> The mentioned function has been renamed since commit 180fc134d712
> ("drm/scheduler: Rename cleanup functions v2."), so let it refer to
> the current one.
>
> Signed-off-by: Luc Ma
Thx for the patch.
> ---
> include/drm/gpu_scheduler.h | 2 +-
work. Or
could it be made generic for the current in-tree scheduler?
>
> Apart from that, the upcoming fair scheduling algorithm will rely on the
> tree only containing runnable entities.
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
>
patches or could it be branched out?
P.
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Brost
> Cc: Philipp Stanner
> ---
> drivers/gpu/drm/scheduler/sched_entity.c | 64 ++-
> drivers/gpu/drm/scheduler/s
On Thu, 2025-09-04 at 12:27 +0200, Christian König wrote:
> On 01.09.25 10:31, Philipp Stanner wrote:
> > This reverts:
> >
> > commit bead88002227 ("drm/nouveau: Remove waitque for sched teardown")
> > commit 5f46f5c7af8c ("drm/nouveau: Add new callback
On Thu, 2025-09-04 at 13:56 +0200, Christian König wrote:
> On 04.09.25 13:12, Philipp Stanner wrote:
> > On Thu, 2025-09-04 at 12:27 +0200, Christian König wrote:
> > > On 01.09.25 10:31, Philipp Stanner wrote:
> > > > This reverts:
> > > >
> > > > >
On Tue, 2025-08-12 at 16:34 +0200, Christian König wrote:
> From: Christian König
>
> We have the recurring problem that people try to invent a
> DMA-fence implementation which signals fences based on a userspace
> IOCTL.
>
> This is well known as a source of hard-to-track-down crashes and is
On Mon, 2025-09-01 at 15:14 +0200, Pierre-Eric Pelloux-Prayer wrote:
>
>
> Le 25/08/2025 à 15:13, Philipp Stanner a écrit :
> > On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
> > > Currently, the scheduler score is incremented when a job is pushe
On Tue, 2025-09-02 at 10:22 +0200, Philipp Stanner wrote:
> On Tue, 2025-09-02 at 10:18 +0200, Philipp Stanner wrote:
> > On Tue, 2025-09-02 at 08:59 +0100, Tvrtko Ursulin wrote:
> > >
> > > On 02/09/2025 08:27, Philipp Stanner wrote:
> > > > On Mon, 2025-09
From: Philipp Stanner
The various objects and their memory lifetime used by the GPU scheduler
are currently not fully documented.
Add documentation describing the scheduler's objects. Improve the
general documentation at a few other places.
Co-developed-by: Christian König
Signed-o
On Tue, 2025-09-02 at 10:18 +0200, Philipp Stanner wrote:
> On Tue, 2025-09-02 at 08:59 +0100, Tvrtko Ursulin wrote:
> >
> > On 02/09/2025 08:27, Philipp Stanner wrote:
> > > On Mon, 2025-09-01 at 14:40 +0200, Pierre-Eric Pelloux-Prayer wrote:
> > > > The drm
On Tue, 2025-09-02 at 08:59 +0100, Tvrtko Ursulin wrote:
>
> On 02/09/2025 08:27, Philipp Stanner wrote:
> > On Mon, 2025-09-01 at 14:40 +0200, Pierre-Eric Pelloux-Prayer wrote:
> > > The drm_sched_job_unschedulable trace point can access
> > > entity->depen
On Mon, 2025-09-01 at 14:40 +0200, Pierre-Eric Pelloux-Prayer wrote:
> The drm_sched_job_unschedulable trace point can access
> entity->dependency after it was cleared by the callback
> installed in drm_sched_entity_add_dependency_cb, causing:
>
> BUG: kernel NULL pointer dereference, address: 000
ove waitque for sched teardown")
Suggested-by: Danilo Krummrich
Signed-off-by: Philipp Stanner
---
Changes in v2:
- Don't revert commit 89b2675198ab ("drm/nouveau: Make fence container helper
usable driver-wide")
- Add Fixes-tag
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 15 --
ll patches related to the waitqueue removal.
Suggested-by: Danilo Krummrich
Signed-off-by: Philipp Stanner
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 20 +---
drivers/gpu/drm/nouveau/nouveau_fence.h | 6 --
drivers/gpu/drm/nouveau/nouveau_sched.c | 20 ++
On Wed, 2025-08-13 at 14:58 +0200, Danilo Krummrich wrote:
> On Wed Aug 13, 2025 at 10:56 AM CEST, Philipp Stanner wrote:
> > In drm_sched_fini() all entities are marked as stopped - without taking
> > the appropriate lock, because that would deadlock. That means that
> >
On Mon, 2025-08-25 at 12:48 +0200, Markus Elfring wrote:
> > > > The header file is already included on line 8. Remove the
> > > > redundant include.
> > >
> > > You would like to omit a duplicate #include directive, don't you?
>
> The change intention is probably clear.
>
>
> > > Wil
-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Brost
> Cc: Philipp Stanner
Applied to drm-misc-next
Thx
P.
> ---
> drivers/gpu/drm/scheduler/sched_entity.c | 14 +++---
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
&
On Fri, 2025-08-22 at 15:43 +0200, Pierre-Eric Pelloux-Prayer wrote:
> Currently, the scheduler score is incremented when a job is pushed to an
> entity and when an entity is attached to the scheduler.
It's indeed odd that attaching is treated the same as job
submission.
Can you expand the
On Tue, 2025-08-19 at 18:15 +0200, Markus Elfring wrote:
> > The header file is already included on line 8. Remove the
> > redundant include.
>
> You would like to omit a duplicate #include directive, don't you?
> Will a corresponding refinement become helpful for the summary phrase
> and change
On Wed, 2025-08-20 at 11:06 +0200, Pierre-Eric Pelloux-Prayer wrote:
>
>
> Le 21/07/2025 à 17:18, Pierre-Eric Pelloux-Prayer a écrit :
> >
> >
> > Le 26/06/2025 à 16:05, Tvrtko Ursulin a écrit :
> > >
> > > On 26/06/2025 14:43, Pierre-Eric Pelloux-Prayer wrote:
> > > > Hi,
> > > >
> > > > Le
On Thu, 2025-08-14 at 12:45 +0100, Tvrtko Ursulin wrote:
>
> On 14/08/2025 11:42, Tvrtko Ursulin wrote:
> >
> > On 21/07/2025 08:52, Philipp Stanner wrote:
> > > +Cc Tvrtko, who's currently reworking FIFO and RR.
> > >
> > > On Sun, 2025-07-20
associated with a scheduler must be torn down first. Then,
however, the locking should be removed from drm_sched_fini() altogether
with an appropriate comment.
Reported-by: James Flowers
Link:
https://lore.kernel.org/dri-devel/20250720235748.2798-1-bold.zone2...@fastmail.com/
Signed-off-by: Philipp
On Tue, 2025-08-12 at 16:34 +0200, Christian König wrote:
> From: Christian König
Is this the correct mail addr? :)
>
> We have the recurring problem that people try to invent a
> DMA-fence implementation which signals fences based on a userspace
> IOCTL.
>
> This is well known as a source
On Tue, 2025-08-12 at 08:58 +0200, Christian König wrote:
> On 12.08.25 08:37, Liu01, Tong (Esther) wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> > Hi Christian,
> >
> > If a job is submitted into a stopped entity, in addition to an error log,
> > it will also cause t
On Thu, 2025-08-07 at 16:15 +0200, Christian König wrote:
> On 05.08.25 12:22, Philipp Stanner wrote:
> > On Tue, 2025-08-05 at 11:05 +0200, Christian König wrote:
> > > On 24.07.25 17:07, Philipp Stanner wrote:
> > > > > +/**
> >
On Mon, 2025-08-11 at 10:18 +0200, Philipp Stanner wrote:
> Hi,
>
> title: this patch changes nothing in amdgpu.
>
> Thus, the prefix must be drm/sched: Fix […]
>
>
> Furthermore, please use scripts/get_maintainer. A few relevant folks
> are missing. +Cc Danilo, Ma
Hi,
title: this patch changes nothing in amdgpu.
Thus, the prefix must be drm/sched: Fix […]
Furthermore, please use scripts/get_maintainer. A few relevant folks
are missing. +Cc Danilo, Matthew
On Mon, 2025-08-11 at 15:20 +0800, Liu01 Tong wrote:
> During process kill, drm_sched_entity_flush
The Nova GPU driver has a sub-website on the Rust-for-Linux website
which so far was missing from the respective section in MAINTAINERS.
Add the Nova website.
Signed-off-by: Philipp Stanner
---
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index
On Tue, 2025-08-05 at 11:05 +0200, Christian König wrote:
> On 24.07.25 17:07, Philipp Stanner wrote:
> > > +/**
> > > + * DOC: Scheduler Fence Object
> > > + *
> > > + * The scheduler fence object (&struct drm_sched_fence) encapsulates the
> > >
On Fri, 2025-08-01 at 15:42 +0000, Timur Tabi wrote:
> On Fri, 2025-08-01 at 17:12 +0200, Danilo Krummrich wrote:
> > On Fri Aug 1, 2025 at 4:50 PM CEST, Timur Tabi wrote:
> > > Does this mean that the TODO has been done, or that someone completely forgot
> > > and now your patch is remove
struct nouveau_channel contains the member 'accel_done' and a forgotten
TODO which hints at that mechanism being removed in the "near future".
Since that variable is read nowhere anymore, this "near future" is now.
Remove the variable and the TODO.
Signed-off-by:
> > loosely called random. Under the assumption it will not always be the
> > > same
> > > entity which is re-joining the queue under these circumstances.
> > >
> > > Another way to look at this is that it is adding a little bit of limited
> > > random
Two comments from myself to open up room for discussion:
On Thu, 2025-07-24 at 16:01 +0200, Philipp Stanner wrote:
> From: Philipp Stanner
>
> The various objects and their memory lifetime used by the GPU scheduler
> are currently not fully documented.
>
> Add documentat
Hello,
On Tue, 2025-07-22 at 13:05 -0700, James wrote:
> On Mon, Jul 21, 2025, at 1:16 AM, Philipp Stanner wrote:
> > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote:
> > > +Cc Tvrtko, who's currently reworking FIFO and RR.
> > >
> > > On Sun,
On Tue, 2025-07-22 at 01:45 -0700, Matthew Brost wrote:
> On Tue, Jul 22, 2025 at 01:07:29AM -0700, Matthew Brost wrote:
> > On Tue, Jul 22, 2025 at 09:37:11AM +0200, Philipp Stanner wrote:
> > > On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote:
> > > > On M
On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote:
> On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote:
> > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote:
> > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote:
> > > > On
On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote:
> +Cc Tvrtko, who's currently reworking FIFO and RR.
>
> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote:
> > Fixes an issue where entities are added to the run queue in
> > drm_sched_rq_update_fifo
+Cc Tvrtko, who's currently reworking FIFO and RR.
On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote:
> Fixes an issue where entities are added to the run queue in
> drm_sched_rq_update_fifo_locked after being killed, causing a
> slab-use-after-free error.
>
> Signed-off-by: James Flowers
>
On Fri, 2025-07-18 at 10:35 +0100, Tvrtko Ursulin wrote:
>
> On 18/07/2025 10:31, Philipp Stanner wrote:
> > On Fri, 2025-07-18 at 08:13 +0100, Tvrtko Ursulin wrote:
> > >
> > > On 16/07/2025 21:44, Maíra Canal wrote:
> > > > Hi Tvrtko,
> > >
> > > > in the
> > > > > > > queue we can simply add the signaled check and have it return the
> > > > > > > presence
> > > > > > > of more jobs to be freed to the caller. That way the work item
> > > > > > > >
On Thu, 2025-07-17 at 16:44 +0800, Lin.Cao wrote:
> When application A submits jobs and application B submits a job with a
> dependency on A's fence, the normal flow wakes up the scheduler after
> processing each job. However, the optimization in
> drm_sched_entity_add_dependency_cb() uses a call
e
> job and add a new flag to verify that the code path had executed as
> expected.
>
> Signed-off-by: Tvrtko Ursulin
> Fixes: 1472e7549f84 ("drm/sched: Add new test for
> DRM_GPU_SCHED_STAT_NO_HANG")
> Cc: Maíra Canal
> Cc: Philipp Stanner
Applied to drm-misc
> > > > from the
> > > > mock scheduler job list and the drm_mock_sched_advance() call
> > > > in the
> > > > test
> > > > will fail.
> > > >
> > > > Fix it by making the "don't reset" flag persist for the
> > > &
On Wed, 2025-07-16 at 14:05 +0200, Greg Kroah-Hartman wrote:
> On Wed, Jul 16, 2025 at 01:32:42PM +0200, Philipp Stanner wrote:
> > On Wed, 2025-07-16 at 13:15 +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Jul 16, 2025 at 12:58:28PM +0200, Christian König wrote:
> > >
On Wed, 2025-07-16 at 13:15 +0200, Greg Kroah-Hartman wrote:
> On Wed, Jul 16, 2025 at 12:58:28PM +0200, Christian König wrote:
> > On 16.07.25 12:46, Philipp Stanner wrote:
> > > +Cc Greg, Sasha
> > >
> > > On Wed, 2025-07-16 at 12:40 +0200, Michel Dän
+Cc Greg, Sasha
On Wed, 2025-07-16 at 12:40 +0200, Michel Dänzer wrote:
> On 16.07.25 11:57, Philipp Stanner wrote:
> > On Wed, 2025-07-16 at 09:43 +0000, cao, lin wrote:
> > >
> > > Hi Philipp,
> > >
> > >
> > > Thank you for the review.
: sta...@vger.kernel.org # v4.6+
Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
P.
>
>
> Thanks,
> Lin
>
>
> From: Philipp Stanner
> Sent: Wednesday, July 16, 2025 16:33
> To: cao, lin ; dri-devel@lists.freedesktop.org
>
>
On Tue, 2025-07-15 at 21:50 +0800, Lin.Cao wrote:
> When application A submits jobs and application B submits a job with a
> dependency on A's fence, the normal flow wakes up the scheduler after
> processing each job. However, the optimization in
> drm_sched_entity_add_dependency_cb() uses a call
On Tue, 2025-07-15 at 14:32 +0200, Christian König wrote:
> On 15.07.25 14:20, Philipp Stanner wrote:
> > On Tue, 2025-07-15 at 12:52 +0200, Christian König wrote:
> > > On 15.07.25 12:27, Philipp Stanner wrote:
> > > > On Tue, 2025-07-15 at 09:51 +0000, cao, lin wr
On Tue, 2025-07-15 at 12:52 +0200, Christian König wrote:
> On 15.07.25 12:27, Philipp Stanner wrote:
> > On Tue, 2025-07-15 at 09:51 +0000, cao, lin wrote:
> > >
> > > [AMD Official Use Only - AMD Internal Distribution Only]
> > >
> > >
> >
"optimization" consists of the
work item not being scheduled. I think that was the piece of the puzzle
I was missing.
I / DRM tools will also include a link to this thread, so I think that
will then be sufficient.
Thx
P.
>
> Thanks,
> Lin
>
>
>
>
>
ence_scheduled()
>
>
> Thanks,
> Lin
>
>
> From: Koenig, Christian
> Sent: Monday, July 14, 2025 21:39
> To: pha...@kernel.org ; cao, lin ;
> dri-devel@lists.freedesktop.org
> Cc: Yin, ZhenGuo (Chris) ; Deng, Emily
> ; d...@kernel.org ;
> matthew.br...@in
On Mon, 2025-07-14 at 15:08 +0200, Christian König wrote:
>
>
> On 14.07.25 14:46, Philipp Stanner wrote:
> > regarding the patch subject: the prefix we use for the scheduler
> > is:
> > drm/sched:
> >
> >
> > On Mon, 2025-07-14 at 14:23 +0800,
regarding the patch subject: the prefix we use for the scheduler is:
drm/sched:
On Mon, 2025-07-14 at 14:23 +0800, Lin.Cao wrote:
> When Application A submits jobs (a1, a2, a3) and application B submits
s/Application/application
> job b1 with a dependency on a2's scheduler fence, killing appli
On Mon, 2025-07-14 at 11:23 +0200, Christian König wrote:
> On 13.07.25 21:03, Maíra Canal wrote:
> > Hi Christian,
> >
> > On 11/07/25 12:20, Christian König wrote:
> > > On 11.07.25 15:37, Philipp Stanner wrote:
> > > > On Fri, 2025-0
On Fri, 2025-07-11 at 15:22 +0200, Christian König wrote:
>
>
> On 08.07.25 15:25, Maíra Canal wrote:
> > When the DRM scheduler times out, it's possible that the GPU isn't hung;
> > instead, a job just took unusually long (longer than the timeout) but is
> > still running, and there is, thus, no
nce, -ESRCH);
> WARN_ON(job->s_fence->parent);
> job->sched->ops->free_job(job);
> --
>
>
> Thanks,
> Lin
>
>
>
>
>
> From: Koenig, Christian
> Sent: Thursday, July 10, 2025 15:52
> To: cao, lin ; dri-devel@lists.freedesktop.org
>
t way the work item does not
optional nit:
s/to free/to be freed
Reads a bit more cleanly.
> have
> to lock the list again and repeat the signaled check.
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Brost
> Cc: Phi
On Thu, 2025-07-10 at 14:54 +0200, Philipp Stanner wrote:
> Changes in v4:
> - Change dev_err() to dev_warn() in pending_list emptiness check.
>
> Changes in v3:
> - Remove forgotten copy-paste artifacts. (Tvrtko)
> - Remove forgotten done_list struct member. (Tvrtko)
>
On Thu, 2025-07-10 at 14:54 +0200, Philipp Stanner wrote:
> When the GPU scheduler was ported to using a struct for its
> initialization parameters, it was overlooked that panfrost creates a
> distinct workqueue for timeout handling.
>
> The pointer to this new workqueue is not ini
nouveau_sched_cancel_job()
the waitque is not necessary anymore.
Remove the waitque.
Signed-off-by: Philipp Stanner
Acked-by: Danilo Krummrich
---
drivers/gpu/drm/nouveau/nouveau_sched.c | 20 +++-
drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +++--
drivers/gpu/drm/nouveau/nouveau_uvmm.c
: Philipp Stanner
Acked-by: Danilo Krummrich
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 20 +++-
drivers/gpu/drm/nouveau/nouveau_fence.h | 6 ++
2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c
b/drivers/gpu/drm
There is a new callback for always tearing the scheduler down in a
leak-free, deadlock-free manner.
Port Nouveau as its first user by providing the scheduler with a
callback that ensures the fence context gets killed in drm_sched_fini().
Signed-off-by: Philipp Stanner
Acked-by: Danilo Krummrich
drm_sched_fini() can leak jobs under certain circumstances.
Warn if that happens.
Signed-off-by: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
The scheduler unit tests now provide a new callback, cancel_job(). This
callback gets used by drm_sched_fini() for all still pending jobs to
cancel them.
Implement a new unit test to test this.
Signed-off-by: Philipp Stanner
Reviewed-by: Tvrtko Ursulin
---
drivers/gpu/drm/scheduler/tests
the code where necessary.
Signed-off-by: Philipp Stanner
---
.../gpu/drm/scheduler/tests/mock_scheduler.c | 68 +++
drivers/gpu/drm/scheduler/tests/sched_tests.h | 1 -
2 files changed, 25 insertions(+), 44 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/tests
-tvrtko.ursu...@igalia.com/
Signed-off-by: Philipp Stanner
Reviewed-by: Maíra Canal
---
drivers/gpu/drm/scheduler/sched_main.c | 34 --
include/drm/gpu_scheduler.h | 18 ++
2 files changed, 39 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm
-4d55-aa47-c35cd7861...@igalia.com/
Signed-off-by: Philipp Stanner
---
drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 5657106c2f7d..15e2d505550f 10
e still in
drm_sched.pending_list.
This series solves the leaks in a backwards-compatible manner by adding
a new, optional callback. If that callback is implemented, the scheduler
uses it to cancel all jobs from pending_list and then frees them.
Philipp Stanner (8):
drm/panfrost: Fix scheduler workqueue
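
The optional-callback idea described in the cover letter above can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: every name in it (mock_job, mock_sched_ops, mock_sched_fini(),
mock_demo() and so on) is hypothetical and is not the kernel's scheduler API.
It merely models a teardown path that, when the optional cancel_job() callback
is provided, cancels every job still on pending_list and then frees it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
struct mock_job {
	struct mock_job *next;
	int cancelled;
};

struct mock_sched_ops {
	void (*free_job)(struct mock_job *job);   /* mandatory */
	void (*cancel_job)(struct mock_job *job); /* optional, may be NULL */
};

struct mock_sched {
	const struct mock_sched_ops *ops;
	struct mock_job *pending_list;
};

static int cancelled_jobs;
static int freed_jobs;

static void mock_cancel_job(struct mock_job *job)
{
	job->cancelled = 1;
	cancelled_jobs++;
}

static void mock_free_job(struct mock_job *job)
{
	freed_jobs++;
	free(job);
}

/* Queue one job at the head of the pending list; returns 0 on success. */
static int mock_sched_push(struct mock_sched *sched)
{
	struct mock_job *job = calloc(1, sizeof(*job));

	if (!job)
		return -1;
	job->next = sched->pending_list;
	sched->pending_list = job;
	return 0;
}

/* Tear down; returns the number of jobs left behind on pending_list. */
static int mock_sched_fini(struct mock_sched *sched)
{
	struct mock_job *job = sched->pending_list;
	int left = 0;

	assert(sched->ops->free_job);

	if (!sched->ops->cancel_job) {
		/* Without the optional callback the jobs stay where they are. */
		for (; job; job = job->next)
			left++;
		return left;
	}

	while (job) {
		struct mock_job *next = job->next;

		sched->ops->cancel_job(job); /* cancel first ... */
		sched->ops->free_job(job);   /* ... then free */
		job = next;
	}
	sched->pending_list = NULL;
	return 0;
}

/* Runs the scenario once; returns the number of leaked jobs, or -1 on OOM. */
int mock_demo(void)
{
	static const struct mock_sched_ops ops = {
		.free_job = mock_free_job,
		.cancel_job = mock_cancel_job,
	};
	struct mock_sched sched = { .ops = &ops, .pending_list = NULL };
	int i;

	for (i = 0; i < 3; i++)
		if (mock_sched_push(&sched))
			return -1;

	return mock_sched_fini(&sched);
}
```

Calling mock_demo() from a test program exercises the path with the callback
set. Leaving .cancel_job as NULL in the ops table would make the model's
mock_sched_fini() report the jobs as left behind instead, which is the
backwards-compatible behaviour this sketch assumes.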
> [3]
> https://lore.kernel.org/dri-devel/20250430210643.57924-1-mca...@igalia.com/T/
>
> Best Regards,
> - Maíra
>
> ---
> v1 -> v2:
>
> - Fix several grammar nits across the documentation and commit messages.
> - Drop "drm/sched: Always free the job after the
->ops->free_job()` - leading to a memory leak.
>
> To solve these problems, create a new `drm_gpu_sched_stat`, called
> DRM_GPU_SCHED_STAT_NO_HANG, which allows a driver to skip the reset.
> The new status will indicate that the job must be reinserted into
> `sched->pending_list`, an
manner by adding
a new, optional callback. If that callback is implemented, the scheduler
uses it to cancel all jobs from pending_list and then frees them.
Philipp Stanner (7):
drm/sched: Avoid memory leaks with cancel_job() callback
drm/sched/tests: Implement cancel_job() callback
drm/sched/
On Tue, 2025-07-08 at 14:02 +0100, Tvrtko Ursulin wrote:
>
>
> On 11/02/2025 11:14, Philipp Stanner wrote:
> > drm_sched_init() has a great many parameters and upcoming new
> > functionality for the scheduler might add even more. Generally, the
> > great number of par
On Tue, 2025-07-08 at 13:21 +0100, Tvrtko Ursulin wrote:
> Extract out two copies of the identical code to function epilogue to make
> it smaller and more readable.
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Bros
>
> Signed-off-by: Tvrtko Ursulin
> Cc: Christian König
> Cc: Danilo Krummrich
> Cc: Matthew Brost
> Cc: Philipp Stanner
> ---
> drivers/gpu/drm/scheduler/sched_internal.h | 2 -
> drivers/gpu/drm/scheduler/sched_main.c | 132 ++-
> --
>