panthor: Prepare the scheduler logic for FW events in IRQ context

Boris Brezillon Tue, 23 Jun 2026 05:52:15 -0700

On Mon, 22 Jun 2026 14:49:49 +0200
Boris Brezillon <[email protected]> wrote:


> On Wed, 20 May 2026 15:15:54 -0700
> Chia-I Wu <[email protected]> wrote:
> 
> > > > > I collected
> > > > > some numbers for the baseline, for this series, and for patch 9
> > > > > reverted at
> > > > > https://gitlab.freedesktop.org/panfrost/linux/-/work_items/85#note_3481308.
> > > > > Reposting the numbers here for reference:
> > > > >
> > > > > |                    | baseline | entire series | patch 9 reverted |
> > > > > | -                  | -        | -             | -                |
> > > > > | frag job median    | 2.8ms    | 2.2ms         | 2.2ms            |
> > > > > | frag job 95%       | 4.5ms    | 2.8ms         | 2.8ms            |
> > > > > | frag job 99%       | 4.9ms    | 2.8ms         | 2.8ms            |
> > > > > | panthor-job median | 0.8us    | 6.2us         | 0.9us            |
> > > > > | panthor-job 95%    | 1.5us    | 16.6us        | 1.5us            |
> > > > > | panthor-job 99%    | 1.6us    | 28.0us        | 1.8us            |  
> > > > >   
> > > >
> > > > panthor-job rows are the durations of the raw irq handlers, collected
> > > > from irq/irq_handler_{entry,exit}.
> > > >
> > > > frag job rows are the durations from frag jobs, collected from
> > > > gpu_scheduler/drm_sched_job_{run,done}.
> > > >
> > > > The fence signaling paths in each case are:
> > > >
> > > >  - baseline: raw handler -> rt threaded handler -> wq job -> wq job
> > > >    -> fence signal
> > > >  - entire series: raw handler -> fence signal
> > > >  - patch 9 reverted: raw handler -> rt threaded handler -> fence signal
> > > >
> > >
> > > Just did another set of throughput tests, and I confirm the gains are
> > > noticeable only with patch 9 applied (that's on rk3588, which embeds a
> > > G610, so not the exact same setup). As an example, on
> > > gfxbench/gl_manhattan, I get the following score bump 2391 -> 2457.
> > >
> > > Now I need to set things up to measure latency like you did and make
> > > sure I'm observing the same thing: threaded handlers providing roughly
> > > the same latency as hardirq handlers. If not it probably has to do with
> > > some config options that differ and change the preemptability of the
> > > system.
> > >
> > > I'll hold off on the submission of v3 until this is done, because if
> > > threaded handlers are roughly as efficient as hardirq ones, we probably
> > > want to stick to threaded handlers.   
> 
> Sorry for the delay, I only got back to this on Friday.
> 
> So, I've been using ftrace/function-graph with some noinline added to
> get a sense of where most of the time was spent in the hardirq handler
> after the transition to hardirqs. Contrary to what I expected, it's not
> coming from the accesses to uncached mappings of the FW
> interface/syncobjs, but from the various queue[_delayed]_work() calls
> and/or the wake_up_all() on panthor_fw::req_waitqueue. I don't expect us to
> be able to optimize that anytime soon, so I guess we should just keep
> everything in the threaded handler for now and accept the extra delay
> (assuming 20+ usec for the hardirq handler is too long). This also
> means that a lot of the things I do in this series are moot
> (irqsave/restore, using spinlocks instead of mutexes, ...), but before
> I go and rework that, I'd like to get some feedback from Steve and
> Liviu to make sure this is okay with Arm.

I ended up sending a v3 doing that. I can easily go back to the
previous version if needed.
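
For reference, the relative changes implied by the numbers quoted above can be
checked with a few lines of Python. This is only a sketch: the helper name
pct_change and the sample labels are made up for illustration, and the values
are copied from the quoted table (baseline vs. entire series) and from the
gfxbench/gl_manhattan scores.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the quoted table and the gfxbench result.
samples = {
    "frag job median (ms)": (2.8, 2.2),
    "frag job 99% (ms)": (4.9, 2.8),
    "panthor-job median (us)": (0.8, 6.2),
    "panthor-job 99% (us)": (1.6, 28.0),
    "gl_manhattan score": (2391, 2457),
}

for name, (before, after) in samples.items():
    print(f"{name}: {before} -> {after} ({pct_change(before, after):+.1f}%)")
```

With these inputs, the gl_manhattan bump comes out at a little under 3%, while
the median hardirq handler duration grows by several times, which is the
trade-off discussed in the thread.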

Re: [PATCH v2 06/11] drm/panthor: Prepare the scheduler logic for FW events in IRQ context
