On Tue, Jul 8, 2025 at 3:54 PM Peter Zijlstra <pet...@infradead.org> wrote:
>
> On Tue, Jul 08, 2025 at 09:54:06AM -0300, Wander Lairson Costa wrote:
> > > On Mon, Jul 07, 2025 at 01:20:03PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 04, 2025 at 02:07:43PM -0300, Wander Lairson Costa wrote:
> > > > Similar to the IRQ tracepoint, the preempt tracepoints are typically
> > > > disabled in production systems due to the significant overhead they
> > > > introduce even when not in use.
> > > >
> > > > The overhead primarily comes from two sources: First, when tracepoints
> > > > are compiled into the kernel, preempt_count_add() and preempt_count_sub()
> > > > become external function calls rather than inlined operations. Second,
> > > > these functions perform unnecessary preempt_count() checks even when the
> > > > tracepoint itself is disabled.
> > > >
> > > > This optimization introduces an early check of the tracepoint static key,
> > > > which allows us to skip both the function call overhead and the redundant
> > > > preempt_count() checks when tracing is disabled. The change maintains all
> > > > existing functionality when tracing is active while significantly reducing
> > > > overhead for the common case where tracing is inactive.
> > > >
> > >
> > > This one in particular I worry about the code gen impact. There are a
> > > *LOT* of preempt_{dis,en}able() sites in the kernel and now they all get
> > > this static branch and call crud on.
> > >
> > > We spend significant effort to make preempt_{dis,en}able() as small as
> > > possible.
> > >
> >
> > Thank you for the feedback, it's much appreciated. I just want to make sure
> > I'm on the right track. If I understand your concern correctly, it revolves
> > around the overhead this patch might introduce (specifically the binary
> > size and its effect on the iCache) when the kernel is built with preempt
> > tracepoints enabled. Is that an accurate summary?
>
> Yes, specifically:
>
> preempt_disable()
>         incl    %gs:__preempt_count
>
>
>
> preempt_enable()
>         decl    %gs:__preempt_count
>         jz      do_schedule
> 1:      ...
>
> do_schedule:
>         call    __SCT__preemptible_schedule
>         jmp     1
>
>
> your proposal adds significantly to this.
>

Here is a breakdown of the patch's behavior under the different kernel
configurations:
* When DEBUG_PREEMPT is defined, the behavior is identical to the
current implementation, with calls to preempt_count_add/sub().
* When both DEBUG_PREEMPT and TRACE_PREEMPT_TOGGLE are disabled, the
generated code is also unchanged.
* The primary change occurs when only TRACE_PREEMPT_TOGGLE is defined.
In this case, the code uses a static key test instead of a function
call (see the sketch just below this list). As the benchmarks show,
this approach is faster when the tracepoints are disabled.
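
To make that third case concrete, the fast path would have roughly the
shape below. This is only an illustrative sketch: preempt_tracepoint_key
and preempt_count_add_traced() are placeholder names, not necessarily
the identifiers used in the patch.

        /*
         * Illustrative only. static_branch_unlikely() comes from
         * <linux/jump_label.h>, __preempt_count_add() is the existing
         * arch inline from <asm/preempt.h>; preempt_tracepoint_key and
         * preempt_count_add_traced() are placeholder names.
         */
        static __always_inline void preempt_count_add(int val)
        {
                if (static_branch_unlikely(&preempt_tracepoint_key))
                        preempt_count_add_traced(val);  /* out of line: extra checks + tracepoint */
                else
                        __preempt_count_add(val);       /* plain inlined counter update */
        }

With the key disabled, the static branch compiles down to a NOP at each
inlined site, so the common case remains a single update of the preempt
counter.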
The main trade-off is that enabling or disabling these tracepoints
will require the kernel to patch more code locations due to the use of
static keys.
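
For completeness, toggling the tracepoint would amount to flipping that
one static key from the registration path, roughly along these lines
(hypothetical hook names), after which the jump-label machinery rewrites
the NOP/JMP at every inlined site:

        /* Hypothetical tracepoint register/unregister hooks. */
        static void preempt_trace_reg(void)
        {
                /* Patch every call site to take the traced slow path. */
                static_branch_enable(&preempt_tracepoint_key);
        }

        static void preempt_trace_unreg(void)
        {
                /* Restore the plain inlined fast path. */
                static_branch_disable(&preempt_tracepoint_key);
        }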

