On 2026-01-09 14:19, Steven Rostedt wrote:
On Fri, 9 Jan 2026 11:10:16 -0800
Alexei Starovoitov <[email protected]> wrote:
We also have to consider that migrate disable is *not* cheap at all
compared to preempt disable.
Looks like your complaint comes from lack of engagement in kernel
development.
No need to make comments like that. The Linux kernel is an ocean of
code. It's very hard to keep up with everything that is happening. I
knew of the work being done on migrate_disable, but I didn't know what
the impact of that work was. Mathieu is still very much involved and
engaged in kernel development.
Thanks, Steven. I guess Alexei missed my recent involvement in other
areas of the kernel.

As Steven pointed out, the kernel is vast, so I cannot keep up with
the progress on every single topic. That being said, I very recently
(about a month ago) tried using migrate disable for the RSS tracking
improvements (hierarchical percpu counters), and found that the
overhead of migrate disable was large compared to preempt disable.
The generated assembly is also several times larger (on x86-64).
To compare the generated code, I created small placeholder functions
which just call preempt/migrate disable and enable, in a PREEMPT_RT
build.
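The placeholders were along the lines of the following sketch (shown
for reference only; the function names are assumed to match the
symbols in the disassembly, which is the actual output I obtained):

#include <linux/preempt.h>

/* Each placeholder wraps exactly one primitive so that its cost shows
 * up as a standalone symbol in the object file. */
void test_preempt_disable(void)
{
        preempt_disable();
}

void test_preempt_enable(void)
{
        preempt_enable();
}

void test_migrate_disable(void)
{
        migrate_disable();
}

void test_migrate_enable(void)
{
        migrate_enable();
}

The resulting disassembly: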
0000000000002a20 <test_preempt_disable>:
    2a20:  f3 0f 1e fa             endbr64
    2a24:  65 ff 05 00 00 00 00    incl   %gs:0x0(%rip)    # 2a2b <test_preempt_disable+0xb>
    2a2b:  e9 00 00 00 00          jmp    2a30 <test_preempt_disable+0x10>

0000000000002a40 <test_preempt_enable>:
    2a40:  f3 0f 1e fa             endbr64
    2a44:  65 ff 0d 00 00 00 00    decl   %gs:0x0(%rip)    # 2a4b <test_preempt_enable+0xb>
    2a4b:  74 05                   je     2a52 <test_preempt_enable+0x12>
    2a4d:  e9 00 00 00 00          jmp    2a52 <test_preempt_enable+0x12>
    2a52:  e8 00 00 00 00          call   2a57 <test_preempt_enable+0x17>
    2a57:  e9 00 00 00 00          jmp    2a5c <test_preempt_enable+0x1c>

0000000000002920 <test_migrate_disable>:
    2920:  f3 0f 1e fa             endbr64
    2924:  65 48 8b 15 00 00 00    mov    %gs:0x0(%rip),%rdx    # 292c <test_migrate_disable+0xc>
    292b:  00
    292c:  0f b7 82 38 07 00 00    movzwl 0x738(%rdx),%eax
    2933:  66 85 c0                test   %ax,%ax
    2936:  74 0f                   je     2947 <test_migrate_disable+0x27>
    2938:  83 c0 01                add    $0x1,%eax
    293b:  66 89 82 38 07 00 00    mov    %ax,0x738(%rdx)
    2942:  e9 00 00 00 00          jmp    2947 <test_migrate_disable+0x27>
    2947:  65 ff 05 00 00 00 00    incl   %gs:0x0(%rip)    # 294e <test_migrate_disable+0x2e>
    294e:  65 48 8b 05 00 00 00    mov    %gs:0x0(%rip),%rax    # 2956 <test_migrate_disable+0x36>
    2955:  00
    2956:  83 80 00 00 00 00 01    addl   $0x1,0x0(%rax)
    295d:  b8 01 00 00 00          mov    $0x1,%eax
    2962:  66 89 82 38 07 00 00    mov    %ax,0x738(%rdx)
    2969:  65 ff 0d 00 00 00 00    decl   %gs:0x0(%rip)    # 2970 <test_migrate_disable+0x50>
    2970:  74 05                   je     2977 <test_migrate_disable+0x57>
    2972:  e9 00 00 00 00          jmp    2977 <test_migrate_disable+0x57>
    2977:  e8 00 00 00 00          call   297c <test_migrate_disable+0x5c>
    297c:  e9 00 00 00 00          jmp    2981 <test_migrate_disable+0x61>

00000000000029a0 <test_migrate_enable>:
    29a0:  f3 0f 1e fa             endbr64
    29a4:  65 48 8b 15 00 00 00    mov    %gs:0x0(%rip),%rdx    # 29ac <test_migrate_enable+0xc>
    29ab:  00
    29ac:  0f b7 82 38 07 00 00    movzwl 0x738(%rdx),%eax
    29b3:  66 85 c0                test   %ax,%ax
    29b6:  74 0f                   je     29c7 <test_migrate_enable+0x27>
    29b8:  83 c0 01                add    $0x1,%eax
    29bb:  66 89 82 38 07 00 00    mov    %ax,0x738(%rdx)
    29c2:  e9 00 00 00 00          jmp    29c7 <test_migrate_enable+0x27>
    29c7:  65 ff 05 00 00 00 00    incl   %gs:0x0(%rip)    # 29ce <test_migrate_enable+0x2e>
    29ce:  65 48 8b 05 00 00 00    mov    %gs:0x0(%rip),%rax    # 29d6 <test_migrate_enable+0x36>
    29d5:  00
    29d6:  83 80 00 00 00 00 01    addl   $0x1,0x0(%rax)
    29dd:  b8 01 00 00 00          mov    $0x1,%eax
    29e2:  66 89 82 38 07 00 00    mov    %ax,0x738(%rdx)
    29e9:  65 ff 0d 00 00 00 00    decl   %gs:0x0(%rip)    # 29f0 <test_migrate_enable+0x50>
    29f0:  74 05                   je     29f7 <test_migrate_enable+0x57>
    29f2:  e9 00 00 00 00          jmp    29f7 <test_migrate_enable+0x57>
    29f7:  e8 00 00 00 00          call   29fc <test_migrate_enable+0x5c>
    29fc:  e9 00 00 00 00          jmp    2a01 <test_migrate_enable+0x61>
migrate_disable _was_ not cheap.
Try to benchmark it now.
It's inlined. It's a fraction of extra overhead on top of preempt_disable.
It would be good to have a benchmark of the two. What about fast_srcu? Is
that fast enough to replace the preempt_disable()? If so, then could we
just make this the same for both RT and !RT?
I've modified kernel/rcu/refscale.c to compare those:
AMD EPYC 9654 96-Core Processor, kernel baseline: v6.18.1

CONFIG_PREEMPT=y
# CONFIG_PREEMPT_LAZY is not set
# CONFIG_PREEMPT_RT is not set

* preempt disable/enable pair: 1.1 ns
* srcu-fast lock/unlock: 1.5 ns

CONFIG_RCU_REF_SCALE_TEST=y
* migrate disable/enable pair: 3.0 ns
* calls to migrate disable/enable pair within noinline functions: 17.0 ns

CONFIG_RCU_REF_SCALE_TEST=m
* migrate disable/enable pair: 22.0 ns
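For reference, the refscale modification was roughly of the following
shape (a sketch modeled on the existing reader ops in
kernel/rcu/refscale.c; the names below are illustrative, not the exact
patch that produced the numbers above):

/* Reader op measuring a migrate disable/enable pair, modeled on the
 * existing ops in kernel/rcu/refscale.c (e.g. the rcu_ops entry). */
static void ref_migrate_section(const int nloops)
{
        int i;

        for (i = nloops; i >= 0; i--) {
                migrate_disable();
                migrate_enable();
        }
}

static void ref_migrate_delay_section(const int nloops, const int udl,
                                      const int ndl)
{
        int i;

        for (i = nloops; i >= 0; i--) {
                migrate_disable();
                un_delay(udl, ndl);     /* delay helper used by the other ops */
                migrate_enable();
        }
}

static const struct ref_scale_ops migrate_ops = {
        .readsection    = ref_migrate_section,
        .delaysection   = ref_migrate_delay_section,
        .name           = "migrate"
};

plus an entry for &migrate_ops in refscale's array of available ops so
it can be selected through the module's scale_type parameter.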
When I made that earlier attempt at using migrate disable, I had
configured refscale as a module, which is what gave me the appalling
22 ns overhead. It looks like the implementation of migrate
disable/enable now differs depending on whether it is called from the
core kernel or from a module. That's rather unexpected.

This appears to be intentional, though (see
INSTANTIATE_EXPORTED_MIGRATE_DISABLE): it works around the fact that
the runqueues per-CPU variable cannot be exported to modules.
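I have not dug into the details, but it looks like the usual pattern
of keeping an inline fast path for code that can see an unexported
symbol while giving modules an exported out-of-line wrapper. Purely as
a generic illustration of that pattern (hypothetical names, not the
actual scheduler code):

/* foo.h -- hypothetical example of the inline-vs-exported split. */
#include <linux/percpu.h>

DECLARE_PER_CPU(unsigned long, foo_count);      /* not exported to modules */

void foo_inc_exported(void);                    /* exported out-of-line wrapper */

static __always_inline void foo_inc(void)
{
#ifdef MODULE
        foo_inc_exported();             /* modules cannot touch foo_count */
#else
        this_cpu_inc(foo_count);        /* core kernel: inline fast path */
#endif
}

/* foo.c -- core kernel */
DEFINE_PER_CPU(unsigned long, foo_count);

void foo_inc_exported(void)
{
        this_cpu_inc(foo_count);
}
EXPORT_SYMBOL_GPL(foo_inc_exported);

The out-of-line call is what a module pays on top of the inline
version.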
That's the kind of compilation-context-dependent overhead variability
I'd rather avoid in the implementation of the tracepoint
instrumentation API.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com