On Wed, Jun 03, 2026 at 02:17:27PM +0000, Dmitry Ilvokhin wrote:
> > Something a little like so, which is completely untested, except to
> > build kernel/locking/spinlock.o (with clang-23).
>
> Thanks a lot for taking a look, Peter.
>
> I like the static_call idea. It's truly zero cost on x86 (and, as you
> note, even a byte smaller). The one caveat is that it relies on
> HAVE_STATIC_CALL_INLINE to stay free.
>
> So my plan would be: static_call where HAVE_STATIC_CALL_INLINE is
> available (x86), and a static branch fallback elsewhere, gated behind a
> default-off config so it imposes nothing on arches/kernels that don't
> opt in. I'm mostly interested in x86, but would like arm64 to work too,
> which would use the fallback.

(i386 doesn't have STATIC_CALL_INLINE, but nobody cares about the
performance on that target, so anything goes really ;-)

> Concretely:
>
> 1. Split the sleepable-lock patches out and send them separately.
>    They're independent of the static call work and look far less
>    controversial.
>
> 2. Convert the paravirt spinlock unlock to a static_call, as the
>    foundation for the unlock tracepoint. I'm happy to take a stab at it.
>    Let me know if you'd rather do it yourself.

Yeah, I think that patch as-is *should* work, but like I said, I haven't
even tried it, so it could be terribly broken :-)

> 3. Build the unlock tracepoint on top: static_call where it's cheap,
>    config-gated static_branch fallback where it isn't.

Right, so I think we need some sort of custom callback for tracepoint
enable/disable. It's been a minute since I dug through the tracepoint
code, and I don't think it provides that with a convenient wrapper, but
it should be doable.

One thing to note is that when you set the tracepoint unlock function,
it should either tail-call into the original function, or you have to
create two unlock_trace functions, one for native and one for paravirt,
and pick the right one.

> Does this plan sound reasonable to you?

Yeah, should work.

> > Also, I think someone should go do some performance runs with
> > ARCH_INLINE_SPIN_* set for x86 just like for s390.
>
> That's a good point, I'll run benchmarks and report back with the
> results.

Thanks!
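
To illustrate the tail-call option for the tracepoint unlock function,
here is a rough userspace model. It is only a sketch: a plain function
pointer stands in for the static_call slot, and every name in it
(model_lock, unlock_slot, unlock_trace, trace_enable, ...) is made up for
illustration and is not taken from the patch or from the kernel.

```c
#include <assert.h>

/* Userspace model only: a function pointer stands in for the kernel's
 * static_call slot. All names here are hypothetical. */
struct model_lock { int locked; };

typedef void (*unlock_fn)(struct model_lock *);

/* Counters so the behaviour can be observed from the outside. */
static int native_calls, pv_calls, trace_hits;

static void native_unlock(struct model_lock *l) { l->locked = 0; native_calls++; }
static void pv_unlock(struct model_lock *l)     { l->locked = 0; pv_calls++; }

/* The "static call" slot; boot code would pick native or paravirt. */
static unlock_fn unlock_slot = native_unlock;

/* Original target, saved while tracing is enabled. */
static unlock_fn unlock_orig;

/* Single trace wrapper: record the event, then chain into whatever
 * was installed before, with the call in tail position. */
static void unlock_trace(struct model_lock *l)
{
	trace_hits++;		/* stand-in for the tracepoint */
	unlock_orig(l);
}

/* Stand-ins for the tracepoint enable/disable callbacks. */
static void trace_enable(void)
{
	assert(unlock_slot != unlock_trace);
	unlock_orig = unlock_slot;
	unlock_slot = unlock_trace;
}

static void trace_disable(void)
{
	assert(unlock_slot == unlock_trace);
	unlock_slot = unlock_orig;
}

/* What the unlock path would call. */
static void model_unlock(struct model_lock *l)
{
	unlock_slot(l);
}
```

With this shape one wrapper serves both the native and paravirt cases,
because it chains into whichever function was installed when tracing was
enabled. The alternative is two wrappers that each call their own unlock
directly, with the enable callback picking the matching one.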
