On Fri, May 08, 2020 at 10:24:43PM +0000, m...@netbsd.org wrote:
> The indirection only applies to the first call. The magic is within
> rtld.

You are comparing PLT calls with ifuncs (even normal PLT calls have an
initial resolution overhead, but a very tiny one, while ifuncs may have
arbitrary first-time overhead).

Thor is talking about something like the LOCK prefix in x86 asm (that
is: real atomic ops).

Kamil pointed out that the difference is known to the program at
compile time or run time. The problem is that the is-not-lockless case
will rarely be tested (if at all). Kamil said we should leave that
trouble to the programmers and not make decisions for them.

Everyone is kind of right here; it is a matter of balancing "costs" and
benefits.

What I still don't get is why it would be hard (or bad) to just have a
pkgsrc version (or two, one gcc, one llvm) and what costs that would
bring (and I don't buy Kamil's "it is an integrated part of the
toolchain" argument).

Martin