On Monday, 19 July 2021 at 17:20:21 UTC, kinke wrote:
Compiling with `-O -mtriple=i686-linux-gnu -mcpu=i686` (=> no SSE2 by default) shows that the inlined version inside `wrapper()` is the mega slow one, so the extra instructions aren't applied transitively unfortunately.

Erm, sorry, I should have looked more closely: it's not inlined, and the call itself seems extremely expensive too, with state being pushed and popped around it, apparently to account for the different targets. Brrr, to be avoided at all costs for such tiny functions. :)
