On Friday, 28 October 2022 at 09:48:14 UTC, ab wrote:
Thanks to H.S. Teoh and Dennis for the suggestions; they both work. I like the empty asm block a bit more because it is less invasive, but it only works with LDC.
I used the volatileLoad/volatileStore functions so that the compiler cannot optimize the measured code away (for example, by hoisting repeated calculations out of the loop or evaluating them at compile time), and I used the RDTSC/RDTSCP instructions via inline assembly for the measurements: https://gist.github.com/ssvb/5c926ed9bc755900fdaac3b71a0f7cfd
The goal was to have a very fast way to check (with no measurable overhead) whether reasonable optimization options had been supplied to the compiler.
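
For readers who don't want to open the gist, here is a minimal sketch of the general idea. It is not the gist's code. The names `bumpCounter` and `iterations` are made up for illustration, and `MonoTime` from `core.time` stands in for the RDTSC/RDTSCP inline assembly so that the example is not tied to x86 or to one compiler's asm syntax. It assumes a druntime recent enough to provide `core.volatile`.

```d
import core.time : MonoTime;
import core.volatile : volatileLoad, volatileStore;
import std.stdio : writeln;

// Hypothetical helper: increments a counter through volatile accesses,
// so the compiler has to perform a load and a store on every iteration
// instead of folding the whole loop into a constant.
uint bumpCounter(uint iterations)
{
    uint counter = 0;
    foreach (i; 0 .. iterations)
        volatileStore(&counter, volatileLoad(&counter) + 1);
    return volatileLoad(&counter);
}

void main()
{
    // MonoTime is used here in place of RDTSC/RDTSCP (an assumption made
    // for portability; it is coarser than a cycle counter).
    immutable start = MonoTime.currTime;
    immutable result = bumpCounter(1_000_000);
    immutable elapsed = MonoTime.currTime - start;

    writeln("result: ", result, ", elapsed: ", elapsed);
}
```

Building this with and without optimization flags and comparing the elapsed time should give a rough feel for the effect, although a real check would need a much shorter loop and a finer timer to keep the overhead negligible.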