Thanks for the tips. Recompiling with -d:release makes a big difference: it is then roughly 5 times faster than before. Even so, the Python version is still roughly 3 times faster.
This is not a proper benchmark, just a "quick wins" investigation. I was hoping that using a compiled language would make a giant difference.