Although it is even further from the default, and probably takes a bit of work to set up if you are coming from an IDE, I sometimes get ~2x run-time improvements from [gcc profile-guided optimization](https://forum.nim-lang.org/t/6295).
Even this is not a panacea: I have (rarely) seen a PGO build actually run slower than an ordinary `-d:danger --passC:-flto` build, even when running against the very data it was trained on! But that happens less than 5% of the time in my experience. A 1.5-1.75x improvement is typical.
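
For anyone who has not tried it, here is a rough sketch of what a two-pass PGO build can look like from the command line. It is not necessarily the exact recipe from the linked thread. The `pgo_step` helper, the `myprog.nim` file name, and the `training_input.txt` workload are all made up for illustration, and I am assuming gcc as the backend with its usual `-fprofile-generate` / `-fprofile-use` pair passed through `--passC`/`--passL`.

```sh
# Sketch only: `pgo_step`, `myprog.nim` and `training_input.txt` are
# hypothetical names. Assumes gcc is the C backend.
NIM=${NIM:-nim}   # set NIM=echo to print the arguments instead of compiling

# $1 = "generate" (instrumented build) or "use" (optimized build)
# $2 = Nim source file
pgo_step() {
  # -f forces a full rebuild so every C file is recompiled with the new flags
  "$NIM" c -f --cc:gcc -d:danger --passC:-flto --passL:-flto \
    --passC:"-fprofile-$1" --passL:"-fprofile-$1" "$2"
}

# Typical sequence (commented out so that sourcing this file runs nothing):
#   pgo_step generate myprog.nim      # 1. build the instrumented binary
#   ./myprog < training_input.txt     # 2. run it on representative data
#   pgo_step use myprog.nim           # 3. rebuild using the collected profile
```

Both passes need to use the same build settings so that gcc can find the profile data it wrote next to the object files during the training run. How representative the training data is matters a lot, which is probably part of why the gains vary so much.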