I made a couple of tweaks and added LTO and PGO to the compilation, which roughly halved the runtime on my machine: <https://github.com/jinyus/related_post_gen/pull/91>
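
For anyone who wants to try LTO plus PGO locally, a two-stage build could look roughly like the sketch below. It is not necessarily what the PR does: `related.nim` is a placeholder file name, and the `-flto` / `-fprofile-*` flags assume GCC is the C backend compiler (Clang uses a different PGO workflow).

```sh
# Stage 1: instrumented build with LTO (GCC flags passed through Nim).
# related.nim is a hypothetical file name used only for illustration.
nim c -f -d:release \
  --passC:"-flto -fprofile-generate" \
  --passL:"-flto -fprofile-generate" \
  related.nim

# Run the instrumented binary on representative input so GCC writes
# its profile data (.gcda files) next to the cached object files.
./related

# Stage 2: rebuild from scratch, this time using the collected profile.
nim c -f -d:release \
  --passC:"-flto -fprofile-use" \
  --passL:"-flto -fprofile-use" \
  related.nim
```

Both stages need to use the same nimcache directory, otherwise the second build will not find the profile data; keeping the same `-d:` defines in both commands should take care of that.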
I also tried `-d:danger --boundChecks:on`, which ran faster than `-d:release`, but I didn't add it because I wasn't sure whether it also disabled something else. Switching to `int32` sped it up slightly as well, but that of course hurts readability. I haven't profiled this, but I believe the table operations are likely the slowest part. @Araq, you mentioned faster table implementations; did you have anything particular in mind?
