Update: I tried PGO'ing the Status fork of the Nim compiler, which I think is based on 1.6.10 plus a few custom commits, so it uses refc. Judging by the results, a compiler built with refc (which is the case for Nim 1.6.x) seems to benefit much less from PGO: the Nim compilation step for the compiler itself takes 6 seconds with the default compiler, and 5 seconds with the Clang PGO'd compiler built with danger.
Compare that with devel, where the normal compiler built with gcc and release takes the same 6 seconds, but the PGO'd compiler takes **3.5 seconds**! So the conclusion is simple: ARC/ORC are just much, much better suited to optimizations like PGO. I guess I should've known that :) As for nimbus: the PGO'd version of the Status compiler fork actually takes longer than the default one ¯_(ツ)_/¯.
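
To put the two results side by side, here is a quick sketch that turns the timings quoted above into speedup factors. The helper names (`speedup`, `time_saved_pct`) and the dictionary labels are just illustrative, and the numbers are the rough wall-clock seconds from this post. They are not a controlled benchmark, since the baselines and build flags differ between the two setups.

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """How many times faster the optimized build is than the baseline."""
    return baseline_s / optimized_s


def time_saved_pct(baseline_s: float, optimized_s: float) -> float:
    """Percentage of the baseline wall-clock time that was saved."""
    return 100.0 * (baseline_s - optimized_s) / baseline_s


# (baseline seconds, PGO'd seconds), taken from the post above.
timings = {
    "refc (Status fork, ~1.6.10)": (6.0, 5.0),
    "devel (ARC/ORC)": (6.0, 3.5),
}

for name, (base, pgo) in timings.items():
    print(f"{name}: {speedup(base, pgo):.2f}x faster, "
          f"{time_saved_pct(base, pgo):.0f}% less time")
```

That works out to roughly 1.2x for the refc-built compiler versus roughly 1.7x for devel.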
