One more idea: which optimization level is gcc getting? Nim passes -O3 to gcc by default, while C programs mostly use only -O2. -O3 can grow the executable size significantly. Generally -O3 should not be slower than -O2, but in rare cases it may be.
You could also try -flto for link-time optimization, or use clang instead of gcc. Can you test with --gc:none?
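
For reference, the variants above would look roughly like this on the command line. This is only a sketch: bench.nim is a placeholder for your program, and I am assuming that flags given with --passC end up after Nim's default ones, so that gcc takes the last -O option it sees. The --listCmd run prints the actual compiler command, so you can check that.

```shell
# bench.nim is a hypothetical file name, substitute your own program.

# Show the exact C compiler command line Nim generates
nim c -d:release --listCmd bench.nim

# Ask gcc for -O2 (assumes the passC flag lands after the default -O3)
nim c -d:release --passC:-O2 bench.nim

# Link-time optimization: the flag is needed when compiling and when linking
nim c -d:release --passC:-flto --passL:-flto bench.nim

# Use clang as the C backend compiler instead of gcc
nim c -d:release --cc:clang bench.nim

# Disable the garbage collector
nim c -d:release --gc:none bench.nim
```

With --gc:none memory is never freed, so that run only makes sense for a short benchmark. Newer Nim versions spell the option --mm:none.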