On 13-05-31 11:55 AM, Daniel Farina wrote:
> I've been poking around in rustc, after noting that compile times are
> much longer than I'd expect.
>
> So while some micro optimizations may help, I'm wondering if anyone
> has given a lot of thought as to how to make the llvm passes faster or
> parallel, because this is something like more than half the battle.
>
> What's worrisome is Rust's typical compilation model encourages large
> crates that are one compilation unit, unlike, say, a pile of
> individual C files each emitting an object file that admittedly must
> be linked... but at least the instruction selector can be run in
> parallel. Does this present a sticky problem?

Long term, we don't think so. Short term, it's annoying to all of us;
we're struggling with splitting time between completing features, fixing
bugs, working on residual bits of lingering design problems, and (sadly,
often last) optimizing the codegen. We should spend more time on it.

Most of the problem comes from us generating too much LLVM code (and
code that is hard for LLVM to work with). Much of _that_ comes from
technical debt in the lowering pass, src/librustc/middle/trans. By
necessity, trans is one of the oldest pieces of the compiler; and by
unfortunate historical reality, we've changed enough assumptions about
runtime representations enough times through the life of this compiler
that trans is now generating a lot of unnecessary code.

There is some instrumentation there to measure code generation by
source. A few useful incantations are:

    # count LLVM instructions by trans function-call stack
    $ rustc -Z count-llvm-insns foo.rs | xdu

and:

    # list symbols in the resulting binary sorted by size
    $ objdump -t libfoo.so |
        awk '/\.text/ { printf("%d\t%s\n", strtonum("0x" $5), $6) }' |
        c++filt |
        sort -n

and:

    # visualize the instruction graph of a given compilation unit

    # first generate foo.bc, the LLVM bitcode for foo.rs
    $ rustc --save-temps foo.rs

    # then generate a dot file for every function's CFG
    $ opt -analyze -dot-cfg foo.bc

    # then do graph layout on, say, the "::main" dot file
    # and convert to SVG; pick other dot files for other functions
    $ dot -Tsvg -ograph.svg <(c++filt <cfg._ZN4main17_*.dot)

    # then take a look at it
    $ firefox graph.svg

We could probably do with more instrumentation in trans to help
sort things out (e.g. summing by basic block names, items, and
categories of item).
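As a rough sketch of the "summing by categories" idea, applied after the fact to the symbol listing rather than inside trans: the function below sums symbol sizes by the leading path component of each demangled name. The name sum_by_prefix and the choice of "text before the first ::" as the category are assumptions made for illustration; nothing like this ships with rustc.

```shell
# Hypothetical helper: sum symbol sizes by leading path component.
# Input on stdin: lines of "<decimal size><TAB><demangled name>",
# i.e. what the objdump | awk | c++filt stages emit before sorting.
sum_by_prefix() {
    awk -F '\t' '
        {
            name = $2
            i = index(name, "::")
            # Names without "::" form their own category.
            key = (i > 0) ? substr(name, 1, i - 1) : name
            total[key] += $1
        }
        END {
            for (k in total) printf("%d\t%s\n", total[k], k)
        }
    ' | sort -n
}
```

It would take the place of the final sort -n in the objdump pipeline, so the listing shows one line per category, smallest first, instead of one line per symbol.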
I may wind up having to spend time on this before the GC lands simply
because the GC increases the resident memory set a bit, and with the
(very high) spike in memory use during the LLVM phase, 32-bit rustc
can't presently bootstrap: it uses more than 4 GB of memory.

I'd be happy to help anyone else learn how to profile and improve the
compilation performance. It's a lot of little bits that each need
improvement.

-Graydon