Brian McNamara <[EMAIL PROTECTED]> writes:

> Template libraries, especially those employing expression templates,
> take a long time to compile.  As an example, one of the example files
> for FC++ (parser.cpp) takes about 10 minutes to compile on a blazingly
> fast machine with tons of RAM.

Which compiler?  Have you seen
<http://users.rcn.com/abrahams/instantiation_speed/index.html>?

> I would like to reduce the compile-time.  I solicit any help/advice on
> the topic.

Switching compilers may be your best bet.

> I am hoping some of the Boost contributors will have run into this
> same problem with their own libraries, and have found some ways to
> address it.

Generally speaking, the key is to reduce the number of template
instantiations, but it's also a good idea to eliminate unused
computations.  Avoid "traits blob" templates, since every nested
definition has to be evaluated even if you want to use just one, and
use mpl::apply_if and the logical operators and_/or_ to avoid needless
evaluations.
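
To make that concrete, here is a minimal sketch in plain standard C++
(C++11) rather than MPL itself.  All the names in it (blob_traits,
pointer_of, value_type_of, lazy_if, element_of) are made up for
illustration; lazy_if is only a hand-written stand-in for the kind of
lazy selection mpl::apply_if provides, not its actual implementation.

```cpp
#include <type_traits>
#include <vector>

// A "traits blob": using any one member instantiates the whole class,
// so every nested typedef is evaluated whether it is needed or not.
template <class T>
struct blob_traits {
    typedef T* pointer;
    typedef typename T::value_type value_type;  // ill-formed when T is int
};
// blob_traits<int>::pointer would fail to compile, because value_type
// is evaluated along with it.

// Split alternative: one small metafunction per question asked.
template <class T> struct pointer_of    { typedef T* type; };
template <class T> struct value_type_of { typedef typename T::value_type type; };
template <class T> struct identity      { typedef T type; };

// Lazy selection: only the chosen branch has its nested ::type evaluated.
template <bool C, class Then, class Else>
struct lazy_if { typedef typename Then::type type; };

template <class Then, class Else>
struct lazy_if<false, Then, Else> { typedef typename Else::type type; };

// Element type of a container, or T itself for non-class types.
template <class T>
struct element_of {
    typedef typename lazy_if<
        std::is_class<T>::value,
        value_type_of<T>,   // named here, but instantiated only if selected
        identity<T>
    >::type type;
};

static_assert(std::is_same<pointer_of<int>::type, int*>::value,
              "works without touching value_type");
static_assert(std::is_same<element_of<int>::type, int>::value,
              "value_type_of<int> is never instantiated");
static_assert(std::is_same<element_of<std::vector<char> >::type, char>::value,
              "the selected branch is evaluated");
```

The point of the sketch is only that value_type_of<int> appears as a
template argument without ever being instantiated; how much compile
time that saves will depend on the compiler.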

> Here is what I have already figured out.
>
> First off, in FC++, there are a number of templates whose sole
> purpose is to provide better compiler diagnostics (along the same
> general lines as concept_checks).  I rewrote the library code so
> that these checks are only enabled when a certain preprocessor flag
> is defined.  Turning off these checks reduced the compile-time of
> parser.cpp from 10 minutes to 8 minutes--a significant speedup.
>
> That was the most obvious piece of "low-hanging fruit"; since the
> code to produce the compile-time diagnostics doesn't do anything at
> run-time, it was straightforward to just have a switch to turn it on
> and off.
>
> I imagine there are other things I can do to rewrite some of the
> library templates that are doing "real work" so that they compile
> faster.  Specifically, I imagine that some templates can be
> rewritten so that they cause fewer auxiliary templates to be
> instantiated each time the main template gets instantiated.

Good plan.  On all but the most recent EDG compilers, the "nestedness"
of the generated symbol names may have a significant impact on compile
times.
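
For readers following along, the switchable diagnostics described in
the quoted text above might look roughly like the sketch below.  This
is not FC++'s actual code: the macro names FCPP_ENABLE_CHECKS and
FCPP_CHECK_CALLABLE, the is_callable_with trait, and call_with_int are
all hypothetical, and it uses C++11 static_assert where the original
library would have had to use other means.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical diagnostic trait: can an F be called with an Arg?
template <class F, class Arg>
struct is_callable_with {
private:
    template <class G>
    static auto test(int)
        -> decltype(std::declval<G&>()(std::declval<Arg>()), std::true_type());
    template <class>
    static std::false_type test(...);
public:
    static const bool value = decltype(test<F>(0))::value;
};

// With the flag off, the check expands to a trivially true assertion,
// so the diagnostic trait is not instantiated at the call site.
#ifdef FCPP_ENABLE_CHECKS
#  define FCPP_CHECK_CALLABLE(F, Arg) \
       static_assert(is_callable_with<F, Arg>::value, \
                     "F must be callable with an argument of type Arg")
#else
#  define FCPP_CHECK_CALLABLE(F, Arg) static_assert(true, "")
#endif

// Library-style function that does the "real work".
template <class F>
int call_with_int(F f, int x) {
    FCPP_CHECK_CALLABLE(F, int);  // diagnostics only, no run-time effect
    return f(x);
}

struct add_one {
    int operator()(int x) const { return x + 1; }
};
```

Building with -DFCPP_ENABLE_CHECKS (again, a made-up flag name) turns
the friendlier error message back on; without it, a bad call still
fails to compile, just with the compiler's own diagnostics.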

> However there are two issues that make this hard to do:
>
>   (1) Knowing which templates to focus on.  That is, which templates
>       are effectively the "inner loops" in the compilation process, and
>       thus deserve the most attention when it comes to optimizing
>       them?

Heh.  Welcome to the black box of C++ metaprogramming.

>   (2) Knowing how to rewrite templates to make them faster.  I imagine
>       that "fewer templates instantiated" will mean "faster compile
>       times", but I don't actually know this for sure.  I have no
>       window into what the compiler is actually doing, to know what
>       takes so long.  Maybe it's the template instantiation process;
>       maybe it's all the inlining; maybe it's the code generation for
>       lots of tiny functions.  I don't know.

Yep, it's a nasty problem.  I suggest some experimentation.

> I have made some headway with (1): the unix utility "nm" lists all the
> symbols compiled into an executable program, and by parsing the output,
> I am able to determine which templates have been instantiated with the
> largest number of different types.  My little script yields output like:
>     ...
>     313  boost::fcpp::lambda_impl::exp::Value
>     314  boost::fcpp::lambda_impl::BracketCallable
>     606  boost::fcpp::lambda_impl::exp::CONS
>     609  boost::fcpp::full1
>     610  boost::intrusive_ptr
>     670  boost::fcpp::lambda_impl::exp::Call
> which tells me that the "Call" template class has been instantiated 670
> different ways in parser.cpp.  This at least gives me some idea of
> which classes to focus my optimizing attention on.  However a drawback
> of using the "nm" approach is that it only shows templates with
> run-time storage.  There are tons of template classes which contain
> nothing but typedefs, and I imagine they're being instantiated in lots
> of ways too; I don't know whether that slows things down significantly.

It does; see the link at the top of my reply.
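
The script itself isn't shown in the quoted message, so the following
is only a guess at the general idea, written in C++ for consistency:
it reads demangled symbol lines (for example from "nm -C") and counts
the distinct specializations seen for each template name.  The
function names are made up, and the parsing is a rough heuristic that
does not treat operators such as operator< or operator-> specially.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Index just past the '>' that closes the '<' at position `open`,
// or npos if the brackets never balance.
inline std::size_t match_angle(const std::string& s, std::size_t open) {
    assert(open < s.size() && s[open] == '<');
    int depth = 0;
    for (std::size_t i = open; i < s.size(); ++i) {
        if (s[i] == '<') ++depth;
        else if (s[i] == '>' && --depth == 0) return i + 1;
    }
    return std::string::npos;
}

inline bool is_name_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == ':';
}

// Maps each qualified template name to the number of distinct
// specializations (name plus argument list) found in the input.
inline std::map<std::string, std::size_t>
count_instantiations(std::istream& in) {
    std::map<std::string, std::set<std::string> > seen;
    std::string line;
    while (std::getline(in, line)) {
        for (std::size_t lt = line.find('<'); lt != std::string::npos;
             lt = line.find('<', lt + 1)) {
            // Walk backwards over the qualified name preceding '<'.
            std::size_t begin = lt;
            while (begin > 0 && is_name_char(line[begin - 1])) --begin;
            while (begin < lt && line[begin] == ':') ++begin;  // drop leading "::"
            const std::size_t end = match_angle(line, lt);
            if (begin == lt || end == std::string::npos) continue;
            seen[line.substr(begin, lt - begin)]
                .insert(line.substr(begin, end - begin));
        }
    }
    std::map<std::string, std::size_t> counts;
    for (const auto& entry : seen) counts[entry.first] = entry.second.size();
    return counts;
}

// Convenience overload for text that is already in memory.
inline std::map<std::string, std::size_t>
count_instantiations(const std::string& text) {
    std::istringstream in(text);
    return count_instantiations(in);
}
```

A small main() that passes std::cin to count_instantiations and prints
the map sorted by count would give output in the same shape as the
listing above.  Like the nm approach in general, it only sees
templates that leave symbols in the binary.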

Also, if you're using something called CONS you're probably also using
"hand-rolled" metaprograms.  MPL contains some interesting techniques
designed to reduce the stress on compilers (e.g. compile-time
recursion unrolling, lazy evaluation); you might try using the
high-level interface of MPL to see if it improves things.
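
As a rough illustration of what recursion unrolling means here (this
is a hand-written sketch, not how MPL implements it), compare a naive
length computation over a cons-style type list with one that consumes
four elements per step.  The names nil, cons, length and
length_unrolled are invented for the example.

```cpp
#include <cstddef>

// A hand-rolled cons-style type list.
struct nil {};
template <class Head, class Tail> struct cons {};

// Naive version: one instantiation of length<> per list element.
template <class List> struct length;

template <> struct length<nil> {
    static const std::size_t value = 0;
};

template <class H, class T> struct length<cons<H, T> > {
    static const std::size_t value = 1 + length<T>::value;
};

// Unrolled version: an extra, more specialized pattern eats four
// elements per step, so the recursion is roughly a quarter as deep.
template <class List> struct length_unrolled;

template <> struct length_unrolled<nil> {
    static const std::size_t value = 0;
};

template <class H, class T> struct length_unrolled<cons<H, T> > {
    static const std::size_t value = 1 + length_unrolled<T>::value;
};

template <class A, class B, class C, class D, class T>
struct length_unrolled<cons<A, cons<B, cons<C, cons<D, T> > > > > {
    static const std::size_t value = 4 + length_unrolled<T>::value;
};

// Five elements: one four-element step, one single step, then nil.
typedef cons<int, cons<char, cons<long, cons<short, cons<double, nil> > > > > five;

static_assert(length<five>::value == 5, "naive recursion");
static_assert(length_unrolled<five>::value == 5, "unrolled recursion");
```

Whether the shallower recursion actually compiles faster is something
to measure on the compiler in question.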

> As to (2), I know nothing, other than the speculation that "fewer
> instantiations is better".
>
> So, that's where I am.  Help!  :)

You're on the right track.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
