Brian McNamara <[EMAIL PROTECTED]> writes:

> Template libraries, especially those employing expression templates,
> take a long time to compile. As an example, one of the example files
> for FC++ (parser.cpp) takes about 10 minutes to compile on a blazingly
> fast machine with tons of RAM.

Which compiler? Have you seen
http://users.rcn.com/abrahams/instantiation_speed/index.html?

> I would like to reduce the compile-time. I solicit any help/advice on
> the topic

Switching compilers may be your best bet.

> I am hoping some of the Boost contributors will have run into this
> same problem with their own libraries, and have found some ways to
> address it.

Generally speaking, the key is to reduce the number of template
instantiations, but it's also a good idea to eliminate unused
computations... avoid "traits blob" templates since all the nested
definitions need to be evaluated if you want to use just one; use
mpl::apply_if and logical operators and_/or_ to avoid needless
evaluations.

> Here is what I have already figured out.
>
> First off, in FC++, there are a number of templates whose sole
> purpose is to provide better compiler diagnostics (along the same
> general lines as concept_checks). I rewrote the library code so
> that these checks are only enabled when a certain preprocessor flag
> is defined. Turning off these checks reduced the compile-time of
> parser.cpp from 10 minutes to 8 minutes--a significant speedup.
>
> That was the most obvious piece of "low-hanging fruit"; since the
> code to produce the compile-time diagnostics doesn't do anything at
> run-time, it was straightforward to just have a switch to turn it on
> and off.
>
> I imagine there are other things I can do to rewrite some of the
> library templates that are doing "real work" so that they compile
> faster. Specifically, I imagine that some templates can be
> rewritten so that they cause fewer auxiliary templates to be
> instantiated each time the main template gets instantiated.

Good plan. On all but the most-recent EDG compilers the "nestedness" of
symbol names generated may have a significant impact on compile times.

> However there are two issues that make this hard to do:
>
>   (1) Knowing which templates to focus on.
>       That is, which templates are effectively the "inner loops" in
>       the compilation process, and thus deserve the most attention
>       when it comes to optimizing them?

Heh. Welcome to the black box of C++ metaprogramming.

>   (2) Knowing how to rewrite templates to make them faster. I imagine
>       that "fewer templates instantiated" will mean "faster compile
>       times", but I don't actually know this for sure. I have no
>       window into what the compiler is actually doing, to know what
>       takes so long. Maybe it's the template instantiation process;
>       maybe it's all the inlining; maybe it's the code generation for
>       lots of tiny functions. I don't know.

Yep, it's a nasty problem. I suggest some experimentation.

> I have made some headway with (1): the unix utility "nm" lists all the
> symbols compiled into an executable program, and by parsing the output,
> I am able to determine which templates have been instantiated with the
> most number of different types. My little script yields output like
>   ...
>   313 boost::fcpp::lambda_impl::exp::Value
>   314 boost::fcpp::lambda_impl::BracketCallable
>   606 boost::fcpp::lambda_impl::exp::CONS
>   609 boost::fcpp::full1
>   610 boost::intrusive_ptr
>   670 boost::fcpp::lambda_impl::exp::Call
> which tells me that the "Call" template class has been instantiated 670
> different ways in parser.cpp. This at least gives me some idea of
> which classes to focus my optimizing attention on. However a drawback
> of using the "nm" approach is that it only shows templates with
> run-time storage. There are tons of template classes which contain
> nothing but typedefs, and I imagine they're being instantiated lots of
> ways too, and I don't know if this slows stuff down significantly too.

It does; see the link at the top of my reply. Also, if you're using
something called CONS you're probably also using "hand-rolled"
metaprograms. MPL contains some interesting techniques designed to
reduce the stress on compilers (e.g.
compile-time recursion unrolling, lazy evaluation); you might try using
the high-level interface of MPL to see if it improves things.

> As to (2), I know nothing, other than the speculation that "fewer
> instantiations is better".
>
> So, that's where I am. Help! :)

You're on the right track.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
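
To make the "avoid needless evaluations" advice in the reply above a bit more concrete, here is a minimal sketch of lazy branch selection. It is not MPL's code and uses only the standard library (C++11 <type_traits>); the names lazy_if_c, identity, value_type_of and element_type are hypothetical and exist only for this illustration. The idea is that a metafunction passed as a template argument is merely named, and is instantiated only if the selected branch asks for its nested ::type.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical helper written for this sketch (not MPL's implementation):
// picks Then or Else and asks only the chosen one for its nested ::type.
template <bool Condition, class Then, class Else>
struct lazy_if_c {
    typedef typename Then::type type;
};

template <class Then, class Else>
struct lazy_if_c<false, Then, Else> {
    typedef typename Else::type type;
};

// Trivial metafunction that returns its argument unchanged.
template <class T>
struct identity {
    typedef T type;
};

// Stand-in for an "expensive" metafunction. Instantiating it for a type
// without a nested value_type (such as int) is a hard error.
template <class T>
struct value_type_of {
    typedef typename T::value_type type;
};

// Class types yield their value_type; everything else yields itself.
// value_type_of<T> is only named here, so it is instantiated solely
// when the condition is true.
template <class T>
struct element_type
    : lazy_if_c<std::is_class<T>::value, value_type_of<T>, identity<T> > {};

static_assert(std::is_same<element_type<int>::type, int>::value,
              "non-class branch: value_type_of<int> is never instantiated");
static_assert(std::is_same<element_type<std::vector<long> >::type, long>::value,
              "class branch: value_type_of is evaluated on demand");
```

An eager formulation such as std::conditional<std::is_class<T>::value, typename value_type_of<T>::type, T> would evaluate value_type_of<T> for every T, and so would fail to compile for int. The sketch only demonstrates that the unselected branch is skipped; whether that saves measurable compile time in a given library is something to check by experiment.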