On Sun, Feb 19, 2012 at 9:16 AM, David Cournapeau <courn...@gmail.com> wrote:
> On Sun, Feb 19, 2012 at 8:08 AM, Mark Wiebe <mwwi...@gmail.com> wrote:
>> Is there a specific
>> target platform/compiler combination you're thinking of where we can do
>> tests on this? I don't believe the compile times are as bad as many people
>> suspect, can you give some simple examples of things we might do in NumPy
>> you expect to compile slower in C++ vs C?
>
> Switching from gcc to g++ on the same codebase should not change much
> compilation times. We should test, but that's not what worries me.
> What worries me is when we start using C++ specific code, STL and co.
> Today, scipy.sparse.sparsetools takes half of the build time of the
> whole scipy, and it does not even use fancy features. It also takes Gb
> of ram when building in parallel.
I like C++, but it definitely does have issues with compilation times.

IIRC the main problem is very simple: the STL and friends (e.g. Boost) are huge libraries, and because they use templates, the entire source code is in the header files. That means that as soon as you #include a few standard C++ headers, your innocent little source file has suddenly become tens or hundreds of thousands of lines long, and it just takes the compiler a while to churn through megabytes of source code, no matter what it is. (Effectively you recompile some significant fraction of the STL from scratch on every file, and then throw it away.)

Precompiled headers can help some, but they require complex and highly non-portable build-system support. (E.g., gcc's constraints on precompiled headers, such as only one per source file, are listed here: http://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html)

To demonstrate: a trivial hello-world in C using <stdio.h>, versus a trivial version in C++ using <iostream>. On my laptop (gcc 4.5.2), compiling each program 100 times in a loop requires:

    C:                             2.28 CPU seconds
    C compiled with C++ compiler:  4.61 CPU seconds
    C++:                          17.66 CPU seconds

    Slowdown for using g++ instead of gcc:    2.0x
    Slowdown for using C++ standard library:  3.8x
    Total C++ penalty:                        7.7x

Lines of code compiled in each case:

    $ gcc -E hello.c | wc
        855    2039   16934
    $ g++ -E hello.cc | wc
      18569   40994  437954

(I.e., the C++ hello world is almost half a megabyte after preprocessing.)

Of course we won't be using <iostream>, but <vector>, <unordered_map> etc. all have the same basic character.

-- Nathaniel

(Test files attached. Times were from:

    time sh -c 'for i in $(seq 100); do gcc hello.c -o hello-c; done'
    cp hello.c c-hello.cc
    time sh -c 'for i in $(seq 100); do g++ c-hello.cc -o c-hello-cc; done'
    time sh -c 'for i in $(seq 100); do g++ hello.cc -o hello-cc; done'

and then summing the resulting user and system times.)

#include <stdio.h>

int main(int argc, char **argv)
{
    printf("Hello, world!\n");
    return 0;
}

#include <iostream>

int main(int argc, char **argv)
{
    std::cout << "Hello, world!" << std::endl;
    return 0;
}
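
To check the claim about <vector> in the same way, a third test file could look something like the sketch below. It is not one of the original attachments: the file name vec-hello.cc and the function sum_below are made up for illustration. It has no main, so it is meant to be fed to `g++ -E` or `g++ -c` rather than linked into an executable.

```cpp
// vec-hello.cc: hypothetical third test file, not part of the original post.
// It pulls in <vector> instead of <iostream> so the same preprocessing and
// timing commands can be pointed at a container header.
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the vector {0, 1, ..., n-1} and returns the sum of its elements.
// Deliberately trivial: the cost of interest is parsing <vector>, not this code.
std::size_t sum_below(std::size_t n)
{
    std::vector<std::size_t> values;
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(i);
    }
    assert(values.size() == n);  // sanity check on the container

    std::size_t total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}
```

Running `g++ -E vec-hello.cc | wc` gives the preprocessed size, and `g++ -c vec-hello.cc` can be dropped into the same 100-iteration loop. No numbers are quoted for it here, since they depend on the compiler and standard library version.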

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion