------- Comment #2 from jacob at math dot jussieu dot fr  2007-09-30 09:16 -------
Here are some thoughts about why it is so fast with g++-4.2, perhaps related to
why it segfaults.

My library is an expression templates library. When you write m1+m2 with
matrices m1 and m2, it does not compute the sum of the two matrices. Instead it
constructs a new object of type (roughly) Sum<Matrix,Matrix> and passes
references to m1 and m2 to its constructor. When you then write m3=m1+m2, it
(roughly) calls Matrix::operator=(Sum<....>), which calls Sum<...>::read() to
evaluate the entries of the matrix sum.
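
To make the mechanism above concrete, here is a minimal sketch of the idea. It
is not the library's actual code: only the names Matrix, Sum and read() come
from the description above, while the signature of read(), the storage layout
and the constructor arguments are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical expression node: it only stores references to its operands
// and computes an entry of the sum when read() is called.
template <typename Lhs, typename Rhs>
class Sum {
public:
    Sum(const Lhs& lhs, const Rhs& rhs) : lhs_(lhs), rhs_(rhs) {}

    double read(std::size_t row, std::size_t col) const {
        return lhs_.read(row, col) + rhs_.read(row, col);
    }

private:
    const Lhs& lhs_;  // references, not copies
    const Rhs& rhs_;
};

// Hypothetical dense matrix with row-major storage (an assumption).
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols, double fill = 0.0)
        : rows_(rows), cols_(cols), data_(rows * cols, fill) {}

    double read(std::size_t row, std::size_t col) const {
        return data_[row * cols_ + col];
    }

    // Assigning from an expression evaluates it entry by entry.
    // Both operands are assumed to have the same dimensions as *this.
    template <typename Lhs, typename Rhs>
    Matrix& operator=(const Sum<Lhs, Rhs>& expr) {
        for (std::size_t r = 0; r < rows_; ++r)
            for (std::size_t c = 0; c < cols_; ++c)
                data_[r * cols_ + c] = expr.read(r, c);
        return *this;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// m1 + m2 builds a lightweight Sum object instead of a new Matrix.
inline Sum<Matrix, Matrix> operator+(const Matrix& a, const Matrix& b) {
    return Sum<Matrix, Matrix>(a, b);
}
```

With this sketch, m3 = m1 + m2 creates one temporary Sum<Matrix,Matrix> that
holds two references, and the additions happen inside the assignment loop.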

It is very important that the compiler be clever enough to understand that the
objects of type Sum<...> are short-lived temporaries, so that it can optimize
them away and emit no code for them in the final binary.

g++ 4.1 didn't understand that, so it produced slow code. g++ 4.2 understands
that, so it optimizes accordingly. That would explain why 4.2 produces 4x
faster code in my benchmarks. But I am afraid that I might be hitting a bug in
this optimization.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33599
