Hmm, I think the compiler is simply removing the inner loop when I don't 
use tmp_3, since its result is then never read. So I guess 3.8s is the 
correct time, which still feels a bit slow.
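
For what it's worth, here is a minimal sketch in C of one common way to keep an optimizer from discarding a loop like this while timing it. Only the name tmp_3 comes from my code above; inner_loop, run_once, sink and the loop body are made up for illustration. The value is stored into a volatile object, so the compiler has to produce it, though it may still simplify how it gets there.

```c
/* Volatile sink: a store to a volatile object is an observable side
   effect, so the stored value cannot simply be dropped. */
static volatile long sink;

/* Hypothetical stand-in for the benchmark's inner loop. */
static long inner_loop(long n) {
    long tmp_3 = 0;
    for (long i = 0; i < n; i++) {
        tmp_3 += i % 7;   /* placeholder work, assumed for illustration */
    }
    return tmp_3;
}

/* Call this inside the timed region instead of calling inner_loop
   directly and ignoring its result. */
static long run_once(long n) {
    sink = inner_loop(n);
    return sink;
}
```

Passing n in from a runtime value (a command-line argument, say) rather than a constant also makes it harder for the compiler to fold the whole computation away at compile time.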
