At 18:46 02/01/2008, willem wrote:
Vincent has made an improved Mandelbrot benchmark.
I ran the original Mandelbrot benchmark with N = 5000. It took 2:20
minutes.
The improved version took 2:00 minutes.
If you look at the gcc version, you'll see that it relies on SSE2
instructions to perform two double computations with one instruction:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=gcc&id=3
"
Uses SSE packed doubles to run the inner loop computations in parallel.
I don't have a machine with SSE to test with, but the assembly looks
pretty nice. With gcc-3.4.2 there's no difference in the assembly
between -msse2 and -msse3, YMMV. It uses gcc's vector extensions
( http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gcc/Vector-Extensions.html ),
so it will run (slowly) on hardware without SSE.
"
That is why the gcc version is about twice as fast: each SSE2 instruction
works on two doubles at once. "-funroll-loops" may also help a bit,
because it is very aggressive and can eliminate some loop variables...
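
To illustrate what "packed doubles" means in practice, here is a minimal
sketch in C using gcc's vector extensions. It is not the shootout program:
the type name v2d and the function mandel_pair are made up for this example,
and it assumes a gcc (or clang) recent enough to allow subscripting of
vector types.

```c
#include <assert.h>

/* Two doubles packed into one 16-byte vector (gcc vector extension). */
typedef double v2d __attribute__((vector_size(16)));

/*
 * Hypothetical helper: iterate z = z*z + c for two points at once.
 * The two points share the imaginary part ci and have real parts
 * cr0 and cr1. out[k] receives the first iteration index at which
 * |z|^2 > 4 in lane k, or max_iter if that lane never escaped.
 */
void mandel_pair(double cr0, double cr1, double ci, int max_iter, int out[2])
{
    assert(max_iter > 0);

    v2d cr  = {cr0, cr1};
    v2d civ = {ci, ci};
    v2d two = {2.0, 2.0};
    v2d zr  = {0.0, 0.0};
    v2d zi  = {0.0, 0.0};

    out[0] = out[1] = max_iter;

    for (int i = 0; i < max_iter; i++) {
        /* Each of these lines operates on both lanes at once. */
        v2d zr2 = zr * zr;
        v2d zi2 = zi * zi;
        v2d mag = zr2 + zi2;

        /* Record the first escape of each lane separately. */
        if (out[0] == max_iter && mag[0] > 4.0) out[0] = i;
        if (out[1] == max_iter && mag[1] > 4.0) out[1] = i;
        if (out[0] != max_iter && out[1] != max_iter) break;

        zi = two * zr * zi + civ;
        zr = zr2 - zi2 + cr;
    }
}
```

Calling mandel_pair(0.0, 2.0, 0.0, 50, out) leaves out[0] at 50 (c = 0
never escapes) and sets out[1] to 2. Built with something like -O2 -msse2,
gcc should be able to turn each vector expression into a single packed
instruction; without SSE the same source still compiles, just to slower
scalar code.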
I think that the only "easy" way for fpc to be competitive in cases like
these is to have explicit types for MMX and SSEn data and the "functions"
to operate with them, like gcc does...
And no, I'm not the one who can send those patches :(
Paulo Costa