At 18:46 02/01/2008, willem wrote:
Vincent has made an improved Mandelbrot benchmark.

I ran the original Mandelbrot benchmark with N = 5000. It took 2:20 minutes.

The improved version took 2:00 minutes.


If you look at the gcc version, you'll see that it relies on SSE2 instructions to perform two double-precision computations with one instruction:

http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=gcc&id=3
"
 Uses SSE packed doubles to run the inner loop computations in parallel.
  I don't have a machine with SSE to test with, but the assembly looks
  pretty nice.  With gcc-3.4.2 there's no difference in the assembly
  between -msse2 and -msse3, YMMV.  It uses gcc's vector extentions
  ( http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gcc/Vector-Extensions.html ),
  so it will run (slowly) on hardware without SSE.
"
That is why it is twice as fast. The "-funroll-loops" option may also help a bit, because it is very aggressive and can eliminate some loop variables...
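
To make the "two doubles per instruction" idea concrete, here is a minimal sketch of my own using gcc's vector extensions. It is not the code from the shootout entry. The names v2df_lanes and mandel_iters2 are made up for illustration, and the union is just one way to read the individual lanes. It iterates z = z*z + c for two points on the same row at once and records the iteration at which each point escapes (or max_iter if it never does).

```c
#include <assert.h>

/* Two doubles packed into one 16-byte vector (gcc vector extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical helper: a union so that single lanes can be read and written. */
typedef union {
    v2df   v;
    double d[2];
} v2df_lanes;

/* Iterate z = z*z + c for two points (cr0, ci_both) and (cr1, ci_both).
   out[k] receives the first iteration at which |z|^2 > 4 for point k,
   or max_iter if that point never escapes within max_iter iterations. */
void mandel_iters2(double cr0, double cr1, double ci_both,
                   int max_iter, int out[2])
{
    v2df_lanes cr, ci, zr, zi, mag;
    v2df r2, i2, ri;
    int n, k;

    assert(max_iter >= 0);

    cr.d[0] = cr0;     cr.d[1] = cr1;
    ci.d[0] = ci_both; ci.d[1] = ci_both;
    zr.d[0] = zr.d[1] = 0.0;
    zi.d[0] = zi.d[1] = 0.0;
    out[0] = out[1] = max_iter;

    for (n = 0; n < max_iter; n++) {
        /* Each vector operation below works on both points at once. */
        r2 = zr.v * zr.v;
        i2 = zi.v * zi.v;
        mag.v = r2 + i2;

        /* The escape test is done per lane in plain scalar code. */
        for (k = 0; k < 2; k++)
            if (out[k] == max_iter && mag.d[k] > 4.0)
                out[k] = n;
        if (out[0] != max_iter && out[1] != max_iter)
            break;

        ri   = zr.v * zi.v;
        zi.v = ri + ri + ci.v;      /* imaginary part: 2*zr*zi + ci */
        zr.v = r2 - i2 + cr.v;      /* real part: zr^2 - zi^2 + cr  */
    }
}
```

Built with -msse2, gcc can map the vector arithmetic onto packed-double instructions. Without SSE it should still compile, but the operations are done piecewise, which matches the "will run (slowly)" remark in the quoted comment.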

I think that the only "easy" way for fpc to be competitive in cases like these is to have explicit types for MMX and SSEn data, plus the "functions" to operate on them, like gcc does...
And no, I'm not the one who can send those patches :(


Paulo Costa

