Re[2]: GHC vs. GCC on raw vector addition

2006-01-19 Thread Bulat Ziganshin
Hello John, Thursday, January 19, 2006, 4:42:47 AM, you wrote: sorry, with the gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops the C version is 50 times faster than best Haskell one... it's the loop from C version: JM I believe something similar to what I noted here is the culprit: JM

Re: GHC vs. GCC on raw vector addition

2006-01-19 Thread Simon Marlow
John Meacham wrote: On Wed, Jan 18, 2006 at 06:18:29PM +0300, Bulat Ziganshin wrote: :) even the C version performs only 20 million additions per second because this program is mostly limited by memory throughput - it accesses 24 bytes of memory per addition. GHC just can't produce simple

Re: GHC vs. GCC on raw vector addition

2006-01-19 Thread Simon Marlow
John Meacham wrote: On Wed, Jan 18, 2006 at 08:54:43PM +0300, Bulat Ziganshin wrote: sorry, with the gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops the C version is 50 times faster than best Haskell one... it's the loop from C version: I believe something similar to what I noted here is

GHC vs. GCC on raw vector addition

2006-01-18 Thread Sven Moritz Hallberg
Hi List, I'm running GHC and GCC head-to-head on the task of adding a bunch of long IOUArray-Vectors really fast. My machine is a Linux-ppc PowerBook and gets a runtime for the GHC-compiled binary that's about 10x as long as for GCC. Simon M. tells me this should be much better. Here are the

Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread Malcolm Wallace
Sven Moritz Hallberg writes: I'm running GHC and GCC head-to-head on the task of adding a bunch of long IOUArray-Vectors really fast. My machine is a Linux-ppc PowerBook and gets a runtime for the GHC-compiled binary that's about 10x as long as for GCC. Is it possible that

Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Sven, Wednesday, January 18, 2006, 3:33:40 PM, you wrote: SMH and gets a runtime for the GHC-compiled binary that's about 10x as long SMH as for GCC. Simon M. tells me this should be much better. Here are the attached version is only 5 times slower :) please note that 1)

Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread Simon Marlow
Bulat Ziganshin wrote: Wednesday, January 18, 2006, 3:33:40 PM, you wrote: SMH and gets a runtime for the GHC-compiled binary that's about 10x as long SMH as for GCC. Simon M. tells me this should be much better. Here are the attached version is only 5 times slower :) please note that 1)

Re[2]: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Malcolm, Wednesday, January 18, 2006, 4:22:23 PM, you wrote: I'm running GHC and GCC head-to-head on the task of adding a bunch of long IOUArray-Vectors really fast. My machine is a Linux-ppc PowerBook and gets a runtime for the GHC-compiled binary that's about 10x as long as for GCC.

Re[2]: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Simon, Wednesday, January 18, 2006, 5:31:25 PM, you wrote: 2) generating random values takes about 1.5-2 seconds by itself. Haskell's RNG is very different from C's SM I squeezed a bit more out (see attached). x `seq` v `seq` return () it's a new trick for me :) now the

Re[3]: GHC vs. GCC on raw vector addition

2006-01-18 Thread Bulat Ziganshin
Hello Bulat, Wednesday, January 18, 2006, 8:34:54 PM, you wrote: BZ the only reason this code is only 3 times slower is that the C version BZ is really limited by memory speed. when tested on 1000-element BZ arrays, it is 20 times slower. i have not yet tried SSE optimization for BZ gcc ;) sorry,

Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread John Meacham
On Wed, Jan 18, 2006 at 06:18:29PM +0300, Bulat Ziganshin wrote: :) even the C version performs only 20 million additions per second because this program is mostly limited by memory throughput - it accesses 24 bytes of memory per addition. GHC just can't produce simple loops even for
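
[Editor's sketch, not code from the thread.] The figure of 24 bytes per addition quoted above would be consistent with an element-wise sum of 8-byte values: two loads and one store per element. At the quoted 20 million additions per second that would be roughly 480 MB/s of memory traffic. The element type (double) and the names vec_add and bytes_per_addition are assumptions made only for illustration; the actual C loop posted to the list is not reproduced in this index.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): element-wise c[i] = a[i] + b[i].
 * Assumes the vectors hold 8-byte doubles, which is a guess that matches
 * the "24 bytes per addition" figure quoted in the message above. */
void vec_add(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Each iteration loads a[i] and b[i] and stores c[i]. */
        c[i] = a[i] + b[i];
    }
}

/* Memory traffic per addition under the same assumption:
 * two loads plus one store, each sizeof(double) bytes. */
size_t bytes_per_addition(void)
{
    return 3 * sizeof(double);
}
```

With arrays much larger than the cache, a loop of this shape tends to be bound by memory bandwidth rather than arithmetic, which may be why the gap between the two compilers looks smaller on long vectors than on 1000-element ones.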

Re: GHC vs. GCC on raw vector addition

2006-01-18 Thread John Meacham
On Wed, Jan 18, 2006 at 08:54:43PM +0300, Bulat Ziganshin wrote: sorry, with the gcc -O3 -ffast-math -fstrict-aliasing -funroll-loops the C version is 50 times faster than best Haskell one... it's the loop from C version: I believe something similar to what I noted here is the culprit: