On Tue, Aug 21, 2007 at 01:14:20PM +0100, Rodrigo Queiro wrote:
> On my system, the C version runs about 9x faster than the haskell
> version (with -O3 and -O2 -fvia-c -optc-O3 respectively). However, GCC
> seems to produce about 70 lines of assembly for the main loop,
> compared to about 10 from GHC. I suspect the speed difference is the
> result of some heavy optimisation by GCC, which would need to be
> hand-tuned for GHC. (I would be interested to see what this would be.
> Unfortunately I don't know x86 assembly well enough to understand the
> GCC output.)

The fundamental problem is that GHC doesn't have enough registers to to
a good job with Haskell.  Normal Haskell code  makes extensive use of
the GHC stack for function calls, the C stack for signal handlers, the
capability base pointer for thread state, and the heap for everything
else.  Which doesn't really leave us in a good state for optimizing.  In
particular, x86 ghc ALWAYS passes parameters on the stack, even for tail
calls.  I didn't actually bother to check, but I'm pretty sure that's
what the OP was noticing - if you look carefully it's not actually
pushing or popping anything, just using stack memory.

Situations are far better on x86_64 (16 registers) and ppc (32
registers).  There is some work being done on the backend to improve
this (in particular, a new and much better register allocator and a
parameter-aware Cmm system).

Stefan

Attachment: signature.asc
Description: Digital signature

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to