[Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-11 Thread Lennart Augustsson
OK, so there are few extra costs for unknown tail calls. First, there might be a pipeline stall if the jump address isn't loaded into a register far enough in advance before the jump. Second, you need to pass information about the number of arguments in the caller. And finally, you need to

[Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-11 Thread Simon Marlow
Lennart Augustsson wrote: So then tail calls should be very cheap when most of the arguments don't change. Yes, but the problem tends to be the arguments that change, and the fact that they are passed on the stack. A C loop would keep the loop-carried variables in registers. On x86_64 you

[Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-10 Thread Simon Marlow
Lennart Augustsson wrote: It's not that hard to figure out an order to permute the arguments on the stack before a tail call that minimizes that number of moves and temporary locations. Lmlc did this 20 years ago. :) Right, and that's what GHC does too, with a strongly-connected-component

[Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-10 Thread Lennart Augustsson
So then tail calls should be very cheap when most of the arguments don't change. On Apr 10, 2007, at 10:17 , Simon Marlow wrote: Lennart Augustsson wrote: It's not that hard to figure out an order to permute the arguments on the stack before a tail call that minimizes that number of

[Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-10 Thread Stefan O'Rear
On Tue, Apr 10, 2007 at 07:59:04PM +0100, Lennart Augustsson wrote: So then tail calls should be very cheap when most of the arguments don't change. On Apr 10, 2007, at 10:17 , Simon Marlow wrote: Lennart Augustsson wrote: It's not that hard to figure out an order to permute the

Re: [Haskell-cafe] RE: What I learned from my first serious attempt low-level Haskell programming

2007-04-08 Thread Thomas Conway
| 6. The inliner is a bit too greedy. Removing the slow-path code from |singleton doesn't help because popSingleton is only used once; but |if I explicitly {-# NOINLINE popSingleton #-}, the code for |singleton itself becomes much smaller, and inlinable (15% perf |gain). Plus

[Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-05 Thread Simon Marlow
Stefan O'Rear wrote: 2. Parameters are very expensive. Our type of functions that build (ignoring CPS for the time being) was MBA# - Int# - [ByteString], where the Int# is the current write pointer. Adding an extra Int# to cache the size of the array (rather than calling sMBA# each

[Haskell-cafe] RE: What I learned from my first serious attempt low-level Haskell programming

2007-04-05 Thread Simon Peyton-Jones
| 5. State# threads clog the optimizer quite effectively. Replacing |st(n-1)# with realWorld# everywhere I could count on data |dependencies to do the same job doubled performance. The idea is that the optimiser should allow you to write at a high level, and do the book keeping for you.

Re: [Haskell-cafe] Re: What I learned from my first serious attempt low-level Haskell programming

2007-04-05 Thread Lennart Augustsson
It's not that hard to figure out an order to permute the arguments on the stack before a tail call that minimizes that number of moves and temporary locations. Lmlc did this 20 years ago. :) -- Lennart On Apr 5, 2007, at 19:17 , Claus Reinke wrote: Stefan O'Rear wrote: 2.