I'm curious how much of the unboxing helped performance and how much
didn't. In my experience playing with this stuff, GHC's strictness
analyzer has consistently been really excellent, given the right
hints. Unboxed tuples are generally a win, but GHC was often smarter
at unboxing ints than I was. Also, higher-order functions can be used
fine, assuming they're not awful recursive, as long as you set GHC's
unfolding threshold high enough. You can also manually force their
inlining, to really get control. I found there tended to be a "sweet
spot" for any given program, where much higher or lower would degrade
performance. As far as removing the word2int# and soforth, you could
probably just use words throughout?
Also, as we discovered the other week, you may want to be careful
with bang patterns, as forcing strictness on already evaluated values
takes some time. Although, SPJ did point out that this was
significantly improved in the new GHC.
But yes, I found that going through and manually unboxing a working
implementation with the help of type errors was actually a sort of
relaxing and zenlike exercise.
--S
On Dec 20, 2007, at 6:07 PM, Justin Bailey wrote:
I'm back with another version of my cellular automata simulator.
Since the last iteration I discovered GHC's unlifted types and the
primitive operations that go with them. Using these types, rather
than Ints, sped my code up by 2x or more.
http://hpaste.org/4151#a2 -- first half of program
http://hpaste.org/4151#a3 -- remaining (note 3 lines or so from
first half are repeated)
The key observation came from looking at the Core files, which
showed a lot of int2word# and word2int# conversions going on.
Figuring out how to remove those led me to the unlifted types.
Coding with these types is not easy (for example, I couldn't see a
way to write a Word# value directly - I had to write stuff like
"int2Word# 1#"), but because I had an existing algorithm to guide
me, combined with type checking, it was a fairly straightforward
implementation.
At first I felt kind of bad about using these operations, but then
I noticed they are used pretty heavily by GHC itself. If it's good
enough for GHC, it's good enough for me. The 2x performance gain
didn't hurt either. Finally, the safety that comes from using the
ST monad is just awesome. I've got low-level bit munging combined
with high-level abstraction where I need it. So cool!
I was disappointed to find early on that using higher-order
functions in tight loops was a performance killer. It's unfortunate
because I started with a very elegant, short implementation based
on a simple Ring buffer and map. The current version is certainly
harder to understand and has some weird limitations. However,
having the simple implementation let me use quickcheck to compare
their results on random rules and inputs, which gave me high
confidence that my complex implemenation is correct.
One thing I absolutely love about this program is its memory
performance. It manages to stay within 1 - 10 MB of memory,
depending on how much output is produced. How cool is that?
On Dec 3, 2007 2:44 AM, Mirko Rahn <[EMAIL PROTECTED] > wrote:
It is interesting, that the naive implementation
...
is only 3 times slower than your quite complex, hard to follow and
hard
to debug implementation.
Now the naive implementation is 100x slower, so I don't feel so bad
about this comment any more.
As always, I prefer to write most code in Haskell, quick, easy, nice,
reasonable fast, ... If speed matters, I switch to some lower level
language, as you did staying inside Haskell.
I have to take exception here - I *want* to write my code in
Haskell. If Haskell isn't fast enough to beat the C implementation,
I'd rather find a way to make my Haskell program faster than switch
to some other language.
Justin
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe