Don Stewart wrote:
jeff.polakow:
Hello,
One frequent criticism of Haskell (and by extension GHC) is that it has
unpredictable performance and memory consumption. I personally do not find
this to be the case. I suspect that most programmer confusion is rooted in
shaky knowledge of lazy evaluation; and I have been able to fix, with
relative ease, the various performance problems I've run into. However I
am not doing any sort of performance critical computing (I care about
minutes or seconds, but not about milliseconds).
I would like to know what others think about this. Is GHC predictable? Is
a thorough knowledge of lazy evaluation good enough to write efficient
(whatever that means to you) code? Or is intimate knowledge of GHC's
innards necessary?
thanks,
Jeff
PS I am conflating Haskell and GHC because I use GHC (with its extensions)
and it produces (to my knowledge) the fastest code.
This has been my experience to. I'm not even sure where
"unpredicatiblity" would even come in, other than though not
understanding the demand patterns of the code.
It's relatively easy to look at the Core to get a precise understanding
of the runtime behaviour.
I've also not found the GC unpredicatble either.
I offer up the following example:
mean xs = sum xs / length xs
Now try, say, "mean [1.. 1e9]", and watch GHC eat several GB of RAM. (!!)
If we now rearrange this to
mean = (\(s,n) -> s / n) . foldr (\x (s,n) -> let s' = s+x; n' = n+1
in s' `seq` n' `seq` (s', n')) (0,0)
and run the same example, and watch it run in constant space.
Of course, the first version is clearly readable, while the second one
is almost utterly incomprehensible, especially to a beginner. (It's even
more fun that you need all those seq calls in there to make it work
properly.)
The sad fact is that if you just write something in Haskell in a nice,
declarative style, then roughly 20% of the time you get good
performance, and 80% of the time you get laughably poor performance. For
example, I sat down and spent the best part of a day writing an MD5
implementation. Eventually I got it so that all the test vectors work
right. (Stupid little-endian nonsense... mutter mutter...) When I tried
it on a file containing more than 1 MB of data... ooooohhhh dear... I
gave up after waiting several minutes for an operation that the C
implementation can do in milliseconds. I'm sure there's some way of
fixing this, but... the source code is pretty damn large, and very messy
as it is. I shudder to think what you'd need to do to it to speed it up.
Of course, the first step in any serious attempt at performance
improvement is to actually profile the code to figure out where the time
is being spent. Laziness is *not* your friend here. I've more or less
given up trying to comprehend the numbers I get back from the GHC
profiles, because they apparently defy logic. I'm sure there's a reason
to the madness somewhere, but... for nontrivial programs, it's just too
hard to figure out what's going on.
Probably the best part is that almost any nontrivial program you write
spends 60% or more of its time doing GC rather than actual work. Good
luck with the heap profiler. It's even more mysterious than the time
profiles. ;-)
In short, as a fairly new Haskell programmer, I find it completely
impossibly to write code that doesn't crawl along at a snail's pace.
Even when I manage to make it faster, I usually have no clue why. (E.g.,
adding a seq to a mergesort made it 10x faster. Why? Changing from
strict ByteString to lazy ByteString made one program 100x faster. Why?)
Note that I'm not *blaming* GHC for this - I think it's just inherantly
very hard to predict performance in a lazy language. (Remember,
deterministic isn't the same as predictable - see Chaos Theory for why.)
I wish it wasn't - becuase I really *really* want to write all my
complex compute-bounded programs in Haskell, because it makes algorithms
so beautifully easy to express. But when you're trying to implement
something that takes hours to run even in C...
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe