Arjan,

I supervise a student who uses Haskell for simulating neural nets. A lot of
computation goes on there and the program cannot handle as many iterations
as we would like. We have improved performance by a factor of more than ten
by putting strictness annotations in all data types. But the problem is that
there are lists within these data structures and they are not strict. By
using a trick I have now found that making the whole program state strict
the program gets a much better looking heap profile. The trick I use is by
writing a function hyperseq and then applying it before the next iterations
begins:

hyperseq x y = if x==x then y else error "this is very unlikely"

This is very expensive and I chose to apply the function only once every 100
iterations. This gives reasonable performance.

...
1) Is there a more efficient definition of hyperseq that does not traverse
the data structure twice? The "show" function traverses the structure once
but I found it to be much slower.


I hit a similar problem with my very first Haskell application! As you say, the brute solution of comparing structures for equality with themselves is very expensive, and may be wrong if some components really should be evaluated lazily. My solution was to introduce a single-method class

class Norm a where normal :: a -> Bool

The intention is that 'normal' functions are defined in such a way that the result of an application 'normal x' is always true, but computing this truth guarantees that x is evaluated to at least a desired minimal extent. For example, supppose we have

data Tree a b = Branch (Tree a) a (Tree a) | Leaf b

we might then define

instance Norm (Tree a b) where
 normal  (Leaf _) = True
 normal  (Branch _ _ _) = True

hiding the Leaf and Branch constructors and providing instead

leaf :: Norm b => b  -> Tree a b
leaf x | normal x = Leaf x

branch :: Tree a b -> b -> Tree a b -> Tree a b
branch lt x rt | normal lt && normal rt = Branch lt x rt

we obtain trees that are path-strict in the branch and leaf structure, do not interfere at all with the lazy evaluation of internal labels at branches but ensure 'normal' evaluation of leaf labels.

Of course there are many other possible definitions of 'normal' for trees. All sorts of clever tricks could be programmed into the normalising evaluations. But there should be no need for repeated traversals (let alone repeated double traversals, using ==) of already-evaluated structure.

With a suitable instance of Norm, your hyperseq function can be redefined

hyperseq :: Norm a => a -> b -> b
hyperseq x y | normal x = y

Regards
Colin R

Reference: Colin Runciman, "Tip in Haskell -- another exercise in functional programming", pp 278-292 in Proc. 1991 Glasgow Workshop on Functional Programming, Springer-Verlag, 1992.

_______________________________________________
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell

Reply via email to