On Wed, May 14, 2008 at 06:08:28PM +0100, Paul Johnson wrote: > We've had a big debate over > > > mean xs = foldl' (+) 0 xs / fromIntegral (length xs) > > For anyone who didn't follow it, the problem is that "mean" needs to > traverse its argument twice, so the entire list has to be held in > memory. So if "xs = [1..1000000000]" then "mean xs" uses all your > memory, but "foldl' (+) 0 xs" and "length xs" by themselves do not. > > The solution is for the programmer to rewrite "mean" to accumulate a > pair containing the running total and count together, then do the > division. This makes me wonder: could there be a compiler optimisation > rule for this, collapsing two iterations over a list into one. I've > never tried writing GHC rules, but something like: > > > f (foldl' g x) (foldl' h x) = (uncurry f) (foldl' (\i (gt,ht) -> (g i > gt, h i ht))) > > The first problem that occurs to me is the number of permutations of > fold and map functions. > > Paul.
There are two problems with this rule. The first is that the function is not strict in 'gt' and 'ht' -- this is easily fixed with a little bit of seq. The other problem is that 'f' must be strict in both parameters for this rule to hold. As far as I know, there is no access to strictness information in rule pragmas. Cheers, Spencer Janssen _______________________________________________ Haskell-Cafe mailing list [email protected] http://www.haskell.org/mailman/listinfo/haskell-cafe
