[Haskell-cafe] Re: [Haskell] simple function: stack overflow in hugs vs none in ghc

apfelmus Mon, 24 Sep 2007 03:32:42 -0700

john lask wrote:

test1 = readFile "big.dat" >>= (\x->print $ parse x)
test2 = readFile "big.dat" >>= (\x->print $ fst $ parse x)


test1 (on a large file) will succeed in ghc but fail in hugs

test2 on same file will succeed in both ghc and hugs

big.dat is just some large data file say 1MB.
(not particularly large by today's standards!)

The question: are there any changes that can be made to the code to
make test1 work in hugs without changing the essence of the function?

parse x = sqnc item x
  where

    item =( \ ts -> case ts of
                   [] -> ( Nothing, [])
                   ts -> ( Just (head ts), tail ts) )

    sqnc p ts =
       let ( r, ts' ) = p ts in case r of
            Nothing -> ([],ts')
            Just x -> let (r',ts'') = (sqnc p ts')  in ( x:r', ts'' )


Strange, this shouldn't happen :) You may want to try

  item []     = (Nothing, [])
  item (t:ts) = (Just t , ts)

but that shouldn't help ;)
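To make the "shouldn't help" claim checkable, here is a minimal sketch that puts the two definitions of item side by side. The names itemCase, itemPat and agree are made up for this illustration and are not part of the original code.

```haskell
-- The lambda/case version, as in the quoted code.
itemCase :: [a] -> (Maybe a, [a])
itemCase = \ts -> case ts of
  [] -> (Nothing, [])
  _  -> (Just (head ts), tail ts)

-- The pattern-matching version suggested above.
itemPat :: [a] -> (Maybe a, [a])
itemPat []     = (Nothing, [])
itemPat (t:ts) = (Just t, ts)

-- Hypothetical helper: do both versions return the same pair for an input?
agree :: Eq a => [a] -> Bool
agree xs = itemCase xs == itemPat xs
```

Both versions return the same values; the only difference is that itemCase leaves head ts and tail ts as unevaluated thunks inside the pair, while itemPat binds the components directly.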

Let's try to find out what's going on by doing graph reduction with our bare hands. The preliminary material on

  http://en.wikibooks.org/wiki/Haskell/Graph_reduction

should help a bit. Ideally, there would be tool support (hat? other debugger?) but when things become too complicated, tools can only keep you a few minutes longer above the water before drowning in complexity, too.

The main point is that (print $ parse x) and (print $ fst $ parse x) differ in that the latter only computes the answer but not the remaining tokens. So, the stack overflow is triggered when evaluating the remaining tokens, but I don't see why. What happens for (print $ snd $ parse x) ?
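One way to run that experiment without a data file is sketched below. The parser is the quoted one; input, both, onlyFst and onlySnd are hypothetical names introduced here, and the in-memory string is an assumed stand-in for the contents of big.dat.

```haskell
-- The parser as quoted at the top of this message.
parse :: [a] -> ([a], [a])
parse x = sqnc item x
  where
    item ts = case ts of
      [] -> (Nothing, [])
      _  -> (Just (head ts), tail ts)

    sqnc p ts =
      let (r, ts') = p ts in case r of
        Nothing -> ([], ts')
        Just y  -> let (r', ts'') = sqnc p ts' in (y : r', ts'')

-- Hypothetical stand-in for the file contents: n copies of 'x'.
input :: Int -> String
input n = replicate n 'x'

-- The three observations discussed above, rendered with show.
both, onlyFst, onlySnd :: Int -> String
both    n = show (parse (input n))        -- like test1
onlyFst n = show (fst (parse (input n)))  -- like test2
onlySnd n = show (snd (parse (input n)))  -- the snd variant
```

Loading this into an interpreter and evaluating putStrLn (onlySnd 1000000) would show whether the second component alone is enough to trigger the overflow; what Hugs actually does with it is not something this sketch can tell.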


Let's rewrite your code to figure out what's going on

  item [] = (Nothing, [])
  item ts = (Just (head ts), tail ts)

For  sqnc , we need to translate stuff like  let (a,b) = e in  . Let-bound patterns aren't explained in the wikibook and in fact they're tricky. When done wrong, there may be space leaks, see also


 J. Sparud. Fixing Some Space Leaks without a Garbage Collector.
 http://citeseer.ist.psu.edu/sparud93fixing.html

I don't know whether it's implemented in Hugs (probably not?) and GHC (probably, but maybe with bugs?). We'll use the not-so-good translation


  let (a,b) = e  in e'
<=>
  let x = e; a = fst x; b = snd x; in e'
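As a small sanity check of this translation, the sketch below writes one expression both ways and also shows that a let-bound pattern is lazy: the bound expression is not inspected unless a or b is used. All four names are invented for the example.

```haskell
-- A let-bound tuple pattern.
viaPattern :: (Int, Int) -> Int
viaPattern e = let (a, b) = e in a + b

-- The same expression after the translation above.
viaProjections :: (Int, Int) -> Int
viaProjections e = let x = e; a = fst x; b = snd x in a + b

-- Neither form touches the bound expression when the variables are unused,
-- so binding undefined is harmless here.
ignoresPattern, ignoresProjections :: Int
ignoresPattern     = let (_a, _b) = (undefined :: (Int, Int)) in 5
ignoresProjections =
  let x = (undefined :: (Int, Int)); _a = fst x; _b = snd x in 5
```

The two forms agree on their values; where they can differ is in how long the thunk for e stays reachable, which is the space-leak issue the Sparud paper is about.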


I'd like to call  sqnc  differently, namely  many . We get

  many p ts =
    let z   = p ts
        r   = fst z
        ts' = snd z
    in case r of
      Nothing -> ([], ts')
      Just x  ->
        let z'  = many p ts'
            r'  = fst z'
            ts''= snd z'
        in (x:r', ts'')
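For anyone who wants to load this into an interpreter, here is the same definition with type signatures added. The signatures are my reading of the code, and digit is a hypothetical second parser, included only to show many stopping before the end of the input.

```haskell
item :: [a] -> (Maybe a, [a])
item [] = (Nothing, [])
item ts = (Just (head ts), tail ts)

-- Hypothetical parser: accepts one decimal digit, otherwise consumes nothing.
digit :: String -> (Maybe Char, String)
digit (c:cs) | c `elem` ['0' .. '9'] = (Just c, cs)
digit cs                             = (Nothing, cs)

-- many, with the let-bound patterns replaced by fst/snd projections.
many :: ([t] -> (Maybe a, [t])) -> [t] -> ([a], [t])
many p ts =
  let z   = p ts
      r   = fst z
      ts' = snd z
  in case r of
       Nothing -> ([], ts')
       Just x  ->
         let z'   = many p ts'
             r'   = fst z'
             ts'' = snd z'
         in (x : r', ts'')
```

With these definitions, many item [1,2,3] gives ([1,2,3],[]) and many digit "12ab" gives ("12","ab").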

Intimidating, no? :) Now, let's evaluate an example expression, like

 many item (1:2:3:...)

(the list is intended to be finite, but we'll decide later about its length). To preserve space and stay sane, we'll only focus on the things that get evaluated and write ... for the rest. Let's start:


 many item (1:2:3:...)
 => let ts = 1:2:3:... in
      let ... z = item ts; r = fst z; ... in  case r of ...
 => let ... z = (Just (head ts), tail ts); r = fst z ...
 => let ... z = (r, tail ts); r = Just (head ts) ... in  case r of

The above step is not clear from the description in the wikibook, but it's a handy notation of saying that the first component and r point to the same thing. Expanding the case expression yields (in full form)


 => let ts = 1:2:3: ... in
    let z  = (r, tail ts)
        r  = Just x
        x  = head ts
        ts'= snd z
    in
      let z'  = many item ts'
          r'  = fst z'
          ts''= snd z'
      in (x:r', ts'')

This is the weak head normal form of our expression. Of course, we wanted print (many item ts) = putStrLn (show ...) which means evaluating the first component and then the second component in the pair to full normal form. So, the next redex to be reduced is x followed by r' which forces z' which at least forces ts'.
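The difference between weak head normal form and full normal form can be observed directly. The sketch below uses a compact variant of many (a case on the pair instead of the fst/snd translation, so not literally the code being reduced) and a hypothetical input whose tail is undefined.

```haskell
item :: [a] -> (Maybe a, [a])
item []     = (Nothing, [])
item (t:ts) = (Just t, ts)

-- Compact variant of many, used only for this illustration.
many :: ([t] -> (Maybe a, [t])) -> [t] -> ([a], [t])
many p ts = case p ts of
  (Nothing, ts') -> ([], ts')
  (Just x,  ts') -> let z = many p ts' in (x : fst z, snd z)

-- Hypothetical input: the tail blows up if anything ever inspects it.
poisoned :: [Int]
poisoned = 1 : undefined

-- seq evaluates only to weak head normal form, i.e. the outer pair.
whnfOnly :: ()
whnfOnly = many item poisoned `seq` ()

-- Taking the head of the first component forces x, but not r'.
firstElem :: Int
firstElem = head (fst (many item poisoned))
```

Both whnfOnly and firstElem evaluate without touching the undefined tail, whereas print (many item poisoned) would hit it as soon as it asks for the second list element.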


 => ...
 => let ts  = x:ts'
        x   = 1
        ts' = 2:3:...
    in
      let z   = (r, ts')
          r   = Just x
      in
        let z'  = let ... in (..,..)
            r'  = fst z'
            ts''= snd z'
        in (x:r',ts'')

To stay sane, we garbage collect z and r and rename variables before expanding the expression for z' which is obtained in the same way we obtained it before


 let ts0 = x0 : ts1
     x0  = 1
     ts1 = 2:3:...

     z0 = let z  = (r, tail ts1)
              r  = Just x
              x  = head ts1
              ts'= snd z
          in
            let z'  = many item ts'
                r'  = fst z'
                ts''= snd z'
            in (x:r', ts'')

     r0  = fst z0
     us0 = snd z0

 in (x0:r0, us0)

Collecting  lets  and renaming yields

 let ts0 = x0 : ts1
     x0  = 1
     ts1 = 2:3:...

     z   = (r, tail ts1)
     r   = Just x1
     x1  = head ts1
     ts' = snd z

     z1  = many item ts'
     r1  = fst z1
     us1 = snd z1

     z0  = (x1:r1, us1)
     r0  = fst z0
     us0 = snd z0

 in (x0:r0, us0)

The insight is that the original naming was bad, r and z are quite different from r0 and z0. Reducing r0 and x1 yields


 =>
 let ts0 = x0 : ts1
     x0  = 1
     ts1 = x1 : ts2
     x1  = 2
     ts2 = 3:...

     z   = (r, tail ts1)
     r   = Just x1
     ts' = snd z

     z1  = many item ts'
     r1  = fst z1
     us1 = snd z1

     z0  = (r0, us1)
     r0  = x1:r1
     us0 = snd z0

 in (x0:r0, us0)

The general scheme should be clear now: z, r and ts' are temporary variables and further reduction of r1, r2 and so on leads to a chain


 let x0 = 1; ts0 = x0 : ts1
     x1 = 2; ts1 = x1 : ts2
     x2 = 3; ts2 = x2 : ts3
     ...
     x8 = ..

     z   = (r, tail ts8)
     r   = Just x8
     ts' = snd z

     z8  = many item ts'
     r8  = fst z8
     us8 = snd z8

     z7  = (r7, us8)
     r7  = x8:r8
     us7 = snd z7
     ...
     z0  = (r0, us1)
     r0  = x1:r1
     us0 = snd z0

 in (x0:r0, us0)

So, after forcing the first component of the overall result to normal form, the result looks like


 (1:2:3:..., snd (_,snd (_,snd (_,...))) )

and it seems that Hugs fails to evaluate the tail recursive chain of snd ??
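To poke at that chain in isolation, one could build it by hand. The sketch below only mimics the shape snd (_, snd (_, ...)) with an empty list at the bottom; sndChain and chainIsEmpty are hypothetical helpers, and whether a given implementation evaluates the chain in constant stack is exactly the open question, so no claim is made about Hugs here.

```haskell
-- Hypothetical helper: n nested applications of snd around an empty list,
--   snd (_, snd (_, ... snd (_, []) ...))
-- Intended for n >= 0.
sndChain :: Int -> [Int]
sndChain 0 = []
sndChain n = snd ((), sndChain (n - 1))

-- Forcing the chain to weak head normal form walks through every snd.
chainIsEmpty :: Int -> Bool
chainIsEmpty n = null (sndChain n)
```

Evaluating chainIsEmpty for a chain as long as the input file (say, a million links) in both implementations would show whether the snd chain alone reproduces the overflow.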

In the end, here's our decisive result: either Hugs or my analysis has a bug :D


Regards,
apfelmus

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
