On May 19, 2006, at 6:16 PM, Henning Thielemann wrote:
let ~(ts, strm') = idX strm
    ~(us, strm'') = idX strm'

I seem to have found a partial solution to the problem.
It's rather ugly, however, and I think there should be
a better way.

The original definition for one of the clauses was like
this:

  idX (StartEvent a : strm) =
      let (ts, strm') = idX strm
          (us, strm'') = idX strm'
      in (StartEvent a [] : ts ++ EndEvent a : us, strm'')

The intention was to thread strm through the recursive calls
and free the items in strm as soon as they are processed. However,
with this definition I needed about 8Mb of heap for a 1Mb input
file. Since I suspected that delayed evaluation of us and strm''
keeps the pointer to strm around for too long, the natural solution
seems to be forcing their evaluation. So I tried:

  idX (StartEvent a : strm) =
      case idX strm of
         (ts, strm') ->
             case idX strm' of
               (us, strm'') ->
                 (StartEvent a [] : ts ++ EndEvent a : us, strm'')

which made things worse. The heap size increased a bit more
and I get a typical "bell" shaped heap profile suggesting idX
cannot output anything before processing the whole input.

So I have to allow idX to output something while consuming
the input. What eventually worked was this messy mixture
of let and case:

  idX (StartEvent a : strm) =
    let (xs, rest) =
         case idX strm of
           (ts, strm') ->
              let (ws, rest) =
                    case idX strm' of
                      (us, strm'') -> (us, strm'')
              in (ts ++ EndEvent a : ws, rest)
    in (StartEvent a [] : xs, rest)

The heap residency reduced from about 8MB to around 80Kb.

This is rather tricky, however. The following "nicer looking"
version has a heap profile as bad as the worse case.

   idX (StartEvent a : strm) =
     let (ts,us,strm'') =
         case idXsp strm of
           (ts, strm') ->
               case idXsp strm' of
                 (us, strm'') -> (ts, us, strm'')
     in (StartEvent a [] : ts ++ EndEvent a : us, strm'')

And I cannot quite explain why (Anyone?).

Is there a more structured solution? Can I leave the hard
work to the compiler?

sincerely,
Shin-Cheng Mu
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to