I was looking at Michael Snoyman's post on pipes parsers, 

    http://www.yesodweb.com/blog/2014/02/ideas-pipes-parse

and working through some of the reddit controversy 

    http://www.reddit.com/r/haskell/comments/1xmmtn/some_ideas_for_pipesparse/

trying to work out whether it might matter to `pipes-text`.

I started from the attempted pipes variant of his conduit
parser for a special archival format built around a 'File' type.
Here is the result, which I discuss below:

https://gist.github.com/michaelt/9122379

The first form of the parser (`fileParser0`) was 
not so bad when I used it on simple files.
Then I lengthened the input to be parsed
into a succession of `File`s, giving a 1M file.
(You can write a suitable file by uncommenting 
the first line in `main`.)

On 1M of data, the obvious ways of 
formulating the parser in the pipes-parse style
seem to go wildly wrong. The extravagance of 
my computer's response ("swap space full", etc.) 
may have something to do with OS X and ghc-7.6, 
but these are obviously bad programs either way. I couldn't
get them to complete.

That is what happens with either `fileParser0` or
`fileParser1`.

If, however, I use `parse2`, where I limit
each use of `zoom utf8` by a prior
application of `PB.splitAt` (or break),
the 1M file is done in half a second and 
uses no significant memory.
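
To make the idea concrete without the gist to hand, here is a minimal sketch
of bounding the input before decoding. It is not the code from the gist:
`takeText` is a hypothetical helper, and the sketch assumes pipes-parse 3.x,
`zoom` from lens-family's `Lens.Family.State.Strict`, and a pipes-text
version that exports `utf8` from `Pipes.Text.Encoding`.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main (main) where

import           Data.ByteString          (ByteString)
import qualified Data.ByteString.Char8    as BC
import qualified Data.Text                as T
import           Lens.Family.State.Strict (zoom)
import           Pipes                    (yield)
import qualified Pipes.ByteString         as PB
import           Pipes.Parse              (Parser, drawAll, evalStateT)
import           Pipes.Text.Encoding      (utf8)

-- Hypothetical helper: decode exactly the next n bytes as UTF-8.
-- PB.splitAt n narrows the parser's view to the first n bytes, so the
-- utf8 lens and drawAll can never touch more than that bounded slice.
takeText :: Monad m => Int -> Parser ByteString m T.Text
takeText n = fmap T.concat (zoom (PB.splitAt n . utf8) drawAll)

main :: IO ()
main = do
  -- Two consecutive bounded reads from a single chunk of input.
  let src = yield (BC.pack "hello world")
  (a, b) <- evalStateT ((,) <$> takeText 5 <*> takeText 6) src
  print (a, b)
```

`drawAll` still collects its input in memory, but only the n bytes that
`PB.splitAt` lets through; the rest of the stream stays in the parser
state for whatever runs next.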

Is there a mistake I am making, a mistake
in the implementation of something, 
or some simple rule that can be stated 
to avoid such things?  

I also wonder whether the result of using e.g. `fileParser1` is as pathological 
on other operating systems.

yours Michael 

