I was looking at Michael Snoyman's post on pipes parsers,
http://www.yesodweb.com/blog/2014/02/ideas-pipes-parse
and working through some of the reddit controversy,
http://www.reddit.com/r/haskell/comments/1xmmtn/some_ideas_for_pipesparse/
trying to work out whether it might matter to `pipes-text`.
I worked from the attempted pipes variant of his conduit
parser for a special archival format for a 'File' type.
Here is the result, which I discuss below:
https://gist.github.com/michaelt/9122379
The first form of the parser (`fileParser0`) was
not so bad when I used it on simple files.
Then I increased the length of the file to be parsed
into a succession of `Files`, giving a 1M file.
(You can write a suitable file by uncommenting
the first line in `main`.)
On 1M of data, the obvious ways of
formulating the parser in the pipes-parse style
seem to go wildly wrong. The extravagance of
my computer's response -- "swap space full", etc. --
may have something to do with OS X vs. ghc-7.6,
but these are obviously bad programs either way:
I couldn't get them to complete.
That is the result if I use either of `fileParser0` or
`fileParser1`.
If, however, I use `parse2`, where I limit
each use of `zoom utf8` by a prior
application of `PB.splitAt` (or break),
the 1M file is done in half a second and
uses no significant memory.
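
To make the shape of that bounded variant concrete, here is a minimal
sketch. It is not the code from the gist: the name `boundedText` and the
fixed byte count are hypothetical, and it assumes `utf8` is exported from
`Pipes.Text.Encoding` (in some pipes-text releases it lives in `Pipes.Text`)
and that `zoom` comes from lens-family.

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RankNTypes #-}

import Data.ByteString (ByteString)
import Data.Text (Text)
import qualified Data.Text as T
import Lens.Family.State.Strict (zoom)
import Pipes (yield)
import Pipes.Parse (Parser, drawAll, evalStateT)
import qualified Pipes.ByteString as PB
-- Assumption: `utf8` is exported here; adjust for your pipes-text version.
import qualified Pipes.Text.Encoding as PT

-- Hypothetical helper: decode at most n bytes of the input as UTF-8 text.
-- `PB.splitAt n` first narrows the parser's view to the next n bytes;
-- only that bounded span is then handed to the `utf8` lens.
boundedText :: Monad m => Int -> Parser ByteString m Text
boundedText n = fmap T.concat (zoom (PB.splitAt n . PT.utf8) drawAll)

main :: IO ()
main = do
  -- Run the parser on a tiny in-memory producer, limited to 5 bytes.
  t <- evalStateT (boundedText 5) (yield "hello world")
  print t
```

The point of the composition `PB.splitAt n . PT.utf8` is only that the
decoder never sees more than the delimited span; whether that is what
makes the difference in `fileParser0` and `fileParser1` is exactly my
question.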
Is there a mistake I am making, a mistake
in the implementation of something,
or some simple rule that can be stated
to avoid such things?
Also, I wonder whether the result of using e.g. `fileParser1` is as
pathological on other OSes.
yours Michael