On 11/03/2010 01:25, John D. Earle wrote:
Hi, Ben! Thanks for the input. I went to the Parsec and Attoparsec
parser links. Attoparsec was new to me. From the Parsec link:
    Combinator parsers are written and used within the same programming
    language as the rest of the program. The parsers are first-class
    citizens of the language, unlike Happy parsers, which must be
    generated via a preprocessor.
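To make "first-class" concrete: a parser built with Parsec is an ordinary Haskell value that can be passed around and combined like any other. A minimal sketch (the parser names here are my own invention):

    import Text.Parsec
    import Text.Parsec.String (Parser)

    -- A parser is an ordinary Haskell value.
    number :: Parser Int
    number = fmap read (many1 digit)

    -- Ordinary functions can take parsers as arguments and return new
    -- parsers; no preprocessing step is involved.
    commaSep :: Parser a -> Parser [a]
    commaSep p = p `sepBy` char ','

    main :: IO ()
    main = print (parse (commaSep number) "" "1,2,3")
    -- Right [1,2,3]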
I may want to go with Parsec or a cousin just because the approach seems
elegant. What I am fishing for is something you only learn from
experience, or from talking to people who have it. My impression is that
if you want speed, Happy is the way to go. I will be making an
investment in one paradigm or the other, so I need to understand the
relative merits of each in order to make a decision.
I would think Happy is a dark place whereas Parsec is a place of light,
which becomes important where correctness that I can personally attest
to is concerned. What I am getting at is that my impression is that with
Parsec the concepts involved matter more than the actual code itself,
whereas with Happy you could understand the concepts involved and the
tool would still continue to be necessary. I value the educational
experience, and Parsec or a cousin may provide a better one. Is Parsec
slow as a snail compared to Happy, or are they similar? Parsec certainly
seems more flexible, but my question concerns performance: you know, the
embarrassing facts that you won't find in the brochure.
Happy/Alex tend to work well for programming-language-type tasks where
there is a clear lexical syntax and the grammar is close to LALR(1).
Parsec works well for more ad-hoc grammars and where there is
ambiguity; it tends to be more flexible in that regard. Happy, on the
other hand, will guarantee that your grammar has no ambiguity, and
Alex will be more efficient than parsing the equivalent regular
expressions in a combinator library because it does the NFA->DFA
conversion beforehand (although IIRC the Utrecht combinator library does
this at runtime?).
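For concreteness, here is a hedged sketch of what a small Happy grammar might look like (module, token, and constructor names are invented for illustration). Because the whole grammar is declared up front, Happy can report shift/reduce conflicts, i.e. ambiguity, at generation time:

    {
    module ExprParser (parseExpr) where
    }

    %name parseExpr
    %tokentype { Token }
    %error { parseError }

    %token
      int  { TokInt $$ }
      '+'  { TokPlus }
      '*'  { TokTimes }

    -- Without these precedence declarations, Happy would report
    -- shift/reduce conflicts for the ambiguous grammar below.
    %left '+'
    %left '*'

    %%

    Expr : Expr '+' Expr  { Add $1 $3 }
         | Expr '*' Expr  { Mul $1 $3 }
         | int            { Lit $1 }

    {
    data Token = TokInt Int | TokPlus | TokTimes
    data Expr  = Add Expr Expr | Mul Expr Expr | Lit Int

    parseError :: [Token] -> a
    parseError _ = error "parse error"
    }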
Personally I find Parsec and the other parser combinator libraries quite
difficult to use when it comes to deciding where to put 'try'; ReadP is
the exception here, because it does general backtracking and doesn't
make you decide where to use 'try'. I recently had an interesting
experience writing GHC's parser for the strings inside foreign import
declarations: I tried various combinator libraries to see how each came
out, and none of them made it easy. I must write a blog post about that
sometime.
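To illustrate the 'try' problem concretely, a small Parsec sketch (an invented example, not GHC's actual parser):

    import Text.Parsec
    import Text.Parsec.String (Parser)

    -- Without 'try': string "lets" consumes "let" before failing on
    -- the missing 's', and since input was consumed, <|> does not
    -- fall through to the second alternative.
    bad :: Parser String
    bad = string "lets" <|> string "let"

    -- With 'try': the failed branch backtracks, restoring the input,
    -- so the second alternative gets a chance.
    good :: Parser String
    good = try (string "lets") <|> string "let"

    main :: IO ()
    main = do
      print (parse bad  "" "let")   -- Left (parse error)
      print (parse good "" "let")   -- Right "let"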
As for performance, I don't think anyone has done a rigorous comparison
- it's hard to do, because you have to implement a whole parser twice in
very different ways, and then you only get results for one grammar.
Alex and Happy are fast enough for GHC - parsing is almost never the
bottleneck.
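For anyone who wants to measure a particular grammar anyway, here is a rough sketch of the shape such a comparison might take, using the criterion library; the two parse functions below are hypothetical placeholders, not real implementations:

    import Criterion.Main

    -- Hypothetical stand-ins: in a real comparison these would be two
    -- complete parsers for the same grammar, one generated by
    -- Happy/Alex and one written with Parsec.
    parseWithHappy, parseWithParsec :: String -> Int
    parseWithHappy  = length . words   -- placeholder, not a parser
    parseWithParsec = length . lines   -- placeholder, not a parser

    main :: IO ()
    main = defaultMain
      [ bench "happy"  (nf parseWithHappy  input)
      , bench "parsec" (nf parseWithParsec input)
      ]
      where
        input = concat (replicate 1000 "some sample input\n")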
Cheers,
Simon
Thank you, this has helped clarify my thinking.
--------------------------------------------------
From: "Ben Lippmeier" <b...@ouroborus.net>
Sent: Wednesday, March 10, 2010 17:34
To: "John D. Earle" <johndea...@cox.net>
Cc: <haskell-c...@haskell.org>
Subject: Re: Fast Haskell Parser
Hi John,
Doing a Google search for "haskell parser" returns the following link
as its first result. That's the parser generator that GHC uses.
http://www.haskell.org/happy/
You could also check out the following:
http://www.haskell.org/haskellwiki/Parsec
http://hackage.haskell.org/package/attoparsec
This would also be a perfect question to ask on the haskell-cafe
mailing list...
Cheers,
Ben.
On 11/03/2010, at 10:39 AM, John D. Earle wrote:
I was thinking of ways to create an efficient Haskell parser. My
initial thinking was to write a top-down parser in Haskell, but if
you want speed, a table-driven approach may make greater sense.
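For scale, a hand-written top-down (recursive descent) parser in plain Haskell can be quite small; a hedged sketch with invented names, using no parser library:

    -- A parser takes input and either fails or returns a result plus
    -- the remaining input.
    type P a = String -> Maybe (a, String)

    digits :: P Int
    digits s = case span (`elem` ['0'..'9']) s of
      ("", _)    -> Nothing
      (ds, rest) -> Just (read ds, rest)

    -- expr ::= digits ('+' digits)*   evaluated as a running sum
    expr :: P Int
    expr s0 = do
      (n, rest) <- digits s0
      go n rest
      where
        go acc ('+':s) = do
          (m, rest) <- digits s
          go (acc + m) rest
        go acc s = Just (acc, s)

    main :: IO ()
    main = print (expr "1+2+3")   -- Just (6,"")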
Due to the existence of build bots there is a certain degree of
complacency concerning build times. I feel that convenience is an
important factor: it should be convenient to build the source. Build
bots make an assumption, namely the existence of a formal
infrastructure. I believe that it should be possible to build
something from source casually.
This is a less demanding goal than high-performance incremental
builds. It would be nice to outperform makefiles, because if you
fail to do that, can it really be said that you are making progress?
Correctness appears to be a low priority among computer programmers.
That said, it may be worth investing some time in advance in figuring
out how best to achieve both objectives, namely correctness and
performance. Who knows, skills acquired in one project may be useful
in another, and performance is usually welcome.
So my question is: what sort of tools and methodologies exist in
Haskell for creating high-performance parsers? My impression is that
the speed at which the parser performs its task is not the bottleneck,
but the parser might as well be designed to be efficient, so as not to
be intellectually lazy. It may even turn out that the parser needs
to be efficient merely to compensate for the spawn of correctness,
namely slow builds.
_______________________________________________
Cvs-ghc mailing list
Cvs-ghc@haskell.org
http://www.haskell.org/mailman/listinfo/cvs-ghc