Sven Panne <[EMAIL PROTECTED]> writes:

> A long-standing gap in Haskell environments is the lack of tools
> (written in Haskell, of course :-) for the manipulation or analysis of
> Haskell source programs. Other languages have a rather rich collection
> of such tools:
>
> [blah blah blah]
> 
> One of the biggest obstacles for building similar tools for Haskell is
> the daunting task of writing a parser (a recurring theme in this
> mailing list).

Quit right - it is a daunting task, but one that *really* needs to be
done.  The latest version of happy should have the necessary bells and
whistles to be able to parse Haskell (i.e. simple error recovery and
better monad support).  I intend to use it to write a Happy parser
for GHC - in fact this is already underway, I have a grammar, but the
lexer still needs to be written.  However, I don't think this will be
usable as a generic standalone Haskell parser - for a couple of reasons

        * we need it to go *fast* in GHC, so we'll use non-standard
          extensions to implement hash-tables and the like in the
          lexer (this is already done in the interface lexer in GHC).

        * GHC has a number of extensions to the Haskell grammar,
          which tend to bloat it a bit.

On the other hand, I'm willing to contribute some time, effort and
existing bits of code to this project.  Several other people have
mentioned to me that they need a Happy grammar for Haskell, some of
whom probably have time to help out.

The project can be split into the following sections:

        * An abstract syntax definition for Haskell.

        * A Happy grammar

        * A lexer

and some thought needs to go into the interface between the parser and
lexer, mainly to handle layout.  

There are some difficulties in parsing Haskell in that the BNF in the
report isn't LALR(1), so yacc-style parsers generally end up combining
the rules for expressions and patterns, which means we might have to
do the same thing in the abstract syntax (or have a separate data type
for expressions/patterns).  Some things are better done post-parsing,
like resolving operator precedences and collecting equations into
groups. 

Ok, let's get going.  Anyone willing to help out, please contact
either Sven or myself...

Cheers,
        Simon



> Although there are parsing libraries and a
> lexer/parser-generator available, it is nonetheless a difficult and
> time-consuming thing to do for several reasons:
> 
>    * The grammar for full Haskell is quite large (note that some
>      productions in the report are indexed by precedence and
>      associativity).
> 
>    * The grammar in the report is highly ambiguous, therefore making
>      it unsuitable for the input of parser generators.
> 
>    * Handling fixity declarations and the layout rule further
>      complicates matters.
> 
> Why not simply reuse parts of existing Haskell systems for this task?
> They are not very well suited for inclusion in other projects because
> of several reasons:
> 
>    * Hugs is a C-only system, GHC and HBC use an awesome mixture of C
>      and Haskell. This makes them hard to use if the rest of the tool
>      is written in Haskell. Furthermore, the tool couldn't be labeled
>      "100% pure Haskell" anymore...   :-)
> 
>    * The yacc/bison specs in GHC/HBC contain *lots* of conflicts
>      (HBC's curry.y produces more than 80 conflicts). This certainly
>      lowers the confidence that the generated parser does the right
>      thing, even more if one wants to incorporate small changes. One
>      has to check for every conflict if the default action taken is
>      the right one. Not much fun...
> 
>    * There are often strong interdependencies between the parser and
>      the rest of the compiler (used libraries etc.).
> 
>    * It's hard to figure out the generated abstract syntax and the
>      mapping between it and the corresponding Haskell constructs.
> 
> So what do we need for the easy construction of useful tools for
> Haskell? First of all a well-documented and easily comprehensible
> abstract syntax in the form of algebraic datatype declarations and a
> Haskell-only parser are needed. The emphasis should be on
> maintainability, completeness and ease of inclusion in other programs,
> not speed. Furthermore, a pretty-printer (aka unparser) would be
> useful, too.
> 
> So here is my call for contribution:
> 
>    Send an abstract syntax and/or a parser specification!
> 
> It doesn't matter if a parser generator is used or recursive descent
> techniques are applied.
> 
> If there is enough echo, I'd like to setup a web page for this
> project, containing things to download, documentation, and giving a
> forum for discussions and applications of the parsers. Ideas range
> from simple formatting tools to library browsers ("I know there is a
> function with a similar type signature somewhere...") or computer
> aided source-to-source transformations.
> 
> Don't hesitate, contribute!
> 
> -- 
> Sven Panne                                        Tel.: +49/89/2178-2235
> LMU, Institut fuer Informatik                     FAX : +49/89/2178-2211
> LFE Programmier- und Modellierungssprachen              Oettingenstr. 67
> mailto:[EMAIL PROTECTED]            D-80538 Muenchen
> http://www.pms.informatik.uni-muenchen.de/mitarbeiter/panne
> 

-- 
-- 
Simon Marlow                                             [EMAIL PROTECTED]
University of Glasgow                       http://www.dcs.gla.ac.uk/~simonm/
finger for PGP public key



Reply via email to