I actually have some experience in this department, having authored both madlang <http://hackage.haskell.org/package/madlang> and language-ats <http://hackage.haskell.org/package/language-ats>. Parsers using combinators alone are more brittle than parsers using Happy, at least for human-facing languages.
I'm also not sure what exactly parser combinators provide over Happy. It has macros that can emulate e.g. between, many. Drawing up a minimal example might be a good idea. On 10/08/2018 05:24 PM, Vladislav Zavialov wrote: >> complex parsers written using parsing combinators is that they tend to be >> quite difficult to modify and have any kind of assurance that now you >> haven't broken something else > That's true regardless of implementation technique, parsers are rather > delicate. A LALR-based parser generator does provide more information > when it detects shift/reduce and reduce/reduce conflicts, but I never > found this information useful. It was always quite the opposite of > being helpful - an indication that a LALR parser could not handle my > change and I had to look for workarounds. > >> With a combinator based parser, you basically have to do program >> verification, or more pragmatically, have a large test suite and hope that >> you tested everything. > Even when doing modifications to Parser.y, I relied mainly on the test > suite to determine whether my change was right (and the test suite > always caught many issues). A large test suite is the best approach > both for 'happy'-based parsers and for combinator-based parsers. > >> and then have a separate pass that validates and fixes up the results > That's where my concern lies. This separate pass is confusing (at > least for me - it's not the most straightforward thing to parse > something incorrectly and then restructure it), it is hard to modify, > it does not handle corner cases (e.g. #1087). > > Since we have all this Haskell code that does a significant portion of > processing, why even bother with having a LALR pass before it? > >> namely we can report better parse errors > I don't think that's true, we can achieve better error messages with > recursive descent. > >> Also, with the new rewrite of HsSyn, we should be able to mark such >> constructors as only usable in the parsing pass, so later passes wouldn't >> need to worry about them. > Not completely true, GhcPs-parametrized structures are the final > output of parsing, so at least the renamer will face these > constructors. > > On Tue, Oct 9, 2018 at 1:00 AM Iavor Diatchki <iavor.diatc...@gmail.com> > wrote: >> Hello, >> >> my experience with complex parsers written using parsing combinators is that >> they tend to be quite difficult to modify and have any kind of assurance >> that now you haven't broken something else. While reduce-reduce errors are >> indeed annoying, you at least know that there is some sort of issue you need >> to address. With a combinator based parser, you basically have to do >> program verification, or more pragmatically, have a large test suite and >> hope that you tested everything. >> >> I think the current approach is actually quite reasonable: use the Happy >> grammar to parse out the basic structure of the program, without trying to >> be completely precise, and then have a separate pass that validates and >> fixes up the results. While this has the draw-back of some constructors >> being in the "wrong place", there are also benefits---namely we can report >> better parse errors. Also, with the new rewrite of HsSyn, we should be able >> to mark such constructors as only usable in the parsing pass, so later >> passes wouldn't need to worry about them. >> >> -Iavor >> >> >> >> >> >> >> >> >> >> >> >> On Mon, Oct 8, 2018 at 2:26 PM Simon Peyton Jones via ghc-devs >> <ghc-devs@haskell.org> wrote: >>> I'm no parser expert, but a parser that was easier to understand and >>> modify, and was as fast as the current one, sounds good to me. >>> >>> It's a tricky area though; e.g. the layout rule. >>> >>> Worth talking to Simon Marlow. >>> >>> Simon >>> >>> >>> >>> | -----Original Message----- >>> | From: ghc-devs <ghc-devs-boun...@haskell.org> On Behalf Of Vladislav >>> | Zavialov >>> | Sent: 08 October 2018 21:44 >>> | To: ghc-devs <ghc-devs@haskell.org> >>> | Subject: Parser.y rewrite with parser combinators >>> | >>> | Hello devs, >>> | >>> | Recently I've been working on a couple of parsing-related issues in >>> | GHC. I implemented support for the -XStarIsType extension, fixed >>> | parsing of the (!) type operator (Trac #15457), allowed using type >>> | operators in existential contexts (Trac #15675). >>> | >>> | Doing these tasks required way more engineering effort than I expected >>> | from my prior experience working with parsers due to complexities of >>> | GHC's grammar. >>> | >>> | In the last couple of days, I've been working on Trac #1087 - a >>> | 12-year old parsing bug. After trying out a couple of approaches, to >>> | my dismay I realised that fixing it properly (including support for >>> | bang patterns inside infix constructors, etc) would require a complete >>> | rewrite of expression and pattern parsing logic. >>> | >>> | Worse yet, most of the work would be done outside Parser.y in Haskell >>> | code instead, in RdrHsSyn helpers. When I try to keep the logic inside >>> | Parser.y, in every design direction I face reduce/reduce conflicts. >>> | >>> | The reduce/reduce conflicts are the worst. >>> | >>> | Perhaps it is finally time to admit that Haskell syntax with all of >>> | the GHC cannot fit into a LALR grammar? >>> | >>> | The extent of hacks that we have right now just to make parsing >>> | possible is astonishing. For instance, we have dedicated constructors >>> | in HsExpr to make parsing patterns possible (EWildPat, EAsPat, >>> | EViewPat, ELazyPat). That is, one of the fundamental types (that the >>> | type checker operates on) has four additional constructors that exist >>> | due to a reduce/reduce conflict between patterns and expressions. >>> | >>> | I propose a complete rewrite of GHC's parser to use recursive descent >>> | parsing with monadic parser combinators. >>> | >>> | 1. We could significantly simplify parsing logic by doing things in a >>> | more direct manner. For instance, instead of parsing patterns as >>> | expressions and then post-processing them, we could have separate >>> | parsing logic for patterns and expressions. >>> | >>> | 2. We could fix long-standing parsing bugs like Trac #1087 because >>> | recursive descent offers more expressive power than LALR (at the cost >>> | of support for left recursion, which is not much of a loss in >>> | practice). >>> | >>> | 3. New extensions to the grammar would require less engineering effort. >>> | >>> | Of course, this rewrite is a huge chunk of work, so before I start, I >>> | would like to know that this work would be accepted if done well. >>> | Here's what I want to achieve: >>> | >>> | * Comparable performance. The new parser could turn out to be faster >>> | because it would do less post-processing, but it could be slower >>> | because 'happy' does all the sorts of low-level optimisations. I will >>> | consider this project a success only if comparable performance is >>> | achieved. >>> | >>> | * Correctness. The new parser should handle 100% of the syntactic >>> | constructs that the current parser can handle. >>> | >>> | * Error messages. The new error messages should be of equal or better >>> | quality than existing ones. >>> | >>> | * Elegance. The new parser should bring simplification to other parts >>> | of the compiler (e.g. removal of pattern constructors from HsExpr). >>> | And one of the design principles is to represent things by dedicated >>> | data structures, in contrast to the current state of affairs where we >>> | represent patterns as expressions, data constructor declarations as >>> | types (before D5180), etc. >>> | >>> | Let me know if this is a good/acceptable direction of travel. That's >>> | definitely something that I personally would like to see happen. >>> | >>> | All the best, >>> | - Vladislav >>> | _______________________________________________ >>> | ghc-devs mailing list >>> | ghc-devs@haskell.org >>> | https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail.hask >>> | ell.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc- >>> | devs&data=02%7C01%7Csimonpj%40microsoft.com%7C19181de5c6bd493ab07a08d >>> | 62d5edbe0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636746282778542095 >>> | &sdata=lFRt1t4k3BuuRdyOqwOYTZcLPRB%2BtFJwfFtgMpNLxW0%3D&reserved= >>> | 0 >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs@haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ > ghc-devs mailing list > ghc-devs@haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
signature.asc
Description: OpenPGP digital signature
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs