Re: [Haskell-cafe] Parser left recursion
You can find our experiments here: https://github.com/feuerbach/hagll But don't set your expectations high :-) Roman * Paul Callaghan paulcc@gmail.com [2013-03-01 06:27:46+] Hi Another alternative is this Haskell library: https://github.com/paulcc/xsaiga This is a combinator library which is suitable for mid-scale NLP work, so handles left recursion and (high amounts of) ambiguity to produce a packed result (which can be decoded to a list of results if required). It uses a technique similar to Danielsson's for termination. The technical details (incl papers) can be found on http://cs.uwindsor.ca/~hafiz/xsaiga/pub.html, particularly in the IWPT paper http://cs.uwindsor.ca/~hafiz/pub/iwpt-07.pdf I'd be interested to see a GLL impl for Haskell, particularly for comparison with the above. Did Roman C. publish some code for this a while back? Paul On Wed, Feb 27, 2013 at 9:50 AM, Paul Callaghan paulcc@gmail.com wrote: Another alternative is this Haskell library: https://github.com/paulcc/xsaiga This is a combinator library which is suitable for mid-scale NLP work, so handles left recursion and (high amounts of) ambiguity to produce a packed result (which can be decoded to a list of results if required). It uses a technique similar to Danielsson's for termination. The technical details (incl papers) can be found on http://cs.uwindsor.ca/~hafiz/xsaiga/pub.html, particularly in the IWPT paper http://cs.uwindsor.ca/~hafiz/pub/iwpt-07.pdf I'd be interested to see a GLL impl for Haskell, particularly for comparison with the above. Did Roman C. publish some code for this a while back? Paul On Tue, Feb 26, 2013 at 7:43 PM, Dominique Devriese dominique.devri...@cs.kuleuven.be wrote: 2013/2/26 Martin Drautzburg martin.drautzb...@web.de: I wonder if I can enforce the nonNr property somehow, i.e. enforce the rule will not consider the same nonterminal again without having consumed any input. You might be interested in this paper: Danielsson, Nils Anders. Total parser combinators. ACM Sigplan Notices. Vol. 45. No. 9. ACM, 2010. Regards, Dominique ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Hi Another alternative is this Haskell library: https://github.com/paulcc/xsaiga This is a combinator library which is suitable for mid-scale NLP work, so handles left recursion and (high amounts of) ambiguity to produce a packed result (which can be decoded to a list of results if required). It uses a technique similar to Danielsson's for termination. The technical details (incl papers) can be found on http://cs.uwindsor.ca/~hafiz/xsaiga/pub.html, particularly in the IWPT paper http://cs.uwindsor.ca/~hafiz/pub/iwpt-07.pdf I'd be interested to see a GLL impl for Haskell, particularly for comparison with the above. Did Roman C. publish some code for this a while back? Paul On Wed, Feb 27, 2013 at 9:50 AM, Paul Callaghan paulcc@gmail.com wrote: Another alternative is this Haskell library: https://github.com/paulcc/xsaiga This is a combinator library which is suitable for mid-scale NLP work, so handles left recursion and (high amounts of) ambiguity to produce a packed result (which can be decoded to a list of results if required). It uses a technique similar to Danielsson's for termination. The technical details (incl papers) can be found on http://cs.uwindsor.ca/~hafiz/xsaiga/pub.html, particularly in the IWPT paper http://cs.uwindsor.ca/~hafiz/pub/iwpt-07.pdf I'd be interested to see a GLL impl for Haskell, particularly for comparison with the above. Did Roman C. publish some code for this a while back? Paul On Tue, Feb 26, 2013 at 7:43 PM, Dominique Devriese dominique.devri...@cs.kuleuven.be wrote: 2013/2/26 Martin Drautzburg martin.drautzb...@web.de: I wonder if I can enforce the nonNr property somehow, i.e. enforce the rule will not consider the same nonterminal again without having consumed any input. You might be interested in this paper: Danielsson, Nils Anders. Total parser combinators. ACM Sigplan Notices. Vol. 45. No. 9. ACM, 2010. Regards, Dominique ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Sunday, 24. February 2013 16:04:11 Tillmann Rendel wrote: Both approaches are essentially equivalent, of course: Before considering the very same nonterminal again, we should have consumed at least one token. I see. Thanks So for the laymen: expr ::= expr + expr is a problem, because the parser considers expr again without having consumed any input. expr ::= literal pluses pluses ::= many (+ expr) is not a problem, because by the time the parser gets to the rightmost expr is has consumes at least one +. Instead of literal we can put anything which promises not to be left-recursive expr ::= exprNonLr + expr exprNonLr := ... As exprNonLr gets more complicated, we may end up with a whole set of nonLr parsers. I wonder if I can enforce the nonNr property somehow, i.e. enforce the rule will not consider the same nonterminal again without having consumed any input. -- Martin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
2013/2/26 Martin Drautzburg martin.drautzb...@web.de: I wonder if I can enforce the nonNr property somehow, i.e. enforce the rule will not consider the same nonterminal again without having consumed any input. You might be interested in this paper: Danielsson, Nils Anders. Total parser combinators. ACM Sigplan Notices. Vol. 45. No. 9. ACM, 2010. Regards, Dominique ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Wednesday, 20. February 2013 09:59:47 Tillmann Rendel wrote: So the grammar is: Exp ::= Int | Exp + Exp My naive parser enters an infinite recursion, when I try to parse 1+2. I do understand why: hmm, this expression could be a plus, but then it must start with an expression, lets check. and it tries to parse expression again and again considers Plus. Indeed. Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. Note that the left recursion is already visible in the grammar above, no need to convert to parser combinators. The problem is that the nonterminal Exp occurs at the left of a rule for itself. Just a silly quick question: why isn't right-recursion a similar problem? -- Martin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Hi Martin, Martin Drautzburg wrote: Note that the left recursion is already visible in the grammar above, no need to convert to parser combinators. The problem is that the nonterminal Exp occurs at the left of a rule for itself. Just a silly quick question: why isn't right-recursion a similar problem? I think the situation is symmetric: If you match the token stream right-to-left, right-recursion is problematic. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Martin Drautzburg martin.drautzb...@web.de [2013-02-24 12:31:37+0100] Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. Note that the left recursion is already visible in the grammar above, no need to convert to parser combinators. The problem is that the nonterminal Exp occurs at the left of a rule for itself. Just a silly quick question: why isn't right-recursion a similar problem? Right recursion is recursion of the form A ::= B A There are two cases to consider here. The language defined by B may or may not contain the empty string. If it contains the empty string, then we have an indirect left recursion, since the rule A ::= A is an instance of the original rule. If it doesn't contain the empty string, then, by the time you get to A, you necessarily have to consume some part of the input. Thus, your recursion is well-founded — you enter the recursion with the input strictly smaller than you had in the beginning. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Sun, Feb 24, 2013 at 7:09 PM, Roman Cheplyaka r...@ro-che.info wrote: Thus, your recursion is well-founded — you enter the recursion with the input strictly smaller than you had in the beginning. Perhaps you meant /productive/ corecursion? Because the definition A ::= B A you gave is codata. -- Kim-Ee ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Kim-Ee Yeoh k...@atamo.com [2013-02-24 19:22:33+0700] On Sun, Feb 24, 2013 at 7:09 PM, Roman Cheplyaka r...@ro-che.info wrote: Thus, your recursion is well-founded — you enter the recursion with the input strictly smaller than you had in the beginning. Perhaps you meant /productive/ corecursion? Because the definition A ::= B A you gave is codata. It is just a grammar production, which I chose to interpret recursively. Whether it's productive when interpreted corecursively depends on the particular interpretation, I guess, and may not actually depend on factors like left recursion. I haven't thought about it much. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Kim-Ee Yeoh k...@atamo.com [2013-02-24 19:22:33+0700] On Sun, Feb 24, 2013 at 7:09 PM, Roman Cheplyaka r...@ro-che.info wrote: Thus, your recursion is well-founded — you enter the recursion with the input strictly smaller than you had in the beginning. Perhaps you meant /productive/ corecursion? Because the definition A ::= B A you gave is codata. Or perhaps you meant that the production itself, when interpreted as a definition, is corecursive? Well, yes, and so is any CFG written in BNF. But that doesn't buy us much, and is irrelevant to the discussion of parsing left-recursive grammars. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Sun, Feb 24, 2013 at 7:47 PM, Roman Cheplyaka r...@ro-che.info wrote: Or perhaps you meant that the production itself, when interpreted as a definition, is corecursive? I was merely thrown off by your mention of well-founded and the assertion that you're left with a strictly smaller input. I don't see any of this. That's when I remembered that well-founded recursion (a desirable) is sometimes confused with productive corecursion (another desirable). -- Kim-Ee ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Kim-Ee Yeoh k...@atamo.com [2013-02-24 19:56:13+0700] I was merely thrown off by your mention of well-founded and the assertion that you're left with a strictly smaller input. I don't see any of this. It may become more obvious if you try to write two recursive descent parsers (as recursive functions) which parse a left-recursive and a non-left-recursive grammars, and see in which case the recursion is well-founded and why. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Sun, Feb 24, 2013 at 8:03 PM, Roman Cheplyaka r...@ro-che.info wrote: It may become more obvious if you try to write two recursive descent parsers (as recursive functions) which parse a left-recursive and a non-left-recursive grammars, and see in which case the recursion is well-founded and why. Which grammar are we discussing here? The one you outlined? A ::= B A As I pointed out earlier, this doesn't model a program (e.g. a compiler). It's an OS! What am I missing? -- Kim-Ee ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Sun, Feb 24, 2013 at 6:31 AM, Martin Drautzburg martin.drautzb...@web.de wrote: Just a silly quick question: why isn't right-recursion a similar problem? Very roughly: Left recursion is: let foo n = n + foo n in ... Right recursion is: let foo 1 = 1; foo n = n + foo (n - 1) in ... In short, matching the tokens before the right recursion will constitute an end condition that will stop infinite recursion --- if only because you'll hit the end of the input. Left recursion doesn't consume anything, just re-executes itself. -- brandon s allbery kf8nh sine nomine associates allber...@gmail.com ballb...@sinenomine.net unix, openafs, kerberos, infrastructure, xmonadhttp://sinenomine.net ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Hi, Kim-Ee Yeoh wrote: Perhaps you meant /productive/ corecursion? Because the definition A ::= B A you gave is codata. If you write a recursive descent parser, it takes the token stream as an input and consumes some of this input. For example, the parser could return an integer that says how many tokens it consumed: parseA :: String - Maybe Int parseB :: String - Maybe Int Now, if we implement parseA according to the grammar rule A ::= B A we have, for example, the following: parseA text = case parseB text of Nothing - Nothing Just n1 - case parseA (drop n1 text) of Nothing - Nothing Just n2 - Just (n1 + n2) Note that parseA is recursive. The recursion is well-founded if (drop n1 text) is smaller then text. So we have two cases, as Roman wrote: If the language defined by B contains the empty string, then n1 can be 0, so the recursion is not well-founded and the parser might loop. If the language defined by B does not contain the empty string, then n1 is always bigger than 0, so (drop n1 text) is always smaller than text, so the recursion is well-founded and the parser cannot loop. So I believe the key to understanding Roman's remark about well-founded recursion is to consider the token stream as an additional argument to the parser. However, the difference between hidden left recursion and unproblematic recursion in grammars can also be understood in terms of productive corecursion. In that view, a parser is a process that can request input tokens from the token stream: data Parser = Input (Char - Parser) | Success | Failure parseA :: Parser parseB :: Parser Now, if we implement parseA according to the grammar rule A ::= B A we have, for example, the following: andThen :: Parser - Parser - Parser andThen (Input f) p = Input (\c - f c `andThen` p) andThen (Success) p = p andThen Failure p = p parseA = parseB `andThen` parseA Note that parseA is corecursive. The corecursion is productive if the corecursive call to parseA is guarded with an Input constructor. Again, there are two cases: If the language described by B contains the empty word, then parseB = Success, and (parseB `andThen` parseA) = parseA, so the corecursive call to parseA is not guarded and the parser is not productive. If the langauge described by B does not contain the empty word, then parseB = Input ..., and (parseB `andThen` parseA) = Input (... parseA ...), so the corecursive call to parseA is guarded and the parse is productive. So I believe the key to understanding left recursion via productive corecursion is to model the consumption of the token stream with a codata constructor. Both approaches are essentially equivalent, of course: Before considering the very same nonterminal again, we should have consumed at least one token. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On Sun, Feb 24, 2013 at 10:04 PM, Tillmann Rendel ren...@informatik.uni-marburg.de wrote: The recursion is well-founded if (drop n1 text) is smaller then text. So we have two cases, as Roman wrote: If the language defined by B contains the empty string, then n1 can be 0, so the recursion is not well-founded and the parser might loop. Ah! So A ::= B A is really /not/ the full grammar of the language but an abbreviated one, minus terminals. At the very least, partial parses make sense and the input stream is assumed finite. Because A ::= B A could be understood, not so much as a parsing rule, but as the full definition of a language which comprises only one word: B ... ad infinitum. So all that mention of well-foundedness was confusing. Thanks, Tillmann! -- Kim-Ee ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
On 2/24/13 7:56 AM, Kim-Ee Yeoh wrote: On Sun, Feb 24, 2013 at 7:47 PM, Roman Cheplyaka r...@ro-che.info wrote: Or perhaps you meant that the production itself, when interpreted as a definition, is corecursive? I was merely thrown off by your mention of well-founded and the assertion that you're left with a strictly smaller input. I don't see any of this. But the problem here really is an issue of well-foundedness rather than productivity. Consider the grammar, A ::= epsilon | A B B ::= ...whatever... In order to use this grammar as-is, when processing a string we must be able to magically determine how many times to recurse in order to get our epsilon. This is magic because the correct number of times to recurse is defined by the rest of the string--- which we haven't looked at! Proof theoretically speaking, this is the same as magically knowing when to use the inductive hypothesis vs when to bottom out at a base case. Using this grammar as-is, the recursive descent parser always decides to use the inductive hypothesis. Hence, infinite loop. It should be apparent that this isn't an issue with left-recursion (when reading the string from right to left), because on left-recursion we're always consuming some of the string, and therefore the input string is getting smaller on each recursive call, and therefore it is guaranteed that we will eventually run out of string, at which point we can backtrack as necessary. The problem with right-recursion is that we never know when to stop recurring. If we stop at some arbitrary point, well maybe the parse will end up failing when it could've succeeded if only we had recursed a bit further. But recursive descent doesn't allow the kind of backtracking that would be required to exhaust the search space for right-recursion. -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
As mentioned before, the way to handle this specific problem is to use either the pChainl or pChainr parser combinators, as e.g. found on: http://hackage.haskell.org/packages/archive/uu-parsinglib/2.7.4.1/doc/html/Text-ParserCombinators-UU-Derived.html and many similar libraries. So one can write: pExpr = pChainl ( (+) $ pSym ' ')) pFactor pFactor = iI '(' pExpr ')' Ii | pInteger | pIdentifier What is even nicer is that one can easily extend this to deal with many different operators: pExpr = foldr nextop [((+),'+'), ((*), '*'))] pGactor where nextop (sem,sym) = pChainl sem $ pSym sym)) It is obvious how to extend this further into operators with the same priority or being right associative. See furthermore: @inproceedings{Fokker95:0, title = {Functional Parsers}, author = {Jeroen Fokker}, year = {1995}, tags = {parsing}, researchr = {http://dutieq.st.ewi.tudelft.nl/publication/Fokker95%3A0}, cites = {0}, citedby = {0}, pages = {1-23}, booktitle = {Advanced Functional Programming, First International Spring School on Advanced Functional Programming Techniques, Båstad, Sweden, May 24-30, 1995, Tutorial Text}, editor = {Johan Jeuring and Erik Meijer}, volume = {925}, series = {Lecture Notes in Computer Science}, publisher = {Springer}, isbn = {3-540-59451-5}, } Most left recursion stems from the fact that conventional CFG notation is sufficient, but unfortunately not ideally suited, to express oft occurring patterns. This is where parser combinators come in: they allow one to express what one wants to say instead of having to encode it using recursion, etc. If you have a really nasty grammar where left recursion removal by hand would ruin your grammar, you may use a transform like the LeftCornerTransform as used e.g. in the ChristmasTree package, which removes the problem of exponential time behaviour of reading Haskell data types with infix operators. See: http://hackage.haskell.org/package/ChristmasTree-0.2, and which has been described in: @article{DBLP :journals/entcs/BaarsSV10, author= {Arthur I. Baars and S. Doaitse Swierstra and Marcos Viera}, title = {Typed Transformations of Typed Grammars: The Left Corner Transform}, journal = {Electr. Notes Theor. Comput. Sci.}, volume= {253}, number= {7}, year = {2010}, pages = {51-64}, ee= {http://dx.doi.org/10.1016/j.entcs.2010.08.031}, bibsource = {DBLP, http://dblp.uni-trier.de} } Doaitse On Feb 20, 2013, at 8:13 , Martin Drautzburg martin.drautzb...@web.de wrote: Hello all, this was previously asked on haskell-beginners, but only partially answered. As an exercise I am writing a parser roughly following the expamples in Graham Hutton's book. The language contains things like: data Exp = Lit Int -- literal integer | Plus Exp Exp My naive parser enters an infinite recursion, when I try to parse 1+2. I do understand why: hmm, this expression could be a plus, but then it must start with an expression, lets check. and it tries to parse expression again and again considers Plus. Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. and suggested to do something like: parseExp = do lit - parseLit pluses - many (parsePlusToken * parseLit) return (combinePlusesWithLit lit pluses) combinePlusesWithLit = foldr Plus -- or foldl This indeed does the trick, but only when the first token is a Lit (literal integer). I then added the possibility to optionally put things in parentheses. But then I cannot parse (1+2)+3. The original code fails, because (1+2) is not a Lit and when I allow an expression as the first argument to + I get infinite recursion again. I am generally confused, that saying a plus expression is an integer followed by many plus somethings is not what the language says. So this requires a lot of paying attention to get right. I'd much rather say a plus expression is two expressions with a '+' in between. I do know for sure, that it is possible to parse (1+2)+3 (ghci does it just fine). But I seem to be missing a trick. Can anyone shed some light on this? -- Martin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Did you see expression parser in parsec packagehttp://hackage.haskell.org/packages/archive/parsec/3.1.3/doc/html/Text-Parsec-Expr.html? Is it not enough? 2013/2/20 Martin Drautzburg martin.drautzb...@web.de Hello all, this was previously asked on haskell-beginners, but only partially answered. As an exercise I am writing a parser roughly following the expamples in Graham Hutton's book. The language contains things like: data Exp = Lit Int -- literal integer | Plus Exp Exp My naive parser enters an infinite recursion, when I try to parse 1+2. I do understand why: hmm, this expression could be a plus, but then it must start with an expression, lets check. and it tries to parse expression again and again considers Plus. Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. and suggested to do something like: parseExp = do lit - parseLit pluses - many (parsePlusToken * parseLit) return (combinePlusesWithLit lit pluses) combinePlusesWithLit = foldr Plus -- or foldl This indeed does the trick, but only when the first token is a Lit (literal integer). I then added the possibility to optionally put things in parentheses. But then I cannot parse (1+2)+3. The original code fails, because (1+2) is not a Lit and when I allow an expression as the first argument to + I get infinite recursion again. I am generally confused, that saying a plus expression is an integer followed by many plus somethings is not what the language says. So this requires a lot of paying attention to get right. I'd much rather say a plus expression is two expressions with a '+' in between. I do know for sure, that it is possible to parse (1+2)+3 (ghci does it just fine). But I seem to be missing a trick. Can anyone shed some light on this? -- Martin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Hi, Martin Drautzburg wrote: As an exercise I am writing a parser roughly following the expamples in Graham Hutton's book. The language contains things like: data Exp = Lit Int -- literal integer | Plus Exp Exp So the grammar is: Exp ::= Int | Exp + Exp My naive parser enters an infinite recursion, when I try to parse 1+2. I do understand why: hmm, this expression could be a plus, but then it must start with an expression, lets check. and it tries to parse expression again and again considers Plus. Indeed. Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. Note that the left recursion is already visible in the grammar above, no need to convert to parser combinators. The problem is that the nonterminal Exp occurs at the left of a rule for itself. One way to fix this problem is to refactor the grammar in order to avoid left recursion. So let's distinguish expressions that can start with expressions and expressions that cannot start with expressions: Exp-norec ::= Int Exp-rec ::= Exp-norec | Exp-norec + Exp-rec Note that Exp-rec describes a list of Exp-norec with + in-between, so you can implement it with the combinator many. Now if you want to add a rule like Exp ::= ( Exp ) you need to figure out whether to add it to Exp-norec or Exp-rec. Since the new rule is not left recursive, you can add it to Exp-norec: Exp-norec ::= Int | ( Exp-rec ) Exp-rec ::= Exp-norec | Exp-norec + Exp-rec If you implement this grammar with parser combinators, you should be able to parse (1+2)+3 just fine. Tillmann PS. Try adding multiplication to your grammar. You will need a similar trick to get the priorities right. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Tillmann Rendel ren...@informatik.uni-marburg.de [2013-02-20 09:59:47+0100] One way to fix this problem is to refactor the grammar in order to avoid left recursion. So let's distinguish expressions that can start with expressions and expressions that cannot start with expressions: [...] PS. Try adding multiplication to your grammar. You will need a similar trick to get the priorities right. And then try adding subtraction ;-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Hi, Roman Cheplyaka wrote: Another workaround is to use memoization of some sort — see e.g. GLL (Generalized LL) parsing. Is there a GLL parser combinator library for Haskell? I know about the gll-combinators for Scala, but havn't seen anything for Haskell. Bonus points for providing the graph-structured stack (for maximal sharing in the computation) and the shared packed parse forest (for maximal sharing in the results) as reusable components. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Tillmann Rendel ren...@informatik.uni-marburg.de [2013-02-20 12:39:35+0100] Hi, Roman Cheplyaka wrote: Another workaround is to use memoization of some sort — see e.g. GLL (Generalized LL) parsing. Is there a GLL parser combinator library for Haskell? I know about the gll-combinators for Scala, but havn't seen anything for Haskell. I am not aware of any. Dmitry Astapov and I played with this idea a long time ago, but we didn't succeed. Might be a good time for someone interested to have another go at it. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
All, Many (but not all) of the parsing algorithms that support left recursion cannot be implemented in Haskell using the standard representation of recursion in parser combinators. The problem can be avoided in Scala because it has imperative features like referential identity and/or mutable references. The most practical solution currently is probably to manually transform your grammars to a non-left-recursive form (as suggested above) and then use a standard parser combinator library with a top-down parsing algorithm (I suggest uu-parsinglib). That being said, there is active research into alternative functional representations of recursion in grammars/parsers that support a wider range of algorithms. If you want to read up on such research, I suggest the following papers to get an idea of some of the approaches: Baars, Arthur, S. Doaitse Swierstra, and Marcos Viera. Typed transformations of typed grammars: The left corner transform. Electronic Notes in Theoretical Computer Science 253.7 (2010): 51-64. Devriese, Dominique, et al. Fixing idioms: A recursion primitive for applicative dsls. Proceedings of the ACM SIGPLAN 2013 workshop on Partial evaluation and program manipulation. ACM, 2013. Oliveira, Bruno CdS, and William R. Cook. Functional programming with structured graphs. Proceedings of the 17th ACM SIGPLAN international conference on Functional programming. ACM, 2012. Oliveira, Bruno C. D. S., and Andres Löh. Abstract syntax graphs for domain specific languages. Proceedings of the ACM SIGPLAN 2013 workshop on Partial evaluation and program manipulation. ACM, 2013. DEVRIESE, DOMINIQUE, and FRANK PIESSENS. Finally tagless observable recursion for an abstract grammar model. Journal of Functional Programming 1.1: 1-40. For the last one, you can check out http://projects.haskell.org/grammar-combinators/ about the grammar-combinators library on Hackage. It has a packrat parser that can deal with left-recursion and a grammar transformation that transforms it away. There is a tutorial you can checkout. Dominique 2013/2/20 Tillmann Rendel ren...@informatik.uni-marburg.de: Hi, Roman Cheplyaka wrote: Another workaround is to use memoization of some sort — see e.g. GLL (Generalized LL) parsing. Is there a GLL parser combinator library for Haskell? I know about the gll-combinators for Scala, but havn't seen anything for Haskell. Bonus points for providing the graph-structured stack (for maximal sharing in the computation) and the shared packed parse forest (for maximal sharing in the results) as reusable components. Tillmann ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
More primitively, Parsec and its predecessor Hutton-Meijer provide the chainl/chainr combinators, these automatically remove left recursion within the parser - i.e. you don't have to rewrite the grammar. On 20 February 2013 08:19, Dmitry Olshansky olshansk...@gmail.com wrote: Did you see expression parser in parsec package? Is it not enough? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
Thank you very much. To clarify: I am not in need of a parser, I just wanted to understand why left recursion is an issue (that was easy) and what techniques help to circumvent the problem. So your answer was spot-on (though I haven't implemented it yet) On Wednesday, 20. February 2013 09:59:47 Tillmann Rendel wrote: Hi, Martin Drautzburg wrote: As an exercise I am writing a parser roughly following the expamples in Graham Hutton's book. The language contains things like: data Exp = Lit Int -- literal integer | Plus Exp Exp So the grammar is: Exp ::= Int | Exp + Exp My naive parser enters an infinite recursion, when I try to parse 1+2. I do understand why: hmm, this expression could be a plus, but then it must start with an expression, lets check. and it tries to parse expression again and again considers Plus. Indeed. Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. Note that the left recursion is already visible in the grammar above, no need to convert to parser combinators. The problem is that the nonterminal Exp occurs at the left of a rule for itself. One way to fix this problem is to refactor the grammar in order to avoid left recursion. So let's distinguish expressions that can start with expressions and expressions that cannot start with expressions: Exp-norec ::= Int Exp-rec ::= Exp-norec | Exp-norec + Exp-rec Note that Exp-rec describes a list of Exp-norec with + in-between, so you can implement it with the combinator many. Now if you want to add a rule like Exp ::= ( Exp ) you need to figure out whether to add it to Exp-norec or Exp-rec. Since the new rule is not left recursive, you can add it to Exp-norec: Exp-norec ::= Int | ( Exp-rec ) Exp-rec ::= Exp-norec | Exp-norec + Exp-rec If you implement this grammar with parser combinators, you should be able to parse (1+2)+3 just fine. Tillmann PS. Try adding multiplication to your grammar. You will need a similar trick to get the priorities right. -- Martin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Parser left recursion
Hello all, this was previously asked on haskell-beginners, but only partially answered. As an exercise I am writing a parser roughly following the expamples in Graham Hutton's book. The language contains things like: data Exp = Lit Int -- literal integer | Plus Exp Exp My naive parser enters an infinite recursion, when I try to parse 1+2. I do understand why: hmm, this expression could be a plus, but then it must start with an expression, lets check. and it tries to parse expression again and again considers Plus. Twan van Laarhoven told me that: Left-recursion is always a problem for recursive-descend parsers. and suggested to do something like: parseExp = do lit - parseLit pluses - many (parsePlusToken * parseLit) return (combinePlusesWithLit lit pluses) combinePlusesWithLit = foldr Plus -- or foldl This indeed does the trick, but only when the first token is a Lit (literal integer). I then added the possibility to optionally put things in parentheses. But then I cannot parse (1+2)+3. The original code fails, because (1+2) is not a Lit and when I allow an expression as the first argument to + I get infinite recursion again. I am generally confused, that saying a plus expression is an integer followed by many plus somethings is not what the language says. So this requires a lot of paying attention to get right. I'd much rather say a plus expression is two expressions with a '+' in between. I do know for sure, that it is possible to parse (1+2)+3 (ghci does it just fine). But I seem to be missing a trick. Can anyone shed some light on this? -- Martin ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Parser left recursion
* Martin Drautzburg martin.drautzb...@web.de [2013-02-20 08:13:16+0100] I do know for sure, that it is possible to parse (1+2)+3 (ghci does it just fine). But I seem to be missing a trick. Can anyone shed some light on this? The trick in this case is that ghci doesn't use a recursive descent parser — it uses an LR parser (generated by Happy). Another workaround is to use memoization of some sort — see e.g. GLL (Generalized LL) parsing. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe