@Nicholas: Just to confirm that I have not forgotten and that I *do* intend to reply to the other points in your post. But since you're not the only one who raises some of these questions, I thought it worthwhile to do a careful, well-researched job of it, so I've been doing a bit of reading. Not too much longer, I hope.
Thanks, jeffrey On Wednesday, February 27, 2019 at 10:12:52 PM UTC-5, Jeffrey Kegler wrote: > > Your PEG work should be interesting and I plan to follow it. You made > quite a number of other significant (non-PEG) points and I am aware that I > still owe you responses to them. That's underway. It's taking some time > because I am doing some reading for it. > > best, jeffrey > > On Wed, Feb 27, 2019 at 5:12 PM <[email protected]> wrote: > >> Thanks for your answer! >> >> I do mostly agree, PEG is not a cure-all. But to make my opinion clear >> (and I do >> like to think it's an informed opinion rooted in experience) neither is it >> significantly worse in usability than CFGs. >> >> I must admit I missed that note in the footnotes. It doesn't do much for >> me in >> terms of nuance, honestly. It's in a reference after a phrase that reads >> "PEG, >> in safe use, would essentially be LL(1)". If that's the part of saying >> that "PEG >> has safe uses" then it isn't nuance, but rather a judgement that says >> that PEG >> is either dangerous or crippled (as LL(1) is). >> >> It all comes back to this idea that PEG is "dangerous". I'm open to >> discussing >> that — what I currently think is that it really means that it is >> **unintuitive** >> to some people (as I do explain in my answer to Leszek above). >> >> One of the (multiple) things that make me believe this is precisely this >> reference you cited in support of "PEG, in safe use, would essentially be >> LL(1)". I think this is a gross misrepresentation of the claims of the >> paper. >> I'd stake 99% confidence that if you'd ask the authors if the paper says >> or >> shows that, they'd reply with an emphatic "no". >> >> Consider the abstract: >> >> > Context-Free Grammars (CFGs) and Parsing Expression Grammars (PEGs)have >> > several similarities and a few differences in both their syntax and >> semantics, >> > but they are usually presented through formalisms that hinder a proper >> > comparison. In this paper we present a new formalism for CFGs that >> highlights >> > the similarities and differences between them. The new formalism >> borrows from >> > PEGs the use of parsing expressions and the recognition-based >> semantics. We >> > show how one way of removing non-determinism from this formalism yields >> a >> > formalism with the semantics of PEGs. We also prove, based on these new >> > formalisms, how LL(1) grammars define the same language whether >> interpreted as >> > CFGs or as PEGs, and also show how strong-LL(k), right-linear, and >> LL-regular >> > grammars have simple language-preserving translations from CFGs to >> PEGs. Once >> > these classes of CFGs can be automatically translated to equivalent >> PEGs, we >> > can reuse classic top-down grammars in PEG-based tools. >> >> This essentially says that LL(1) grammars are the subset of both PEG and >> CFG >> that share the same behaviour. >> >> Hence your definition of "safe" seems to be "behaving similarly to CFGs". >> >> But like I said, and you said in turn, everything is entitled to their >> own opinions. >> >> I do however think that if a sizeable portion of the parsing community >> would >> agree that PEG are valuable, it might be worthwhile to point out that your >> opinion is in fact, an opinion that does is not consensual by any stretch. >> >> Similarly, why not mention some work building on PEG? In particular >> left-recursion and packrat parsing. Even if you do think they build on a >> flawed >> base, certainly we can agree these results are very significant for those >> of us >> who do use PEG? >> >> About quips, while I don't mind them in general, they seem to me here to >> be >> precisely an example of the "rope-pulling" that you decry. Wouldn't it be >> more >> high and mighty to simply list facts and publications and let people form >> their >> own opinion? >> >> Now, why do I care? I don't mind much about people being "wrong on the >> internet" >> (™) — that ship sailed long ago — but the idea of a parsing timeline is >> exciting to me. I'd love to recommend it and advertise it, but I find the >> bias >> towards the end prevents me to do so. I think many people might share this >> opinion. >> >> The way I tend to perceive "timelines" and other "histories" is as >> somewhat >> authoritative and unbiased — at least, that's the platonic ideal. Think >> about >> the standard that Wikipedia tries to enforce for instance. Wikipedia does >> not >> only list consensus — opinions and disputes are relevant, after all — but >> clearly does mark them as such. >> >> Do you think there is a way for the timeline to move in that direction? >> >> On Wednesday, February 20, 2019 at 1:53:31 AM UTC+1, Jeffrey Kegler wrote: >>> >>> Thanks for your kind words. In this, I'll just respond to the PEG >>> points. >>> >>> That part of my timeline has caused more blowback than any other part, >>> not unexpectedly. The quips are aimed, not at people like yourself who, >>> while PEG optimists, know its difficulties and do not mislead others. >>> Unfortunately, at the time I wrote that version of the timeline (and still >>> I suspect) you are the minority of pro-PEG voices out on the Web. >>> >>> Many presentations of PEG present it as a "throw anything at me" >>> solution. These less careful voices were the loudest and cause a lot of >>> folks to waste time pushing a rope. My quips were aimed to catch the >>> attention of these folks and they did their job. >>> >>> I've done this before, with yacc, for the same reason, and caught the >>> same kind of heat. People routinely were recommending yacc for "serious >>> parsing" either in ignorance or reckless disregard of the actual experience >>> with it. So I did a "bovicide" series of blog posts. Nowadays I don't >>> diss yacc, because it's no longer the danger to parsing newbies that it was. >>> >>> If you read my entry carefully, and get beyond the quips (as I am sure >>> you did), you see that I *do* say there are safe uses of PEG, and in the >>> footnote concede that my description in the text is "negative". I go on to >>> point the reader toward the pro-PEG literature -- a literature of which the >>> less careful PEG advocates are usually unaware. >>> >>> So to my careless reader, a warning about PEG gets across, and to my >>> careful reader, I point out that, while I am more optimistic about >>> Earley/Leo, there is safe use of PEG and a literature worth consulting. >>> >>> In fact, I think there are 5 "forever" algorithms, parsing algorihms >>> that are of permanent significance, and PEG is one of them. (The other 4 >>> are regular expressions, recursive descent, Earley/Leo and Sakai's >>> algorihm, aka CYK). >>> >>> Getting to the merits, I've heard before that some find the PEG syntax >>> more intuitive than BNF. Everyone is the ultimate authority on their own >>> intuitions, so I won't dispute this. And you are more optimistic about >>> research into PEG than I am, clearly. But as Yogi Berra says, "predictions >>> are difficult, especially about the future". My timeline does not engage >>> in predictions, although it certainly reflects my optimism wrt Earley/Leo. >>> I do claim to be entitled to greater confidence in my predictions on the >>> grounds, at 65, I am much less likely to be around if they fall flat. :-) >>> >>> I'll tackle the points about why the timeline includes/excludes what it >>> does separately. >>> >>> On Tue, Feb 19, 2019 at 5:47 AM <[email protected]> wrote: >>> >>>> (Following up from >>>> https://twitter.com/jeffreykegler/status/1097628806091816960) >>>> >>>> Dear Jeffrey, >>>> >>>> I was delighted when I found about your parsing timeline a couple of >>>> months ago. >>>> I'm a PhD student researching parsing myself and it's great to see >>>> resources >>>> like this bringing awareness to what happened in the field, and it's a >>>> great >>>> reference for people in the field as well. It also got me more >>>> interested in >>>> Earley and your own work of it, which is quite cool. >>>> >>>> That being said, I do have some insatisfactions about the timeline, >>>> especially >>>> the end of it. It's always easy to criticize, so I would like to offer >>>> my help >>>> instead, to see if I can contribute something useful to the timeline. >>>> >>>> According to me, there are two points that could use improvement. >>>> First, I feel >>>> top-down recursive-descent formalizations, especially PEG, are being >>>> unfairly >>>> dismissed. Second, it seems to me like there are quite a few new >>>> developments >>>> that would deserve inclusion at the end of the timeline (especially by >>>> comparison to the level of details towards the start of the timeline). >>>> >>>> I don't pretend to have the objective truth, but the goal is to have a >>>> productive conversation that can hopefully lead to improved accuracy. >>>> >>>> --- >>>> >>>> Regarding PEG, I find the whole description minus the first paragraph >>>> to be >>>> rather contestable and (I feel) uncharitable. >>>> >>>> It's entirely correct that PEG does not cover BNF. But BNF is >>>> essentially a >>>> notation for the CFG formalism, which PEG is not. >>>> >>>> It seems your main beef with PEG is the problem that has been described >>>> as >>>> "language hiding" or "prefix capture": the fact that a rule like `A ::= >>>> a | aa` >>>> will not match the input "aa". >>>> >>>> This is indeed an important problem of PEGs. But the way I see it, it's >>>> a dual >>>> to the problem of ambiguity in CFGs. Both preclude one another: PEGs are >>>> unambiguous but have prefix capture, CFGs do not have prefix capture >>>> but are >>>> ambiguous. The detection of both issue is provably intractable >>>> (statically) and >>>> may be more related than we know, as a relatively similar machinery can >>>> be put >>>> to work to do partial detection of both ambiguity in CFGs and prefix >>>> capture in >>>> PEGs, cf. "Modular Syntax Demands Verification" by Sylvain Schmitz (*). >>>> >>>> (*) http://www.i3s.unice.fr/~mh/RR/2006/RR-06.32-S.SCHMITZ.pdf >>>> >>>> I have written a fair deal of both PEGs and CFGs and both seem about >>>> equally >>>> difficult to write. Non-general CFG formalism (LALR, LL) are much >>>> harder, and so >>>> are PEG tools that have no support to build left-associative parse >>>> trees. >>>> >>>> I also would never advise anyone not to test a grammar, whatever the >>>> formalism. >>>> I've certainly never written a big chunk of grammar without making a >>>> mistake >>>> whatever the formalism. And in practice, the overwhelming majority of >>>> these >>>> mistakes weren't about either prefix capture nor ambiguity. >>>> >>>> If your practical experience differs, I'd be interested to hear about >>>> it. >>>> >>>> --- >>>> >>>> Regarding new developments, here are a couple of ideas of the top of my >>>> head. >>>> The first two items of the list are not random, since those were things >>>> I worked >>>> on in my research (http://norswap.com/publications/) >>>> >>>> - Left-recursion and left-associativity handling in PEG, since it fixes >>>> the >>>> foremost problem (in my opinion) with PEGs >>>> >>>> - Context-sensitive parsing >>>> >>>> You already already mention monadic parsing (but do not emphasize its >>>> context-sensitive properties). Another oldie that deserves to be >>>> mentionned >>>> are Prolog-style definite clause grammars (DCGs). >>>> >>>> There are some other works worth mentioning if this is a topic of >>>> interest for >>>> you. >>>> >>>> My own work in the area is essentially pushing an idea which >>>> expressed in a >>>> lovely manner: >>>> >>>> > Recursive descent does have a huge advantage, one which, despite >>>> its severe >>>> > limitations, will save it from obsolescence time and again. >>>> Hand-written >>>> > recursive descent is essentially calling subroutines. Adding custom >>>> > modification to recursive descent is very straight-forward. >>>> >>>> But making it so that these subroutines may manipulate state in a way >>>> that is >>>> safe in the presence of backtracking, enabling context-sensitive >>>> parsing. >>>> >>>> - Packrat parsing, if only because it enables O(1) parsing of PEG, a >>>> class of >>>> languages that may still strictly include the whole of CFG — as far >>>> as I know, >>>> we do not know of a simple language that can be expressed in CFG but >>>> not in >>>> PEG (we do know language expressible in PEG but not in CFGs). >>>> >>>> Note that, even if PEG does indeed include CFG, that does not imply >>>> the >>>> existance of an algorithm to convert from one to the other. >>>> >>>> - Tackling the problem of ambiguity formally, >>>> "Disambiguating Grammars with Tree Automata" >>>> https://michaeldadams.org/papers/disambiguating-grammars/ >>>> >>>> - GLL parsing probable deserves a mention for mention for taking a >>>> LL-style >>>> machinery to the point of finally parsing fully general CFGs through >>>> the use >>>> of Tomita-style graph structured stacks >>>> http://www.cs.rhul.ac.uk/research/languages/csle/GLLparsers.html >>>> >>>> - Similarly, and while I'm not a fan of the way this work is marketed, >>>> parsing >>>> using the Brzozowski derivative is certainly an interesting direction >>>> in >>>> parsing: http://matt.might.net/articles/parsing-with-derivatives/ >>>> >>>> - This might be outside the scope of your timeline, but there is also a >>>> whole >>>> lot of work (some quite old) on "semi-parsing", i.e. fuzzy or partial >>>> parsing: >>>> http://grammarware.net/text/2014/semiparsing.pdf >>>> >>>> --- >>>> >>>> There you have it. Let me know what you think, and I'm looking forward >>>> to >>>> talking with you! I hope I can help the project in any capacity that >>>> I'm able. >>>> >>>> Cheers, >>>> >>>> Nicolas LAURENT >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "marpa parser" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to [email protected]. >>>> For more options, visit https://groups.google.com/d/optout. >>>> >>> -- >> You received this message because you are subscribed to the Google Groups >> "marpa parser" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> For more options, visit https://groups.google.com/d/optout. >> > -- You received this message because you are subscribed to the Google Groups "marpa parser" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
