On Sat, Jul 7, 2012 at 6:06 PM, Roman D. Boiko <[email protected]> wrote:
> The most critical was the one you implemented: ability to define a custom > parse tree. I also needed the ability to use immutable structures (almost) > everywhere, while currently they must be mutable. Also it would be great to > avoid UTF conversions (instead use functions and types templated by the > string type). I added dstrings because 1- at the time (a few months ago), the lists here were awash in UTF-32 discussions and I thought that'd be the way to go anyway 2- other D parsing libraries seemed to go to UTF32 also (CTPG) 3- I wanted to be able to parse mathematical notation like nabla, derivatives, etc. which all have UTF32 symbols. > However, as I found a few hours ago, Packrat parsing (typically used to > handle PEG) has serious disadvantages: it complicates debugging because of > frequent backtracking, it has problems with error recovery, and typically > disallows to add actions with side effects (because of possibility of > backtracking). These are important enough to reconsider my plans of using > Pegged. I will try to analyze whether the issues are so fundamental that I > (or somebody else) will have to create an ANTLR-like parser instead, or > whether it is possible to introduce changes into Pegged that would fix these > problems. Note that PEG does not impose to use packrat parsing, even though it was developed to use it. I think it's a historical 'accident' that put the two together: Bryan Ford thesis used the two together. Note that many PEG parsers do not rely on packrat (Pegged does not). There are a bunch of articles on Bryan Ford's website by a guy writting a PEG parser for Java, and who found that storing the last rules was enought to get a slight speed improvement, buth that doing anymore sotrage was detrimental to the parser's overall efficiency.
