On Wed, Jul 23, 2014 at 7:39 AM, Roman Redziejowski <roman.r...@swipnet.se> wrote:
> Once B in Ger's examples successfully consumed A, it indeed does not need
> to backtrack (as this must fail).
> But G in the second example has to backtrack after failing B, even if B
> passed its cut point. So B has no right to erase the history, because it may
> be used by the containing expression.
> So the cut points should somehow be considered in the entire hierarchy of
> parsing expressions.

I just made this experiment in Grako: http://goo.gl/SFzD5S

Instead of ditching all the memos with positions less than the cut position, it limits the pruning to the positions above that of the last fork (optional, choice, closure). The interesting part is this:

    def prune_cache(cache):
    -    prune_dict(cache, lambda k, _: k[0] < cutpos)
    +    prune_dict(cache, lambda k, _: forkpos < k[0] < cutpos)

In theory, the change should have had no impact on performance, as it could not have diminished the number of cache hits: more information than before is kept in the cache. In reality, the time to translate 500 KLOC of COBOL to C# went from about 2 minutes to some unknown amount of time above 30 minutes. I don't know what happened exactly, because I didn't let the experiment finish, nor did I research further. I speculate that the considerable increase in time is due to several "constant" time factors (like CPU cache misses) growing with the increased memory use.

I may try the experiment again with a smaller test suite, but, for now, the conclusion seems to be that performance in PEG parsing may be affected more by mundane computing factors than by theoretical ones.

It just occurred to me that a compromise between the problem exposed in this thread and the result of my experiment would be a generational memo cache: push an empty cache at each fork point.

Cheers,

--
Juancarlo *Añez*
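P.S. Here is a minimal sketch of the generational idea. The names (`GenerationalMemoCache`, `push_fork`, `cut`) are hypothetical, not Grako's actual API, and it approximates the `forkpos < k[0] < cutpos` predicate by keeping the memos created since the last fork in their own generation, so a cut only has to prune inside that generation:

```python
class GenerationalMemoCache:
    """Memo cache for a packrat parser, layered into generations.

    A fresh generation is pushed at each fork point (optional, choice,
    closure). A cut discards only the memos created since that fork, so
    memos the containing expression may still backtrack into survive.
    """

    def __init__(self):
        # Each generation maps (position, rule_name) -> memoized result.
        self._generations = [{}]

    def push_fork(self):
        # Entering an optional/choice/closure: start an empty generation.
        self._generations.append({})

    def pop_fork(self, merge=True):
        # Leaving the fork: merge its memos into the parent generation
        # so later parses over the same input can still reuse them.
        top = self._generations.pop()
        if merge:
            self._generations[-1].update(top)

    def cut(self, cutpos):
        # A cut point: this fork can no longer backtrack before cutpos,
        # so memos at earlier positions *within this generation* are dead.
        top = self._generations[-1]
        for key in [k for k in top if k[0] < cutpos]:
            del top[key]

    def get(self, key, default=None):
        # Inner generations shadow outer ones.
        for generation in reversed(self._generations):
            if key in generation:
                return generation[key]
        return default

    def set(self, key, value):
        self._generations[-1][key] = value
```

For example, a cut at position 10 inside a fork would drop a memo keyed at position 5 made since the fork, while leaving both a pre-fork memo at position 0 and a post-cut memo at position 12 intact.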
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg