> Changing the input stream would throw off memoization in a Packrat parser.
Only some parts of the memoization table would be invalidated by an insert or a remove. You would need to update the rest of the memoization table in order to support the new indices. (The right data structure for the table would help keep costs down.) This is precisely why I think Packrat parsing works well for incremental parsing. > Mizushima and others on this list have work showing that memory use in Packrat parsers can be sublinear with the addition of cuts, and that cut insertion can be automated. I don't think this would work if one wants to optimize incremental parsing, as opposed to one-time parsing. You need to keep the old parsing data around in order to minimize future duplication of effort. I reckon that you would always need at least a linear amount of information in the memoization table, otherwise you'll have to reparse the entire string (at least linearly many bits of it) whenever a change is made. But this is speculation on my part. To give an example of the algorithm I laid out above, consider the (overly simple) PEG: A <- x B y B <- C / D / E C <- foo D <- bar E <- baz Now parse the string "xbazy" via Packrat parsing. You end up with the memoization table A (0:5) : SUCCESS(x, B, y) B (1:4) : SUCCESS(E) C (1:2) : ERROR D (1:4) : ERROR E (1:4) : SUCCESS(baz) Now insert "r" at index 4 (so you end up with "xbazry"). You need to invalidate the first entry because 4 is inside the range [0,5). We now have A (0:5) : ERROR B (1:4) : SUCCESS(E) C (1:2) : ERROR D (1:4) : ERROR E (1:4) : SUCCESS(baz) Note that this update only required testing the symbol A, so it took a constant amount of time (assuming the string insert has already been done). Next, let us say that we remove the "z" at index 3. This requires that we invalidate all entries except C (1:2). Then we end up with A (0:5) : SUCCESS(x, B, y) B (1:4) : SUCCESS(D) C (1:2) : ERROR D (1:4) : SUCCESS(bar) This fact that the entry C (1:2) didn't change is an example of how, in a large program, we don't need to update the vast majority of the syntax tree nor the memoization table when we update the code locally. Since this is a small example, we don't get the full force of the effect, but with a large program it comes in handy. For example, consider this simplified PEG for s-exps: SEXP <- ("(" SEXP* ")" / [a-z]+) " "* When we change the string "(foo bar baz foo bar baz foo bar baz)" to "(foo bar baz foo bar baz foo meow bar baz)", only three entries in the memoization table are invalidated (the outer entry, the last "foo" entry, and the last "bar" entry). We need to update last "baz" entry in the memoization table to give it new indices. After that, we parse the string, resulting in four new entries (the outer entry, the last "foo" entry, the "meow" entry, and the last "bar" entry). Apologies for the long email. I thought this illustration of the algorithm might be useful. On Mon, May 6, 2013 at 5:19 PM, Juancarlo Añez <apal...@gmail.com> wrote: > Hello Dinesh, > > Yeah! I was probably not quite on target. Sorry. > > I don't see the connection of your answer Juancarlo and Francisco's >> question on incremental parsing. >> >> Who is talking about error recovery? >> > > The principle is the same. PEG parsers are free to manipulate the input > stream. > >> I am not aware of an existing PEG implementation that saves the >> memoization table in order to allow efficient re-parsing of a slightly >> altered version of the input. >> > PEG doesn't have to be Packrat. But you're right: changing the input > stream would throw off memoization in a Packrat parser. > > >> In fact it seems that Packrat is fairly memory hungry, and that attempts >> are made to drop this table as soon as possible. >> > > Mizushima and others on this list have work showing that memory use in > Packrat parsers can be sublinear with the addition of cuts, and that cut > insertion can be automated. > > Cheers, > > -- > Juancarlo *Añez* >
_______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg