On Saturday, 7 July 2012 at 22:05:05 UTC, Dmitry Olshansky wrote:
That would not always work, but yes that's what we should do IMHO.
I didn't do a deeper research on this, just shared my current understanding. When that doesn't work in a particular case, we will need to find the problem and solve that.

It's not what packrats do, however (when I say "algorithm" I mean specifically packrat). And PEGs are commonly associated with packrats.
As Philippe Sigaud admitted, that is an incorrect association.

it's obvious how backtracking comes in ... when, say, A fails somewhere in the middle of it.
Simply stop tracking the respective state. Process the others in parallel, as I described above.
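A minimal sketch of that idea (the patterns and names here are hypothetical, not from this thread): simulate all alternatives in parallel, Thompson-NFA style, and when one alternative's next expected character doesn't match, just drop that "thread" instead of backtracking.

```python
def parallel_match(alternatives, text):
    # Each live "thread" is (alternative index, position in that alternative).
    threads = {(i, 0) for i in range(len(alternatives))}
    for ch in text:
        next_threads = set()
        for alt, pos in threads:
            pat = alternatives[alt]
            if pos < len(pat) and pat[pos] == ch:
                next_threads.add((alt, pos + 1))
            # else: this branch failed here -- simply stop tracking it
        threads = next_threads
        if not threads:
            return None            # every branch is dead
    # Any thread that consumed its whole pattern is a full match.
    for alt, pos in threads:
        if pos == len(alternatives[alt]):
            return alt
    return None

print(parallel_match(["abc", "abd"], "abd"))  # -> 1 (second alternative)
```

The point is that failure of one branch never rewinds the input; the other branches have already consumed the same characters in lockstep.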

Yeah, it could be done, but unlike with a regular grammar it doesn't buy you a thing in a lot of cases. The problem is that the state is not as simple as in a regex, for instance (even for a regex that must capture some submatches, a DFA won't do, BTW). Thus you can't merge seemingly identical states, because they depend on (different) interpretations of the previous part of the input.
Well, if you must track the path to a particular state, then a DFA simply doesn't match the problem and there's nothing we can do.

That's why DFA in ANTLR is only used to do lookahead or lexing, it doesn't maintain path through states during parsing.
See above. Is maintaining the path really necessary? I thought that a DFA with additional states could be constructed for any particular ambiguous situation, etc. This needs further research.
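For what "DFA used only for lookahead" means in practice, here is a hedged sketch (the grammar fragment and names are made up for illustration): a small hand-built decision DFA scans ahead just far enough to predict which alternative to parse, then is thrown away; no path through its states is ever kept. Assume alternatives `"if" ...`, `"int" ...`, and `ident "=" ...`, with no word-boundary check for simplicity.

```python
# transitions: state -> {char -> state}; ACCEPT: state -> predicted alternative
TRANSITIONS = {
    0: {'i': 1},
    1: {'f': 2, 'n': 3},
    2: {},            # saw "if"
    3: {'t': 4},
    4: {},            # saw "int"
}
ACCEPT = {2: "if_stmt", 4: "decl"}

def predict(text):
    state = 0
    for ch in text:
        if state in ACCEPT:          # decision made; lookahead stops here
            break
        state = TRANSITIONS.get(state, {}).get(ch)
        if state is None:
            return "assignment"      # default alternative
    return ACCEPT.get(state, "assignment")

print(predict("int x;"))   # -> decl
print(predict("x = 1"))    # -> assignment
```

Once `predict` returns, the actual parsing of the chosen alternative proceeds normally; the lookahead DFA contributed only a single decision, not any parse state.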

Of course there is a fair amount of bookkeeping that reduces doing
work all over again but it does ... allocate.

The amount of used memory would be proportional to the length of input after the branching point times the number of rules (actually, of states in the DFA, which is likely larger but still finite for a particular grammar).

Finite and proportional to input size? Isn't that just another way of stating the O(n) memory requirement of packrats?
Sorry, I incorrectly included the O(n) multiplier. It should be finite. When you go to the next character, you don't need the previous states any more.

Yup. But I'm not talking definitions here; I'm talking about the notion of "integrated lexing", and lexing is certainly about characters, or was it?

I thought integrated lexing was about doing it along with parsing and specifying its rules directly inside the grammar. From the former we get structural context to narrow down the available options, and from the latter we get a uniform grammar description.
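To make that concrete, here is a hedged sketch of a scannerless parser (the grammar is invented for illustration, PEG-style: `assign <- ident spacing '=' spacing number`): the lexical rules (spacing, identifier, number) are ordinary character-level rules living in the same grammar as the structural rule, so the parser's position in the structure decides what shape of text to expect next.

```python
def parse_assign(text, pos=0):
    # Structural rule; it invokes lexical rules directly on characters.
    name, pos = parse_ident(text, pos)
    pos = parse_spacing(text, pos)
    if pos >= len(text) or text[pos] != '=':
        raise SyntaxError("expected '='")
    pos = parse_spacing(text, pos + 1)
    value, pos = parse_number(text, pos)
    return (name, int(value)), pos

def parse_ident(text, pos):
    start = pos
    while pos < len(text) and text[pos].islower():
        pos += 1
    if pos == start:
        raise SyntaxError("expected identifier")
    return text[start:pos], pos

def parse_number(text, pos):
    start = pos
    while pos < len(text) and text[pos].isdigit():
        pos += 1
    if pos == start:
        raise SyntaxError("expected number")
    return text[start:pos], pos

def parse_spacing(text, pos):
    while pos < len(text) and text[pos] == ' ':
        pos += 1
    return pos

print(parse_assign("x = 42")[0])  # -> ('x', 42)
```

There is no separate token stream: `parse_assign` knows from context that an identifier must come first and a number last, which is exactly the structural narrowing mentioned above.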
