On Saturday, 7 July 2012 at 22:05:05 UTC, Dmitry Olshansky wrote:
> That would not always work, but yes, that's what we should do IMHO.
I haven't done deeper research on this; I just shared my current understanding. When that doesn't work in a particular case, we will need to find the problem and solve it.
> It's not what packrats do, however (when I say algorithm I mean specifically packrat). And PEGs are commonly associated with packrats.
As Philippe Sigaud admitted, that is an incorrect association.
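To make the distinction concrete, here is a minimal sketch (my own illustration, in Python for brevity; the grammar `Sum <- Num ('+' Num)*` is made up): a PEG can be parsed by plain recursive descent, and the packrat memo table is an optional optimization, not part of PEG semantics.

```python
# Hypothetical grammar: Sum <- Num ('+' Num)* ; Num <- [0-9]+
# No memoization anywhere -- this is still a valid PEG parser.

def parse_num(s, pos):
    """Num <- [0-9]+  Returns the new position, or None on failure."""
    end = pos
    while end < len(s) and s[end].isdigit():
        end += 1
    return end if end > pos else None

def parse_sum(s, pos):
    """Sum <- Num ('+' Num)*"""
    pos = parse_num(s, pos)
    if pos is None:
        return None
    while pos < len(s) and s[pos] == '+':
        nxt = parse_num(s, pos + 1)
        if nxt is None:   # this repetition step fails: keep what matched so far
            break
        pos = nxt
    return pos

print(parse_sum("1+23+4", 0))  # → 6 (the whole input was consumed)
```

A packrat parser is this same structure plus a memo table keyed on (rule, position); the table changes the complexity class, not the language recognized.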
>>> It's obvious how backtracking comes in ... when, say, A fails somewhere in the middle of it.
>> Simply stop tracking the respective state. Process the others in parallel, as I described above.
> Yeah, it could be done, but in a lot of cases it doesn't buy you a thing, unlike in a regular grammar. The problem is that the state is not as simple as in a regex, for instance (even for regexes, a DFA won't do when submatches must be captured, BTW). Thus you can't merge seemingly identical states, because they depend on (different) interpretations of the previous part of the input.
> Well, if you must track the path to a particular state, then a DFA simply doesn't match the problem and there's nothing we can do. That's why the DFA in ANTLR is only used for lookahead and lexing; it doesn't maintain a path through states during parsing.
See above. Is maintaining the path really necessary? I thought that a DFA with additional states could be created for any particular situation of ambiguity, etc. This needs further research.
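For illustration, here is roughly what I mean by processing alternatives in parallel (a Python sketch over a made-up automaton, not a claim about any particular implementation): a failing alternative is simply dropped from the current state set, so no backtracking is ever performed.

```python
# Hypothetical NFA for the regular language (ab|ac)*:
# transitions[state][char] -> set of next states
transitions = {
    0: {'a': {1}},
    1: {'b': {0}, 'c': {0}},
}
ACCEPTING = {0}

def matches(s):
    states = {0}                       # current frontier of live alternatives
    for ch in s:
        nxt = set()
        for st in states:
            nxt |= transitions.get(st, {}).get(ch, set())
        states = nxt                   # failed alternatives simply vanish here
        if not states:
            return False               # every alternative died: overall failure
    return bool(states & ACCEPTING)

print(matches("abac"))  # → True
print(matches("abb"))   # → False
```

Whether the same trick carries over cleanly from regular languages to full context-sensitive parse state is exactly the open question above.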
>>> Of course there is a fair amount of bookkeeping that reduces doing the work all over again, but it does ... allocate.
>> The amount of used memory would be proportional to the length of input after the branching point times the number of rules (actually, of states in the DFA, which is likely larger but still finite for a particular grammar).
> Finite and proportional to input size? Isn't that just another way of stating the O(n) memory requirement of packrats?
Sorry, I incorrectly included the O(n) multiplier. It should be finite: when you go to the next character, you don't need the previous states any more.
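A quick sketch of why the memory stays finite (my illustration, over a made-up automaton): only the current frontier of states is kept per input character, and its size is bounded by the number of automaton states, not by how much input has been consumed.

```python
# Hypothetical automaton; what matters is only that its state set is finite.
transitions = {0: {'a': {0, 1}}, 1: {'a': {0}}}

def max_frontier(s):
    """Run the parallel simulation and report the largest frontier ever held."""
    states, peak = {0}, 1
    for ch in s:
        states = {n for st in states
                    for n in transitions.get(st, {}).get(ch, set())}
        peak = max(peak, len(states))  # previous frontiers are already gone
    return peak

print(max_frontier("a" * 10_000))  # → 2: bounded, regardless of input length
```

Contrast this with a packrat memo table, which by design retains entries for every (rule, position) pair and is therefore O(n) in the input.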
> Yup. But I'm not talking definitions here; I'm talking about the notion of "integrated lexing", and lexing is certainly about characters... or was it?
I thought integrated lexing was about doing it along with parsing and specifying its rules directly inside the grammar. From the former we get structural context to narrow down the available options, and from the latter a uniform grammar description.
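As a sketch of what I mean (Python, with a made-up two-rule grammar; none of this comes from an actual implementation): the "token" rules are just ordinary character-level rules called from the syntax rules, so the parser's structural context decides what gets scanned next.

```python
# Hypothetical grammar: Decl <- 'let' Space Ident ; Ident <- [a-z]+
# Lexical rules (space, ident) and syntax rules (decl) live in one grammar.

def lit(text):
    """Build a rule matching a literal string."""
    def rule(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return rule

def space(s, pos):                      # character-level "token" rule
    end = pos
    while end < len(s) and s[end] == ' ':
        end += 1
    return end if end > pos else None

def ident(s, pos):                      # character-level "token" rule
    end = pos
    while end < len(s) and s[end].islower():
        end += 1
    return end if end > pos else None

def decl(s, pos):                       # syntax rule invoking lexical rules
    for rule in (lit('let'), space, ident):
        pos = rule(s, pos)
        if pos is None:
            return None
    return pos

print(decl("let foo", 0))  # → 7
```

There is no separate token stream: `decl` only ever asks for an identifier because its own structure says one must come next, which is the "structural context narrows the options" part.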