On 2014-09-20 02:27PM, Loup Vaillant-David wrote:
>In any case, they should come soon.  First I'll finish the
>recogniser (I have to correct the empty rules bug using Aycock &
>Horsepool (that's easy), and optimise right recursive rules using
>Leo (haven't figured that out yet).

Cool. I'll look forward to it.

>Actually, you don't need the back pointers.  Plain Earley items are
>enough.  Even better, you don't need all the items.  You only need the
>completed ones.

Sure, it's just a classic space/time tradeoff. The recognizer
knows how it is deriving each match. So you can either save that
information as back pointers, or you can reconstruct the
connections later by searching.

>This business with back pointers is needlessly complicated, I found.

Hmm...did you try drawing out where they point to? For instance,
with the example you gave back in June, if you start with the
top-level match and follow the pointers, you get this:

http://www.arestlessmind.org/2014/09/20/parse-tree.jpg

You can see that you don't care about the contents of the left
back-pointer -- it's just reversing through the rule. But they
form the structure which puts the right sub-trees in order.

Actually, you shouldn't even need to check whether matches are
complete: if only the completion operation adds back pointers,
then you know that the right side is a complete match and the left
side isn't.


>But there's another, more annoying ambiguity:
>
>   A -> ?? ??? (i)
>   A -> ?? ??? (j)
>
>Sometimes, the same rule can span different lengths.  That's typical
>of repetition rules.  So you have two choices:
>
>1. Just apply the longest match rule.
>2. Dig both rules in parallel, search for the "topmost, first"
>   difference (whatever that means), and apply prioritised choice
>   there.

Oh, right. Now that you mention it, I *have* seen examples of
practical grammars with that sort of ambiguity. I don't know that
there's any good solution to that. IIRC it tends to be "real"
ambiguity where the user (the person writing the source, not the
person writing the grammar) could mean either thing. So I suspect
you have to just go with the longest match, unless you want to
provide for context-sensitive disambiguation or something...

That's my wild-ass guess, anyway. :)

--Josh
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc

Reply via email to