With great interest I have read the paper on the Marpa parser. This summer
> I have created my own Earley parser, a probabilistic one
> <https://github.com/digitalheir/java-probabilistic-earley-parser/>, and I
> am looking to implement some of Marpa's innovations in my own.
>
> Right now I am working on error handling. One person suggested I
> implement synchronizing rules like in shift-reduce parsers
> <https://github.com/digitalheir/java-probabilistic-earley-parser/issues/5#issuecomment-285617145>.
> I figured panic mode should behave a bit differently for probabilistic
> ambiguous grammars, so I settled on the following (text taken from issue on
> Github):
>
> Consider the following grammar:
>
> S -> AB C; (0.5)
> S -> A B ; (0.5)
>
> A -> a a ; (0.5)
> B -> b b ; (0.5)
> C -> c c ; (0.5)
>
> AB -> a a ; b b ; (0.5)
>
> A -> a <error> ; (0.5)
> B -> b <error> ; (0.5)
> AB -> a <error> ; (0.5)
>
> On the input a a ; b z ; c c ; with goal S. When encountering z, the
> parser could apply both B -> b <error> ; and AB -> a <error> ;. We would
> like to consider AB because that's the only way to make S. So we need to
> consider both error rules.
>
> Now consider the same grammar, except the first rule is S -> AB; on the
> same input. We have established that we want to consider both error rules,
> and in this case both make a valid parse. However, we want the probability
> of AB -> a <error> ; being applied to be lower than that of B -> b
> <error> ;, because we want to keep the number of ignored tokens minimal.
>
> That's why I have decided to make the error rule probability multiply
> every time a token is ignored. The inner score of applying B -> <error> ; is
> then be 0.5^2, and the inner score of AB -> <error> ; is 0.5^4.
>
>
> I suspect that you have more sophisticated knowledge than I do about
> Earley parsing, so I seek out your opinion: do you think this is a good
> idea? It also makes me a bit uncomfortable that above grammar fails
> terminally on a a ; b z ; c c ; cc ; , because the parser gets out of
> panic mode after encountering a semicolon. But maybe that should be a
> problem for another error handling strategy.
>
> Which bring me to the Ruby slippers error handling. I don't know if I have
> a good idea of how this works. The way I interpret is as follows: the
> parser might stall on input which prohibits it from succeeding (no states
> follow from predict, no states follow from scan or there are no states to
> complete). Because the chart will always contain some active states, you
> know what the parser expects to succeed and you could let the user decide
> whether to manipulate the input stream so that the parser may proceed given
> this information (eg, drop token, insert a token, etc). Is that correct?
>
> Thanks for your help!
>
> Maarten Trompper
>

-- 
You received this message because you are subscribed to the Google Groups 
"marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to