In general, it seems that Earley, especially with Marpa's enhancements, 
matches or improves on most other parsing algorithms. But its complexity 
seems to be judged with respect to the length of the input string, not the 
size of the grammar.

I’m not great at complexity proofs, and I was wondering whether Earley 
retains its performance advantages over CYK for very large numbers of 
rules. If I understand correctly, it has to add Earley items for every 
rule, top down, and examine them. So for an NLP project I am doing, with 
many thousands of necessarily distinct rules, would its space/time 
complexity suffer greatly?
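
To make my worry concrete, here is a rough Python sketch of the 
prediction step as I understand it (the names Item, predict, and rules, 
and the grammar shape, are my own invention, not Marpa's internals):

from collections import defaultdict, namedtuple

# An Earley item: rule, dot position, and origin (where the item started).
Item = namedtuple("Item", ["lhs", "rhs", "dot", "origin"])

# Hypothetical grammar shaped like my example below:
# start -> r1 | ... | r_n, each r_i starting with its own terminal T_i.
n = 1000  # many thousands in my real grammar
rules = defaultdict(list)
for i in range(n):
    rules["start"].append(("r%d" % i,))
    rules["r%d" % i].append(("T%d" % i, "rest"))

def predict(item_set, nonterminal, position):
    # Add an item (dot at 0) for every alternative of `nonterminal`.
    for rhs in rules[nonterminal]:
        item_set.add(Item(nonterminal, rhs, 0, position))

s0 = set()
predict(s0, "start", 0)          # n items, one per alternative of start
for item in list(s0):
    predict(s0, item.rhs[0], 0)  # n more items, one per r_i
print(len(s0))                   # 2n items before the first token is scanned

If this picture is roughly right, S(0) alone grows linearly with the 
number of alternatives, which is the cost I am worried about.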

For example, if every rule started with a distinct terminal, would Earley 
have to add and scan every one of them in S(0), while CYK could select 
just the relevant rule(s) in constant time (and with less space)? As in:

start: r1 | r2 | r3 | … | r_i

r1: "Print" …
r2: "Set" …
r3: "Walk" …
….
r_i: T_i …

Then in S(0) we have to add and scan rules r1 … r_i, while CYK (on the 
CNF form of the grammar) would immediately select only the small subset 
of r1 … r_i that matches, without examining all the other rules.
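
By contrast, here is the kind of constant-time selection I have in mind 
for CYK's base case (again only a sketch with names of my own, not any 
particular implementation): index the CNF lexical rules by their 
terminal, so scanning a word touches only the matching rule(s).

from collections import defaultdict

# CNF lexical rules, keyed by their (distinct) leading terminal.
lexical = defaultdict(list)
lexical["Print"].append("r1")
lexical["Set"].append("r2")
lexical["Walk"].append("r3")
# ... thousands more, one entry per distinct terminal

def base_cell(word):
    # CYK base case: the nonterminals that derive `word`, found with one
    # hash lookup, without examining any of the other rules.
    return lexical[word]

print(base_cell("Set"))  # ['r2'] -- only the relevant rule is selected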

Thanks for your help.
