i was recently asked about a possible extension of the Aycock-Horspool 
method.  In reply I noted that Marpa no longer uses Aycock-Horspool items, 
and gave the reasons:

I used the AH items, but eventually decided to go back and rip them out.  I 
discovered they bought little or nothing in terms of measurable speed-up, 
and at a very real cost.

AH's own implementation apparently had bugs due to duplication of Earley 
items.  I solved these, but it made the code very complex and slower.   
Eventually 
I ran some numbers and realized that most of the compression that the AH 
method bought was in predictions, which could be compressed in other, less 
complicated, ways.

Using AH items means that whenever something needs to be done in terms of 
the user's grammar, there has to be a translation.  Since the AH item to 
Earley item relationship is many-to-many, this is not trivial.  I was 
eventually able to implement Marpa features that required checking items at 
real-time, translating back and forth on the fly into traditional Earley 
items, but I realized that the complexity of dealing with AH items would be 
a very serious obstacle to future work.  By the time I backed out the AH 
items, I'd invested considerable more time into working them than anyone, 
including (I believe) Aycock and Horspool themselves.

Even if you don't want the flexiblity of working with Earley items in 
realtime, any compression of Earley items must be undone at evaluation 
time.  In addition to AH, a few papers have described ways to reduce the 
number of Earley items, but these focus on taking things as far as the 
recognizer -- they produce the Earley tables and stop.  Often the question 
of the cost of undoing the compression at evaluation time is not addressed, 
and the cost of the more complex evaluation phase seems likely to eliminate 
any speed-up in the recognizer.

I want to point out that in that same paper, Aycock and Horspool, as a side 
issue, describe a way of handling nulled symbols and productions via 
grammar rewrites.  This proved very valuable and Marpa *does* continue to 
use that.  It's a bit ironic that I ended up ignoring what AH's paper was 
intended to be about, but that what (I think) they present as little more 
than an implementation trick has proved central to my efforts.  But because 
of their solution to the question of nullable symbols and rules, I consider 
Marpa to be a parser in the Aycock-Horspool lineage, despite its 
renunciation of AH items.

-- 
You received this message because you are subscribed to the Google Groups 
"marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to marpa-parser+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to