Hey, all,

 

I’ve been hoping for some time to replace an existing parser (implemented 
using YAPP and a custom lexer) with a parser implemented using Marpa::R2.

 

After a moderate amount of work, I finally got a working parser.

 

The grammar is far easier to understand and work with than the old YAPP 
parser. Because we’re dealing with an indentation-based format, I do still 
have to use a discard event to track indentation depth and emit indentation 
tokens and the like.
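
For illustration only, the indentation bookkeeping described above amounts to something like the following Python sketch. It is not the Marpa::R2 discard-event handler itself; the function name, the token names (INDENT, DEDENT, LINE), and the spaces-only assumption are all hypothetical.

```python
def indentation_tokens(lines):
    """Yield (kind, value) tokens for an indentation-based format.

    Hypothetical sketch: assumes indentation uses spaces only and that
    blank lines never change the current depth.
    """
    stack = [0]  # indentation depths of the currently open blocks
    for raw in lines:
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # skip blank lines
        depth = len(text) - len(text.lstrip(" "))
        if depth > stack[-1]:
            # Deeper than the enclosing block: open a new block.
            stack.append(depth)
            yield ("INDENT", depth)
        else:
            # Shallower: close blocks until the depths match again.
            while depth < stack[-1]:
                stack.pop()
                yield ("DEDENT", depth)
            if depth != stack[-1]:
                raise ValueError(f"inconsistent dedent to column {depth}")
        yield ("LINE", text.strip())
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield ("DEDENT", 0)
```

In a setup like the one described, the tracker would run alongside the lexer and the INDENT/DEDENT tokens would be handed to the parser as ordinary terminals.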

 

However, it’s disappointingly slow, taking about twice as long as the old 
parser to parse our full set of files.  That was *really* unexpected.

 

Immediately suspecting my code, I cranked my pathological case through 
Devel::NYTProf; imagine my surprise when the top entry in the list of ‘top 
15 subroutines’ was an xsub: Marpa::R2::Thin::V::stack_step.
  
And this wasn’t by a small amount: that routine took 556s out of 590s or 
so, and the next most expensive routine was Marpa::R2::Thin::SLR::read at 
12.7s.  Together they account for nearly all of the runtime.

 

While I will try to sanitize my parser and make it postable so people can 
perhaps point out problems, I thought I might first just ask: is that sort 
of high cost in stack_step generally representative of some sort of problem 
in the grammar?  Is there some well-known construct that leads to a blow-up 
that is easily eliminated, etc.?  Is there something I could log or examine 
that might shed some light?

 

Any guidance would be appreciated.

 

Michael Dorman
