@amon: A very nice summary. Thanks!

@mdorman: At this point many people have more, and more recent, experience with Marpa::R2 applications than I do. [ Though I still know the internals better than anyone, I guess. :-) ] amon is an expert, and also the author of some very good tutorials.
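To make the L0-versus-G1 and `::first` advice in amon's message (quoted below) concrete, here is a minimal sketch. It is a hypothetical toy grammar, not Michael's: the symbol names (Assignment, Name, Value, Number) and the input string are made up for illustration.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical toy grammar: a single "name = value" assignment.
my $dsl = <<'END_OF_DSL';
:default ::= action => ::first
lexeme default = latm => 1

# G1 (structural) level. The '=' is hidden by the parentheses, so its
# value is never passed to the semantics.
Assignment ::= Name ('=') Value action => [values]

# No callback here: the built-in ::first default passes the child through.
Value ::= Number | Name

# L0 (lexical) level: character matching stays in the lexer. A G1 rule
# such as  Digit ::= '0' | '1' | ... | '9'  would instead produce one
# G1 symbol, and one evaluation step, per character.
Name   ~ [a-zA-Z_]+
Number ~ [0-9]+

:discard   ~ whitespace
whitespace ~ [\s]+
END_OF_DSL

my $grammar   = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $input     = 'x = 42';
my $value_ref = $grammar->parse( \$input );

# $$value_ref is an array ref holding the values of the visible symbols.
print join( ', ', @{ ${$value_ref} } ), "\n";
```

The point of the sketch is only the division of labour: whole tokens are recognized at L0, and the G1 rules carry nothing but structure that the AST actually needs.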
On Tue, Nov 21, 2017 at 1:06 PM, amon <[email protected]> wrote:

> A high cost in stack_step is not entirely surprising, but if it is taking too long subjectively then there are some points that might be fixable. Note that stack_step() implements the semantics of your parser, which are only run *after* a successful parse. I don't think using events in the recognizer phase contributes to this cost.
>
> One important strategy is to make sure you're handling as much of your grammar as possible on the L0 level, not on the G1 level. This might not always be possible, since it can change the language that is matched where the grammar is locally ambiguous. However, a G1 rule such as "digit ::= '0' | '1' | ... | '9'" is almost certainly wrong. The L0 level can handle character-level matching more efficiently since it doesn't need to go through the `stack_step()` evaluation.
>
> Where possible, prefer built-in semantics such as `::first` over providing your own callbacks. Ignoring the values of unnecessary symbols (like "Rule ::= Interesting (Boring)") may be necessary to use built-in semantics, but is a great idea anyway since it avoids unnecessary evaluations.
>
> The structure of your grammar should describe the AST you're trying to produce as closely as possible. With Marpa it is not necessary to rewrite your grammar to satisfy the restrictions of the parser (aside from extracting sequence rules). So if you translated your Yapp grammar directly, there might be unnecessary rules. But this advice is already about micro-optimization and unlikely to result in a phenomenal speedup.
>
> All in all, I am surprised by your 2x slowdown since the semantics of Yapp and Marpa are comparable (doing less work is faster). Possibly, having moved from your hand-written lexer to Marpa's scanless interface may have caused some problem, e.g. suddenly having an order of magnitude more lexemes per document than before. But this kind of issue can't be discussed productively without seeing the grammar in question.
>
> On Monday, 20 November 2017 22:52:13 UTC+1, [email protected] wrote:
>>
>> Hey, all,
>>
>> I’ve been hoping for some time to replace an existing parser---implemented using YAPP and a custom lexer---with a parser implemented using Marpa::R2.
>>
>> After a moderate amount of work, I finally got a working parser.
>>
>> The grammar is far easier to understand and work with than the old YAPP parser; because we’re dealing with an indentation based format, I do have to use a discard event to track indentation depths and emit indentation tokens and the like.
>>
>> However, it’s disappointingly slow, taking about twice as long to parse our full set of files. That was *really* unexpected.
>>
>> Immediately suspecting my code, I cranked my pathological case through Devel::NYTProf; imagine my surprise when the top entry in the list of ‘top 15 subroutines’ was an xsub: Marpa::R2::Thin::V::*stack_step* <https://groups.google.com/forum/parser-1-line.html#Marpa__R2__Thin__V__stack_step>. And this wasn’t by a small amount: that routine took 556s out of 590s or so, and the next most expensive routine is Marpa::R2::Thin::SLR::read at 12.7s. These completely dominated the total runtime.
>>
>> While I will try to sanitize my parser and make it postable so people can perhaps point out problems, I thought I might first just ask: is that sort of high cost in stack_step generally representative of some sort of problem in the grammar? Is there some well-known construct that leads to a blow-up that is easily eliminated, etc.? Is there something I could log or examine that might shed some light?
>>
>> Any guidance would be appreciated.
>>
>> Michael Dorman

--
You received this message because you are subscribed to the Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
