@amon: A very nice summary.  Thanks!

@mdorman: At this point many people have more & more recent experience with
Marpa::R2 applications than I do.  [ Though I still know the internals
better than anyone, I guess. :-) ]  amon is an expert, and also the author
of some very good tutorials.

On Tue, Nov 21, 2017 at 1:06 PM, amon <[email protected]> wrote:

> A high cost in stack_step is not entirely unsurprising, but if it is
> taking too long subjectively then there are some points that might be
> fixable. Note that stack_step() implements the semantics of your parser,
> which are only run *after* a successful parse. I don't think using events
> in the recognizer phase contributes to this cost.
>
> One important strategy is to make sure you're handling as much of your
> grammar as possible on the L0 level, not on the G1 level. This might not be
> always possible since this can affect the language matched by locally
> ambiguous language. However, a G1 rule such as "digit ::= '0' | '1' | ... |
> '9'" is almost certainly wrong. The L0 level can handle character-level
> matching more efficiently since it doesn't need to go through the
> `stack_step()` evaluation.
>
> Where possible, prefer built-in semantics such as `::first` over providing
> your own callbacks. Ignoring the values of unnecessary symbols (like "Rule
> ::= Interesting (Boring)") may be necessary to use built-in semantics, but
> is a great idea anyway since it avoids unnecessary evaluations.
>
> The structure of your grammar should describe the AST you're trying to
> produce as closely as possible. With Marpa it is not necessary to rewrite
> your grammar to satisfy the restrictions of the parser (aside from
> extracting sequence rules). So if you translated your Yapp grammar
> directly, there might be unnecessary rules. But this advice is already
> about micro-optimization and unlikely to result in a phenomenal speedup.
>
> All in all, I am surprised by your 2x slowdown since the semantics of Yapp
> and Marpa are comparable (doing less work is faster). Possibly, having
> moved from your hand-written lexer to Marpa's scanless interface may have
> caused some problem, e.g. suddenly having an order of magnitude more
> lexemes per document than before. But this kind of issue can't be discussed
> productively without seeing the grammar in question.
>
>
> On Monday, 20 November 2017 22:52:13 UTC+1, [email protected] wrote:
>>
>> Hey, all,
>>
>>
>>
>> I’ve been hoping for some time to replace an existing
>> parser---implemented using YAPP and a custom lexer---with a parser
>> implemented using Marpa::R2.
>>
>>
>>
>> After a moderate amount of work, I finally got a working parser.
>>
>>
>>
>> The grammar is far easier to understand and work with than the old YAPP
>> parser; because we’re dealing with an indentation based format, I do have
>> to use a discard event to track indentation depths and emit indentation
>> tokens and the like.
>>
>>
>>
>> However, it’s disappointingly slow, taking about twice as long to parse
>> our full set of files.  That was *really* unexpected.
>>
>>
>>
>> Immediately suspecting my code, I cranked my pathological case through
>> Devel::NYTProf; imagine my surprise when the top entry in the list of ‘top
>> 15 subroutines’ was an xsub: Marpa::R2::Thin::V::*stack_step*
>> <https://groups.google.com/forum/parser-1-line.html#Marpa__R2__Thin__V__stack_step>.
>> And this wasn’t by a small amount: that routine took 556s out of 590s or
>> so, and the next most expensive routine is Marpa::R2::Thin::SLR::read at
>> 12.7s.  These completely dominated the total runtime.
>>
>>
>>
>> While I will try to sanitize my parser and make it postable so people can
>> perhaps point out problems, I thought I might first just ask: is that sort
>> of high cost in stack_step generally representative of some sort of problem
>> in the grammar?  Is there some well-known construct that leads to a blow-up
>> that is easily eliminated, etc.?  Is there something I could log or examine
>> that might shed some light?
>>
>>
>>
>> Any guidance would be appreciated.
>>
>>
>>
>> Michael Dorman
>>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to