Re: Undo's

Jeffrey Kegler Sat, 25 Jan 2014 10:25:11 -0800

Yes, you're on the right track. There is really a vast potential forundo's -- the immediate reason that I rewrote the parse engine was tomake them easier to implement, which I wouldn't have done for anyordinary new feature.

Here's how they relate to "forgiving lexemes": Right now if a lexeme isnot acceptable to G1, but is "forgiven", the lexer backs up and looksfor an acceptable lexeme. With undos, I can start the lexer, but thenimmediately "undo" the predictions of any lexemes which will not beacceptable to G1. These lexemes won't even be looked for, which is moreefficient.

The ability for the parser to say, "you know, on second thought, nevermind" is useful in all sort of contexts, many of which I hope will occurto others before they occur to me.

Incidentally, phase 3 will offer a limited ability to "undo", targetedat the case I just mentioned -- forgiving predictions at location 0 inorder to implement "cheap forgiveness".

Undo's will also be available through the THIF, and those of you lookingat writing your own Marpa interfaces should be able to get a lot ofmileage out of them.


-- jeffrey

On 01/25/2014 05:22 AM, Ruslan Shvedov wrote:

I was thinking about use cases for Undo's and here is the braindump:

    this use case for forgiving adverb
    <https://gist.github.com/rns/8473261> to test for literals
    depending on the symbol which follows them


    MarpaX::Regex::Verbal I wrote about today in another thread on
    this list
    <https://groups.google.com/d/msg/marpa-parser/2TZn5nolyzk/CY597CfNT2IJ> to
    implement look-ahead per Regex features directly rather than
    invoke a regex for them.

    47 /*_maybe/ rules in Jean-Damien's MarpaX::Languages::C::AST
    
<https://github.com/jddurand/MarpaX-Languages-C-AST/blob/master/lib/MarpaX/Languages/C/AST/Grammar/ISO_ANSI_C_2011.pm>
    C grammar

    Marpa-powered Lex: Lex regexes with Marpa-powered extensions


Am I on the right track?

On Sun, Jan 19, 2014 at 4:25 AM, Jeffrey Kegler<[email protected]<mailto:[email protected]>> wrote:


    I am working at adding "undo's" to Libmarpa -- the ability, while
    a parse
    is in progress, to go back and erase or "undo" some of the rules it
    has recognized.

    Potential applications include zero-width assertions. Zero-width
    assertions are things like word boundaries and line endings, etc.
    Regexes think character by character and allow zero-width
    assertions about
    adjacent characters in an easy and general way. Regexes aren't really
    parsers -- they're state machines that run deal in character
    sequences.
    But if the problem you're solving solves easily that way (and many
    do),
    then regexes are perfect.

    Marpa is for languages that regexes won't handle well.  And everything
    regexes do, Marpa can do today, though it may require a different
    approach.  But there are cases where you want Marpa to, just for a
    moment,
    think like a regex engine.  You want to be able to put zero-width
    assertions into your BNF, just as if your BNF rule was a regex.
    Undo's will make this possible.

    A probable first application of undo's will be to the SLIF' s
    "forgiving" lexemes.  These were recently introduced to Marpa and
    allow, in addition to the traditional longest-tokens-matching (LTM),
    longest-acceptable-tokens-matching (LATM).  That is, if you use
    forgiving
    tokens, Marpa can be smart about lexing, and use the context of
    the parse
    to determine which lexemes are or are not relevant. But in the current
    implementation, it's safest to use LATM on a token by token basis
    -- in
    some cases forgiving tokens can cause an efficiency hit.  With undo's,
    Marpa will be able to do LATM efficiently in the general case.

    Undo's will be a by-product of the Marpa parse engine rewrite
    currently
    underway.  Undo's are, of course, on a rule by rule basis.  But
    Marpa's
    parse engine had used a technique from Aycock & Horspool's 2002 paper,
    and tracked, not rules and positions in them, but sets of
    rule-positions
    grouped in "LR(0) states".

    At this point I have considerable experience with the use of LR(0)
    states for Earley parsing and I believe that in practice and on
    average,
    when it comes to speed, their use is without benefit. In theoretical,
    time complexity, terms, LR(0) states never offered any gains, so
    that the potential gains were always limited.  And implementing LR(0)
    states within Earley's algorithm was suprisingly complex, and resulted
    in a parser that was far harder to understand, maintain and extend.
    So Marpa's parse engine is going back to the traditional Earley method
    of tracking progress by rule and position.

    As long as Marpa's parse engine used LR(0) states, implementing undo's
    was possible, but daunting.  Now that the parse engine works directly
    using rules, the road to undo's becomes much easier.

--You received this message because you are subscribed to the Google

    Groups "marpa parser" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected]
    <mailto:marpa-parser%[email protected]>.
    For more options, visit https://groups.google.com/groups/opt_out.


--

You received this message because you are subscribed to the GoogleGroups "marpa parser" group.To unsubscribe from this group and stop receiving emails from it, sendan email to [email protected].

For more options, visit https://groups.google.com/groups/opt_out.


--
You received this message because you are subscribed to the Google Groups "marpa 
parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

Re: Undo's

Reply via email to