I know JavaScript is not Java, but Jean-Damien Durand has written several
full-language parsers with Marpa, including one for ECMAScript:
https://github.com/jddurand/MarpaX-Languages-ECMAScript-AST

On Thu, Oct 13, 2016 at 8:00 AM, Harry <simonsha...@gmail.com> wrote:

> Hello,
>
> I'm very new to Marpa but, from its description, it looks extremely
> awesome.
>
> I'm also done playing with the beginner's example of the expression
> calculator; was also able to make small changes to it. So far, so good.
>
> However, now, I'm trying to write a Java 8 Parser using the grammar
> published here:
>     https://docs.oracle.com/javase/specs/jls/se8/html/jls-19.html
>
> While I think I'm able to map the above Oracle grammar spec to the G1
> rules (if I stub out some of the lexemes referenced by the G1 rules) and
> create an instance of Marpa::R2::Scanless::G, I'm having a hard time
> writing the L0 lexer rules in SLIF for the Lexer grammar
> <https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html>.  Some
> issues that I will need to (but don't know how to) deal with are:
>
> 1. Keyword vs Identifier:
>
>   The Java spec defines Identifier
> <https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.8>
> thus:
>
>     Identifier:
>         IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
>     IdentifierChars:
>         JavaLetter {JavaLetterOrDigit}
>     JavaLetter:
>         any Unicode character that is a "Java letter"
>     JavaLetterOrDigit:
>         any Unicode character that is a "Java letter-or-digit"
>
>   *So, how do I do* the "not a Keyword or BooleanLiteral or NullLiteral"
> part? In Perl regex, one could do a negative lookahead assertion like so...
>
> if (m/ (?! $Keyword | $BooleanLiteral | $NullLiteral ) $IdentifierChars /x) {
>     # this is an Identifier
> }
>
>
> ... but that would require Marpa to accept such rich Perl regex syntax,
> which SLIF apparently does not.
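
On point 1, the "but not a Keyword" clause does not strictly need a negative lookahead. A minimal sketch in plain Perl (not SLIF), assuming it is acceptable to match IdentifierChars first and reject reserved words afterwards with a table lookup. The helper name classify_word and the ASCII-only character classes are my assumptions for illustration; the JLS allows a much wider set of Unicode "Java letters".

```perl
use strict;
use warnings;

# The 50 keywords of JLS 3.9 (Java SE 8), plus the boolean and null literals.
my %RESERVED = map { $_ => 1 } qw(
    abstract assert boolean break byte case catch char class const continue
    default do double else enum extends final finally float for goto if
    implements import instanceof int interface long native new package
    private protected public return short static strictfp super switch
    synchronized this throw throws transient try void volatile while
    true false null
);

# ASCII-only approximation of IdentifierChars (an assumption of this sketch).
my $IDENT_CHARS = qr/[A-Za-z_\$][A-Za-z0-9_\$]*/;

# Hypothetical helper: classify one complete candidate token.
sub classify_word {
    my ($word) = @_;
    return 'NOT_A_WORD' unless $word =~ /\A$IDENT_CHARS\z/;
    return 'RESERVED'   if $RESERVED{$word};
    return 'IDENTIFIER';
}

print "$_ => ", classify_word($_), "\n" for qw(while whileLoop null _x 9lives);
```

Because the whole word is matched before the lookup, "whileLoop" is classified as IDENTIFIER even though it starts with a keyword, while "while" and "null" come back as RESERVED and "9lives" as NOT_A_WORD.
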
>
> 2. Comment (single- and multi-line versions)
> I could write a bunch of G1 rules to handle the multi-line Java comment,
> but I'm seeing it becoming very verbose. Is there an easier way to handle
> stuff like this in SLIF?
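
On point 2, Java comments do not nest, so each form fits in a single pattern. In SLIF itself, comments are normally declared as L0 lexemes and thrown away with a `:discard` rule rather than spelled out as G1 rules. The sketch below is plain Perl, not SLIF, and only shows the shape of the patterns. The function name strip_comments is hypothetical, and Unicode escapes such as \u002f are deliberately not handled.

```perl
use strict;
use warnings;

# String and char literals are matched so that "//" inside them is left alone.
my $STRING = qr/"(?:[^"\\\n]|\\.)*"/;
my $CHAR   = qr/'(?:[^'\\\n]|\\.)*'/;
my $BLOCK  = qr{/\*.*?\*/}s;    # /* ... */, non-greedy, may span lines
my $LINE   = qr{//[^\n]*};      # // up to, but not including, the newline

# Hypothetical helper: replace every comment with one space, keep literals.
sub strip_comments {
    my ($src) = @_;
    $src =~ s{($STRING|$CHAR)|$BLOCK|$LINE}{ defined $1 ? $1 : ' ' }ge;
    return $src;
}

my $java = qq{int a = 1; /* multi\nline */ String s = "// kept"; // dropped\n};
print strip_comments($java);
```

Replacing a comment with a space rather than nothing keeps neighbouring tokens from being glued together, as in `a/**/b`.
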
>
> 3. Since Marpa is Perl-based, is it possible to tap the full power of Perl
> regex engine, especially for lexing?
>
> 4. Note that the Java 8 spec for recognizing tokens is in the form of a Lexer
> grammar <https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html>...
> that is written in BNF style instead of a 'flat', regex style. If I were to
> mechanically replicate the Lexer grammar using G1 rules (instead of L0
> rules), would it entail a performance and space overhead by creating
> unnecessary tree nodes for what would otherwise be a flat lexeme in
> bison/flex?
>
> 5. Would Marpa experts recommend using SLIF (internal scanner) for Java 8,
> or should I abandon it in favor of a custom / external lexer?
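
On points 3 and 5, if you do go the external-lexer route, the Perl side can stay small. A minimal sketch of a regex-driven tokenizer using \G with the /gc flags, assuming a tiny, made-up token table (the token names and the tokenize function are hypothetical, and the table covers only a fraction of the JLS lexical grammar). How the resulting tokens would be handed to Marpa is outside this sketch.

```perl
use strict;
use warnings;

# Ordered token table: the first pattern that matches at the current
# position wins, so COMMENT must come before OP (both can start with "/").
my @TOKEN_SPEC = (
    [ WS      => qr/\s+/ ],
    [ COMMENT => qr{/\*.*?\*/|//[^\n]*}s ],
    [ INT     => qr/[0-9]+/ ],
    [ WORD    => qr/[A-Za-z_\$][A-Za-z0-9_\$]*/ ],
    [ OP      => qr/[-+*\/=;(){}<>]/ ],
);

# Hypothetical helper: returns a list of [ name, text ] pairs.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    pos($src) = 0;
    TOKEN: while ( pos($src) < length $src ) {
        for my $spec (@TOKEN_SPEC) {
            my ( $name, $re ) = @$spec;
            # \G anchors at pos(); /c keeps pos() unchanged on failure.
            if ( $src =~ /\G($re)/gc ) {
                push @tokens, [ $name, $1 ]
                    unless $name eq 'WS' or $name eq 'COMMENT';
                next TOKEN;
            }
        }
        die sprintf "Unexpected character at offset %d\n", pos($src);
    }
    return @tokens;
}

print join( ' ', map { "$_->[0]($_->[1])" } tokenize('int x = 42; // hi') ), "\n";
```

A WORD token from this table would still need the reserved-word check before it can be called an Identifier.
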
>
>
> Regards,
> /Harry
>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to marpa-parser+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
