I'm very new to Marpa but, from its description, it looks extremely 

I've also finished playing with the beginner's expression calculator 
example, and was able to make small changes to it. So far, so good.

However, now, I'm trying to write a Java 8 Parser using the grammar 
published here:

While I think I'm able to map the above Oracle grammar spec to the G1 rules 
(if I stub out some of the lexemes referenced by the G1 rules) and create an 
instance of Marpa::R2::Scanless::G, I'm having a hard time writing the L0 
lexer rules in SLIF for the Lexer grammar 
<https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html>.  Some 
issues that I will need to deal with (but don't know how) are:

1. Keyword vs Identifier: 

  The Java spec defines Identifier 
<https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.8> thus:
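
Identifier:
    IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
IdentifierChars:
    JavaLetter {JavaLetterOrDigit}
JavaLetter:
    any Unicode character that is a "Java letter"
JavaLetterOrDigit:
    any Unicode character that is a "Java letter-or-digit"
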
  *So, how do I do* the "not a Keyword or BooleanLiteral or NullLiteral" 
part? In Perl regex, one could do a negative lookahead assertion like so...
if (m/ (?! $Keyword | $BooleanLiteral | $NullLiteral ) $IdentifierChars /x) {
    # this is an Identifier
}

... but that would only work if Marpa allowed such rich Perl regex syntax, 
which, apparently, SLIF does not.
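
To make the idea concrete, here is a minimal plain-Perl sketch of the check I have in mind (no Marpa involved). The helper name is_identifier is my own, and the ASCII-only character classes are a simplifying assumption standing in for the spec's "Java letter" and "Java letter-or-digit". Note that the lookahead is anchored with \z so that a reserved word is rejected only when it is the whole token, which keeps names like "iffy" valid:

```perl
use strict;
use warnings;

# Keywords from JLS 3.9, plus the BooleanLiteral and NullLiteral spellings.
my @reserved = qw(
    abstract assert boolean break byte case catch char class const continue
    default do double else enum extends final finally float for goto if
    implements import instanceof int interface long native new package
    private protected public return short static strictfp super switch
    synchronized this throw throws transient try void volatile while
    true false null
);
my $reserved_re = join '|', map { quotemeta } @reserved;

# ASSUMPTION: ASCII approximation of JavaLetter {JavaLetterOrDigit}.
my $ident_chars = qr/[A-Za-z_\$][A-Za-z0-9_\$]*/;

# Hypothetical helper: true if $token is IdentifierChars but not a
# Keyword, BooleanLiteral or NullLiteral.
sub is_identifier {
    my ($token) = @_;
    return $token =~ /\A (?! (?: $reserved_re ) \z ) $ident_chars \z/x ? 1 : 0;
}

for my $t (qw(foo if iffy null _x $y 9lives)) {
    printf "%-7s %s\n", $t, is_identifier($t) ? 'Identifier' : 'not an Identifier';
}
```

What I'm after is the SLIF equivalent of that "but not" exclusion.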

2. Comment (single- and multi-line versions)

I could write a bunch of G1 rules to handle the multi-line Java comment, 
but I'm seeing it becoming very verbose. Is there an easier way to handle 
stuff like this in SLIF?

3. Since Marpa is Perl-based, is it possible to tap the full power of Perl 
regex engine, especially for lexing? 
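
As an illustration of what I mean by regex-style lexing, this is roughly how I would recognize both comment forms in plain Perl. It is only a sketch: the helper name find_comments is made up, and it deliberately ignores the case where comment-like text appears inside a string or character literal, which a real lexer would have to tokenize first.

```perl
use strict;
use warnings;

# One pattern for both Java comment forms.
my $comment = qr{
      /\* .*? \*/      # traditional comment: shortest match up to the first closing marker
    | // [^\r\n]*      # end-of-line comment: runs to the line terminator
}xs;

# Hypothetical helper: return every comment found in $src, in order.
sub find_comments {
    my ($src) = @_;
    my @found = $src =~ /($comment)/g;
    return @found;
}

my $java = "int a; /* spans\n two lines */ int b; // trailing\nint c;\n";
print "[$_]\n" for find_comments($java);
```

Two short alternatives cover what looks like it would take a dozen or so G1 rules.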

4. Notice that the Java 8 spec for recognizing tokens is in the form of a Lexer 
grammar <https://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html>... 
that is written in BNF style instead of a 'flat', regex style. If I were to 
mechanically replicate the Lexer grammar using G1 rules (instead of L0 
rules), would it entail a performance and space overhead by creating 
unnecessary tree nodes for what would otherwise be a flat lexeme in 

5. Would Marpa experts recommend using SLIF (internal scanner) for Java 8, 
or should I abandon it in favor of a custom / external lexer?

