Just passing along an example HTML subset lexer/parser using ANTLR v4; thanks
to debugging and moral support from Oliver Zeigermann, we got the code
generation and runtime support working sufficiently to use the following
grammars. They generate some really nice code indeed. You will note that, except
for the enhancement of the lexer modes, the grammars are backward compatible
with v3 :)
I still have a long way to go, but it's looking more & more useful (only does
LL(1) code generation at this point).
Ter
---------------------------
lexer grammar HTMLLexer;
TAG_START : '<' {pushMode(INSIDE);} ;
COMMENT : '<!--' .* '-->' {skip();} ;
TEXT : ~'<'+ ;
mode INSIDE;
TAG_STOP : '>' {popMode();} ;
END_TAG : '/' ID '>' {popMode();} ;
ID : ('A'..'Z'|'a'..'z'|'0'..'9'|'_'|'#')+ ;
EQ : '=' ;
STRING : '"' .* '"' ;
WS : ' '+ {skip();} ;
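
For anyone who has not seen lexer modes before, here is a rough hand-written sketch of the mode-stack idea in Python. It is not what ANTLR generates. The names `RULES` and `tokenize` are made up for illustration, and it assumes that `.*` is non-greedy, that the longest match wins, and that ties go to the rule listed first.

```python
import re

# Per-mode rule tables: (token name, regex, action).
# Actions mirror the grammar: "push:<mode>", "pop", "skip", or None.
# Assumption: '.*' in COMMENT and STRING is treated as non-greedy.
RULES = {
    "DEFAULT": [
        ("TAG_START", r"<", "push:INSIDE"),
        ("COMMENT", r"<!--.*?-->", "skip"),
        ("TEXT", r"[^<]+", None),
    ],
    "INSIDE": [
        ("TAG_STOP", r">", "pop"),
        ("END_TAG", r"/[A-Za-z0-9_#]+>", "pop"),
        ("ID", r"[A-Za-z0-9_#]+", None),
        ("EQ", r"=", None),
        ("STRING", r'".*?"', None),
        ("WS", r" +", "skip"),
    ],
}

# Compile once; DOTALL lets a comment or string span newlines.
COMPILED = {
    mode: [(name, re.compile(rx, re.DOTALL), action) for name, rx, action in rules]
    for mode, rules in RULES.items()
}


def tokenize(text):
    """Return a list of (token name, text) pairs using a stack of lexer modes."""
    modes = ["DEFAULT"]
    pos = 0
    tokens = []
    while pos < len(text):
        best = None
        # Try every rule of the current mode; keep the longest match.
        # A strict '>' comparison means ties go to the earlier rule.
        for name, pattern, action in COMPILED[modes[-1]]:
            m = pattern.match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m, action)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos} in mode {modes[-1]}")
        name, m, action = best
        pos = m.end()
        if action == "skip":
            continue  # matched, but no token is emitted
        tokens.append((name, m.group()))
        if action == "pop":
            modes.pop()
        elif action is not None and action.startswith("push:"):
            modes.append(action[len("push:"):])
    return tokens


if __name__ == "__main__":
    for tok in tokenize('<a href="x">hi</a><!-- c -->bye'):
        print(tok)
```

The point of the mode switch is that `ID`, `EQ`, and `STRING` are only ever tried between `<` and `>`, so ordinary text containing `=` or quotes still comes out as a single `TEXT` token.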
------------------------
parser grammar HTMLParser;
options { tokenVocab=HTMLLexer; }
file : ( TAG_START (starttag | endtag) | TEXT)+ EOF ;
starttag : ID attr* TAG_STOP ;
attr : ID (EQ (ID|STRING))? ;
endtag : END_TAG ;
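
Since the parser grammar needs only one token of lookahead, the LL(1) decisions can be sketched by hand as a small recursive-descent parser. This is a minimal illustration in Python, not the code ANTLR emits. The `Parser` class, its method names, and the tuple shapes it returns are all hypothetical, and the input is assumed to be a list of `(token name, text)` pairs.

```python
class Parser:
    """Hand-written LL(1) sketch of the HTMLParser rules (hypothetical API)."""

    def __init__(self, tokens):
        # Append an explicit EOF marker so lookahead never runs off the end.
        self.tokens = list(tokens) + [("EOF", "")]
        self.pos = 0

    def la(self):
        """Type of the single lookahead token."""
        return self.tokens[self.pos][0]

    def match(self, kind):
        """Consume one token of the expected type and return its text."""
        if self.la() != kind:
            raise SyntaxError(f"expected {kind}, found {self.la()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    # file : ( TAG_START (starttag | endtag) | TEXT)+ EOF ;
    def file(self):
        nodes = []
        while True:
            if self.la() == "TAG_START":
                self.match("TAG_START")
                # One token decides the alternative: ID starts a starttag.
                if self.la() == "ID":
                    nodes.append(self.starttag())
                else:
                    nodes.append(self.endtag())
            elif self.la() == "TEXT":
                nodes.append(("text", self.match("TEXT")))
            else:
                break
        if not nodes:
            raise SyntaxError("expected at least one tag or text chunk")
        self.match("EOF")
        return nodes

    # starttag : ID attr* TAG_STOP ;
    def starttag(self):
        name = self.match("ID")
        attrs = []
        while self.la() == "ID":
            attrs.append(self.attr())
        self.match("TAG_STOP")
        return ("start", name, attrs)

    # attr : ID (EQ (ID|STRING))? ;
    def attr(self):
        name = self.match("ID")
        value = None
        if self.la() == "EQ":
            self.match("EQ")
            value = self.match("ID") if self.la() == "ID" else self.match("STRING")
        return (name, value)

    # endtag : END_TAG ;
    def endtag(self):
        return ("end", self.match("END_TAG"))


if __name__ == "__main__":
    sample = [
        ("TAG_START", "<"), ("ID", "a"), ("ID", "href"), ("EQ", "="),
        ("STRING", '"x"'), ("TAG_STOP", ">"), ("TEXT", "hi"),
        ("TAG_START", "<"), ("END_TAG", "/a>"),
    ]
    print(Parser(sample).file())
```

Each rule becomes one method, and every branch is chosen by looking at `la()` alone, which is all that LL(1) requires for this grammar.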