I'm just getting started with ANTLR after running into a wall trying to
use a state pattern with regular expressions to implement a DSL.
I have the first ANTLR book, and it has been quite helpful so far.
One problem that I've run into is folded lines. The specification that
I'm trying to write a grammar for says in part:

    Any sequence of CRLF followed immediately by a single linear white space
    character is ignored (i.e., removed) when processing the content type.

    When parsing a content line, folded lines MUST first be unfolded
    according to the unfolding procedure described above.

So, the way I'm reading this is that a folding token (' '|'\t') CRLF can
come anywhere in the input stream and needs to be ignored before
processing.
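
To pin down the behaviour I'm after, here is a minimal Java sketch of the unfolding as the quoted text describes it (a CRLF followed by one space or tab is dropped). The class and method names are made up for illustration, and accepting a bare LF as well as CRLF is my own assumption, not something the specification says. This is only a reference for the expected result, not the approach I'm hoping to end up with.

```java
// Hypothetical helper, not part of ANTLR or of the specification.
final class LineUnfolder {

    // Removes every line break (CRLF, or bare LF by assumption) that is
    // immediately followed by a single space or tab, together with that
    // one whitespace character. All other line breaks are left alone.
    static String unfold(String raw) {
        return raw.replaceAll("\r?\n[ \t]", "");
    }

    public static void main(String[] args) {
        // "ca", CRLF, space, "t=dog;" is a folded form of "cat=dog;"
        System.out.println(unfold("ca\r\n t=dog;")); // prints cat=dog;
    }
}
```

With that definition, a line break that is not followed by a space or tab still ends the content line as usual.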
I did the following to discard a folding token between other tokens in a
parsing rule.

    id: (FOLD)=>
      | ID '=' ID ';' NEWLINE
      | NEWLINE
      ;

    FOLD:    (' '|'\t') NEWLINE {skip();} ;
    NEWLINE: '\r'? '\n' ;
    ID:      ('a' .. 'z' | 'A' .. 'Z')+ ;
    WS:      (' '|'\t'|'\r'|'\n')+ {skip();} ;

This works fine when typing in:
cat=dog;
cat = dog;
cat
= dog;
It fails when typing in:
ca
t=dog;
I'm trying to get two ID tokens out of the last entry.
I'm obviously not understanding something fundamental. Hopefully I can
accomplish this without filtering the input before the Antlr-generated
code is used.
Pointers welcome.
Thanks in advance - /mde/