Hi, I have a problem with a parser that needs to handle a comment in a 
command language. The CL embeds each command in an HTML comment pair: '<!--' 
command '-->'. I can parse most commands, except for the REM command, which 
is a remark whose text should be ignored.
I wrote a small test grammar that more or less reproduces the problem:

grammar Remarks;

options {
  language = Java;
}

rule: commandLine+ ;

commandLine
    :   '<!--' command '-->'
    ;

command
    :   breakCommand 
    |   remarkCommand
    ;
    
remarkCommand
    :   REM (.)*
    ;
    
breakCommand
    :   BREAK
    ;
    
WS
    :   (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; }
    ;

REM
    :   '#' ('R'|'r') ('E'|'e') ('M'|'m')
    ;
    
BREAK
    :   '#' ('B'|'b')('R'|'r')('E'|'e')('A'|'a')('K'|'k');

IDENT : ('a'..'z' | 'A'..'Z')('a'..'z' | 'A'..'Z' | '0'..'9')*;

A sample command file might look like this:

<!-- #rem some comment -->
<!--        #break -->
<!-- #rem some comment with $AAA &*&^, A9a 5eee and 99922 and .<><> -->

The parser recognizes the rem commands and the break command, but some 
characters are lost. It also splits the "comment" text into other tokens 
(IDENT in this case). Ideally I would like to get all the characters back as 
one piece, but I tried several constructs without success.
The last line is parsed even worse: all "special" characters such as $, & etc. 
generate warnings and do not show up in the tokens. The errors/warnings 
look like this:

line 3:28 no viable alternative at character '$'
line 3:33 no viable alternative at character '&'
line 3:34 no viable alternative at character '*'
line 3:35 no viable alternative at character '&'
line 3:36 no viable alternative at character '^'
line 3:37 no viable alternative at character ','
line 3:43 no viable alternative at character '5'
line 3:52 no viable alternative at character '9'
line 3:53 no viable alternative at character '9'

How can I define the comment so that all characters are either ignored or 
returned as a single rule or token? This should happen only inside a comment. I 
looked at other grammars for comments, such as C with /* */, and as far as I 
can see they do about the same.
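
To make the desired result concrete, here is a minimal hand-written Java 
sketch of the behaviour I am after. It is not ANTLR and not a proposed fix for 
the grammar; the class name RemarkSketch, the method scan and the "REM:" prefix 
are made up for illustration only. It treats everything between the command 
name and the closing '-->' as one untouched piece of text:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: shows the output I would like, not how ANTLR works.
class RemarkSketch {
    // One command pair: '<!--', optional whitespace, '#name', then the rest up to the first '-->'.
    private static final Pattern COMMAND =
        Pattern.compile("<!--\\s*#([A-Za-z]+)(.*?)-->", Pattern.DOTALL);

    /** Returns one entry per command: the upper-cased name, or "REM:" plus the raw remark text. */
    static List<String> scan(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = COMMAND.matcher(input);
        while (m.find()) {
            String name = m.group(1).toUpperCase();
            if (name.equals("REM")) {
                // Keep the remark text as a single piece, special characters included.
                result.add("REM:" + m.group(2).trim());
            } else {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "<!-- #rem some comment -->\n"
            + "<!--        #break -->\n"
            + "<!-- #rem some comment with $AAA &*&^, A9a 5eee and 99922 and .<><> -->\n";
        for (String entry : scan(input)) {
            System.out.println(entry);
        }
    }
}
```

For the sample file above this gives three entries: the first remark text, 
BREAK, and the second remark text with $, &, digits and <> all preserved. That 
is the result I would like the ANTLR grammar to produce.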
                                          