Just use action code.

However, you are committing the common error of trying to enforce things at
too low a level. Let the \ escape any character at all in the lexer, let
the parser produce a tree, and then validate the strings while walking the
tree. That lets you report:

Invalid escape sequence '\g', can be '\m' or '\n' ...

Whereas a lexer failure gives:

Unexpected character 'g'

And your users will have no idea why. Also, if the lexer fails, the parse
won't run, so syntax errors and validation errors won't get reported.
So, for the sake of one mistyped character, the whole tool chain will
abort. Always push error detection as far down the chain as you can,
preferably to the semantic phase if technically possible. Basically, a
lexer should never fail if at all possible, even if that is only because
the last rule is:

BAD : . { error(UNKNOWN_CHARACTER, $text); skip(); } ;
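
For the validation step itself, here is a minimal sketch in Java of what the
tree walker might call on the raw text of each string node. The class name
EscapeValidator, the method validate, and the ALLOWED set of escapes are all
hypothetical; substitute whatever your language actually permits. It assumes
the lexer has already accepted a backslash followed by any character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: checks escapes after lexing, so the lexer never fails on them.
final class EscapeValidator {
    // Assumed set of legal escape characters; replace with the real language's set.
    private static final String ALLOWED = "nrt\\\"";

    /** Returns one message per invalid escape found in the raw text of a string. */
    static List<String> validate(String raw) {
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < raw.length(); i++) {
            if (raw.charAt(i) != '\\') {
                continue;
            }
            if (i + 1 >= raw.length()) {
                errors.add("Incomplete escape sequence at end of string");
                break;
            }
            char c = raw.charAt(++i); // consume the escaped character
            if (ALLOWED.indexOf(c) < 0) {
                errors.add("Invalid escape sequence '\\" + c + "', can be " + describeAllowed());
            }
        }
        return errors;
    }

    // Builds the "'\n', '\r' or '\t'" style list used in the message.
    private static String describeAllowed() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ALLOWED.length(); i++) {
            if (i > 0) {
                sb.append(i == ALLOWED.length() - 1 ? " or " : ", ");
            }
            sb.append("'\\").append(ALLOWED.charAt(i)).append("'");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The string contains \g (invalid here) and \n (valid).
        for (String msg : validate("line one\\gline two\\n")) {
            System.out.println(msg);
        }
    }
}
```

Because this runs after the parse, every bad escape in the input gets its own
message alongside any syntax errors, instead of the first one stopping the run.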

Jim

> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-
> [email protected]] On Behalf Of Douglas Godfrey
> Sent: Friday, February 25, 2011 8:34 AM
> To: [email protected]
> Subject: [antlr-interest] Can Antlr use a variable in a lexer pattern?
>
> in the snippet below, can "escape_character" be a variable?
> it seems that this would not work because the "escape_character" is not
> known until it is too late.
> the alternate form below might work if the Antlr Lexer can use a
> variable in the pattern match.
> can the lexer apply the escape character as a post processing
> validation step?
>     i.e. accept anything within the quotes and then validate the
> sequence after the ESCAPE clause?
>
> Unicode_Identifier  =
>         U Ampersand
>         Double_Quote  ( Unicode_Identifier_Part )+ Double_Quote
>         ( ESCAPE escape_character )?
>         ;
>
>
> Alternate form:
>
> Unicode_Identifier  =
>         U Ampersand
>         ( ESCAPE escape_character )?
>         Double_Quote  ( Unicode_Identifier_Part )+ Double_Quote
>         ;
>
>
> fragment
> Unicode_Identifier_Part  = Unicode_Permitted_Identifier_Character  |
> Unicode_Escape_Value ;
>
> fragment
> Unicode_Escape_Value  = Unicode_4_Digit_Escape_Value  |
> Unicode_6_Digit_Escape_Value ;
>
> fragment
> Unicode_4_Digit_Escape_Value  = escape_character  Hexit  Hexit  Hexit
> Hexit ;
>
> fragment
> Unicode_6_Digit_Escape_Value  = escape_character  Plus_Sign Hexit
> Hexit Hexit  Hexit  Hexit  Hexit ;
>
> escape_character            = Back_Slash /*!! See the Syntax Rules*/; ;
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address
