I recently noticed that the code generated from my lexer grammar
contains something like the following snippet:
.
.
else if ( (((LA17_0 >= 'A') && (LA17_0 <= 'Z'))) )
{
    alt17=2;
}
else if ( (((LA17_0 >= 'a') && (LA17_0 <= 'z'))) )
{
    alt17=3;
}
else if ( (((LA17_0 >= 0x00A0) && (LA17_0 <= 0xD7FF))) )
{
    alt17=4;
}
.
.
The generated code freely uses the character literals 'A' ... 'Z'. That may
be a problem if, say, I compile the generated code in an IBM z/OS EBCDIC
environment: there the ['A' .. 'Z'] range contains more than just the 26
letter codes, and the code values are not the same as the ones in the
Unicode character set.
I was expecting something like the third expression, with 'A' written
explicitly as 0x0041 (the Unicode code point for 'A').
Please confirm.
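To illustrate the concern, here is a minimal sketch in C. It is not ANTLR
output; the helper names (is_latin_upper_literal, is_latin_upper_codepoint,
range_size) are hypothetical, and the EBCDIC values mentioned in the comments
('A' = 0xC1, 'Z' = 0xE9) are what I believe the common EBCDIC code pages use.

```c
#include <assert.h>

/* Hypothetical helper: compares against character literals, so the bounds
   depend on the execution character set the compiler targets
   (0x41..0x5A under ASCII; assumed 0xC1..0xE9 under EBCDIC). */
static int is_latin_upper_literal(unsigned long cp)
{
    return cp >= 'A' && cp <= 'Z';
}

/* Hypothetical helper: compares against explicit Unicode code points,
   so the bounds are the same on every platform. */
static int is_latin_upper_codepoint(unsigned long cp)
{
    return cp >= 0x0041UL && cp <= 0x005AUL;  /* U+0041 'A' .. U+005A 'Z' */
}

/* Number of code values in the inclusive range [lo, hi]. */
static unsigned long range_size(unsigned long lo, unsigned long hi)
{
    assert(lo <= hi);
    return hi - lo + 1UL;
}
```

With these, range_size(0x0041, 0x005A) is 26, while range_size(0xC1, 0xE9),
the assumed EBCDIC span for 'A' .. 'Z', is 41. On an ASCII compiler the two
predicates agree; on an EBCDIC compiler only the second one would still test
for U+0041 .. U+005A.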
-Lego
List: http://www.antlr.org/mailman/listinfo/antlr-interest