[il-antlr-interest: 26283] Re: [antlr-interest] ANTLR C: Question regarding the portability of generated lexer C code

Jim Idle Thu, 15 Oct 2009 03:11:32 -0700

ANTLR works internally with 32-bit Unicode code points (UTF-32), not EBCDIC, 
even when it is in 8-bit mode. So you need to convert the EBCDIC input to 
8-bit Unicode code points and use the ‘ASCII’ input stream. A simple way to do 
this would be to write your own EBCDIC input stream that converts to Unicode 
code points (essentially EBCDIC->ASCII) on the fly via a lookup table. It is 
trivial to write and should be pretty quick.
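
The lookup-table conversion could be sketched roughly as below. This is only 
an illustration, not ANTLR runtime code: the names ebcdic_to_unicode, 
ebcdic_table_init and ebcdic_to_unicode_buffer are made up for the example, 
and the table only fills in the space, letters and digits, which sit at the 
same positions in the common EBCDIC code pages (e.g. 037 and 1047). A real 
table would need all 256 entries for the code page actually in use. Note that 
the table is built from numeric values only, never from character literals, 
so it means the same thing whichever compiler builds it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: index is an EBCDIC byte, value is a Unicode code point
 * that fits in 8 bits. */
static uint8_t ebcdic_to_unicode[256];

/* Map a contiguous run of EBCDIC codes onto a contiguous run of Unicode codes. */
static void map_run(uint8_t ebcdic_first, uint8_t unicode_first, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        ebcdic_to_unicode[ebcdic_first + i] = (uint8_t)(unicode_first + i);
}

/* Fill the table. Assumption: only space, letters and digits are mapped here;
 * every other byte becomes 0x1A (SUBSTITUTE) as a placeholder. */
void ebcdic_table_init(void)
{
    memset(ebcdic_to_unicode, 0x1A, sizeof ebcdic_to_unicode);
    ebcdic_to_unicode[0x40] = 0x20;   /* space */
    map_run(0x81, 0x61, 9);           /* a-i */
    map_run(0x91, 0x6A, 9);           /* j-r */
    map_run(0xA2, 0x73, 8);           /* s-z */
    map_run(0xC1, 0x41, 9);           /* A-I */
    map_run(0xD1, 0x4A, 9);           /* J-R */
    map_run(0xE2, 0x53, 8);           /* S-Z */
    map_run(0xF0, 0x30, 10);          /* 0-9 */
}

/* Convert a buffer in place, one table lookup per byte. */
void ebcdic_to_unicode_buffer(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; ++i)
        buf[i] = ebcdic_to_unicode[buf[i]];
}
```

Call ebcdic_table_init() once, then run each chunk of input through 
ebcdic_to_unicode_buffer() before handing it to the 8-bit input stream, or do 
the same lookup per character inside your own input stream.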


 

Jim

 

From: [email protected] 
[mailto:[email protected]] On Behalf Of Lego Haryanto
Sent: Tuesday, October 13, 2009 3:51 AM
To: [email protected]
Subject: [antlr-interest] ANTLR C: Question regarding the portability of 
generated lexer C code

 

I just recently noticed that the generated code from my lexer grammar contains 
something like the following snippet:

            .
            .
            else if ( (((LA17_0 >= 'A') && (LA17_0 <= 'Z'))) ) 
            {
                alt17=2;
            }
            else if ( (((LA17_0 >= 'a') && (LA17_0 <= 'z'))) ) 
            {
                alt17=3;
            }
            else if ( (((LA17_0 >= 0x00A0) && (LA17_0 <= 0xD7FF))) ) 
            {
                alt17=4;
            }
            .
            .

The generated code seems to comfortably use 'A' ... 'Z' literals.  This may not 
be good if, say, I compile the generated code in an IBM z/OS EBCDIC 
environment, because there the ['A' .. 'Z'] range contains more than just the 
26 letter codes, and the values of those codes are not the same as the ones in 
the Unicode character set.
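
To make the range problem concrete, here is a small sketch that counts the 
code values between EBCDIC 'A' (0xC1) and 'Z' (0xE9). The helper names 
ebcdic_is_upper and count_range are made up for this example, and the letter 
positions are the ones used by the common EBCDIC code pages (e.g. 037 and 
1047). The values are written numerically so the result does not depend on 
the compiler's own character set.

```c
/* Returns 1 if c is an uppercase Latin letter in EBCDIC, else 0.
 * The letters sit in three separate runs with gaps between them. */
static int ebcdic_is_upper(unsigned c)
{
    return (c >= 0xC1 && c <= 0xC9)    /* A-I */
        || (c >= 0xD1 && c <= 0xD9)    /* J-R */
        || (c >= 0xE2 && c <= 0xE9);   /* S-Z */
}

/* Counts all code values in [lo, hi] and how many of them are letters. */
static void count_range(unsigned lo, unsigned hi,
                        unsigned *total, unsigned *letters)
{
    *total = 0;
    *letters = 0;
    for (unsigned c = lo; c <= hi; ++c) {
        ++*total;
        if (ebcdic_is_upper(c))
            ++*letters;
    }
}
```

Calling count_range(0xC1, 0xE9, &total, &letters) gives total == 41 and 
letters == 26, so a plain (c >= 'A' && c <= 'Z') test on EBCDIC input also 
accepts 15 values that are not letters.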

I was expecting something like the third expression, with 'A' written 
explicitly as 0x0041 (Unicode for 'A').

Please confirm.

-Lego



