[il-antlr-interest: 26336] Re: [antlr-interest] ANTLR C: Question regarding the portability of generated lexer C code

Jim Idle Fri, 16 Oct 2009 23:07:01 -0700

Well, you could pay me to make an EBCDIC version ;) However, in practice 
there is no problem with mixing compilation modes; I have done it before on z/OS.

I think you need to look at this the other way around: it isn't the ANTLR 
code that is non-portable, but your lexer specification (and the fact that 
EBCDIC is stupid).

Why are you specifying your rule as:

ID: 'a'..'z'

When that is not a valid range in your target environment? 

Change the ranges to:

ID: 'a'..'k' | 'l'..'t' …

Or whatever the valid ranges are. ANTLR might be 'clever' here and, assuming 
ASCII, merge those ranges, so you might need to fold the ranges into 
fragments and so on. However, if you rework your lexer rules, I am sure that 
this can be done in a portable fashion that does not require ASCII assumptions 
within the compiler.
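
As a rough illustration of why a single 'a'..'z' range is not valid there, here is a minimal C sketch of the check a split rule would amount to. It assumes EBCDIC code page 037, where the lowercase letters sit in three separate runs (a..i, j..r, s..z); the function name is_ebcdic_lower is hypothetical and is not part of ANTLR or its C runtime.

```c
#include <stdint.h>

/* Assumption: values follow EBCDIC code page 037, where the lowercase
   letters occupy three separate runs rather than one contiguous range. */
static int is_ebcdic_lower(uint32_t c)
{
    return (c >= 0x81 && c <= 0x89)   /* a..i */
        || (c >= 0x91 && c <= 0x99)   /* j..r */
        || (c >= 0xA2 && c <= 0xA9);  /* s..z */
}
```

The values that fall between the runs (0x8A, for example) are not letters, which is why one contiguous range would accept characters it should not.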

Jim

From: [email protected] 
[mailto:[email protected]] On Behalf Of Lego Haryanto
Sent: Friday, October 16, 2009 2:59 AM
To: David-Sarah Hopwood
Cc: [email protected]
Subject: Re: [antlr-interest] ANTLR C: Question regarding the portability of 
generated lexer C code

Thanks for the response, ...

Unfortunately, it won't work in our situation without major changes.  We 
already have legacy C code which is compiled with default/native, and while we 
can use a different compile option for the ANTLR generated code, I'm not sure 
if it's good moving forward with mixed compilation rules.

The argument remains that it means the generated C lexer code has to be 
compiled by an ASCII-based compiler which may not be that portable.
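
To make the concern concrete, here is a small C sketch (not taken from the generated lexer; both function names are hypothetical). A comparison against the numeric code point behaves the same under any compiler, while a comparison against a character literal depends on how the compiler encodes that literal.

```c
#include <stdint.h>

/* Compares against the Unicode code point directly, so the result does not
   depend on how the compiler encodes character literals. */
static int is_capital_a(uint32_t code_point)
{
    return code_point == 0x0041u;
}

/* Reports whether this compiler encodes the literal 'A' as 0x41. That holds
   for ASCII-based compilation; a native EBCDIC compilation encodes 'A' as
   0xC1, so this returns 0 there. */
static int literal_a_is_ascii(void)
{
    return 'A' == 0x41;
}
```

If literal_a_is_ascii() returns 0, any generated comparison of a Unicode input character against 'A' will fail to match.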

Best,
-Lego

On Thu, Oct 15, 2009 at 12:30 PM, David-Sarah Hopwood 
<[email protected]> wrote:

Lego Haryanto wrote:
> Jim, thanks for your response ...
>
> I know that in the EBCDIC system we feed a Unicode stream into the lexer,
> thus I'm pretty sure when the generated lexer code I pasted before is
> executed, it is already operating on the 32-bit unicode stream.
>
> The problem is more about the native C compilation in an EBCDIC system like
> IBM z/OS mainframe.
>
> To see if a character from the Unicode stream is an 'A', we have to compare
> with a value 0x0041 ... If we match it with a native 'A' in the code, this
> will not be a match in an EBCDIC C compilation.

The z/OS C compiler is able to compile in a mode where string and character
literals are treated as ISO-8859-1.
<http://lists.gnupg.org/pipermail/gcrypt-devel/2009-July/001469.html>

--
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

-- 
Fear of the LORD is the beginning of knowledge (Proverbs 1:7)

