You need to remember to state which target you are talking about. I have written a new universal input stream for the next version of the C runtime. It takes 8-bit, 16-bit, UTF-8, UTF-16, UCS-2, UTF-32 and EBCDIC input (code gen will change slightly to support this). It is not well tested right now, but it will be available shortly as a snapshot 3.3 release on the downloads page.
In the meantime the easiest thing to do is to convert to UCS-2 using the supplied converter in the current runtime. This will not work with surrogate pairs in UTF-16, but most people do not need that. If you really need UTF-8 without conversion then it is easy enough to write, or you can just steal the code from my check-in of the code in about 10 minutes. Note that while the streams work, I have not provided ANTLR3_STRING support for UTF-8 and so on yet, so getting $text from such a stream may or may not work.

Jim

> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of Xie, Linlin
> Sent: Wednesday, January 20, 2010 3:32 AM
> To: [email protected]
> Subject: [antlr-interest] UTF-8 input?
>
> Can anyone tell me if antlr3.1.3 generated parser works with UTF-8
> input? If it does, how should I configure in the grammar? I noticed
> there are two macros ANTLR3_INLINE_INPUT_ASCII and
> ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.
>
> Many thanks!
>
> Linlin
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
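For anyone wondering what the UTF-8 to UCS-2 step involves, and why anything that needs a UTF-16 surrogate pair falls over, here is a minimal sketch in plain C. It is not the converter shipped with the runtime and not the code from my check-in; `utf8_decode` and `utf8_to_ucs2` are hypothetical names made up for this illustration, and the error handling (give up on the first bad sequence) is just one possible choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of the ANTLR3 C runtime.
 * Decodes one UTF-8 sequence starting at s (n bytes available).
 * Stores the code point in *cp and returns the number of bytes
 * consumed, or 0 if the sequence is truncated or malformed. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t   len, i;
    uint32_t c, min;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80;    }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800;   }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (n < len) return 0;          /* sequence runs past the end of input */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogate code points and values past U+10FFFF. */
    if (c < min || (c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF) return 0;

    *cp = c;
    return len;
}

/* Hypothetical helper: converts UTF-8 bytes to UCS-2 code units.
 * Returns the number of units written, or (size_t)-1 if the input is
 * malformed, the output buffer is too small, or a code point lies above
 * U+FFFF. UCS-2 cannot hold such a code point; UTF-16 would need a
 * surrogate pair for it, which is the limitation mentioned in the text. */
static size_t utf8_to_ucs2(const unsigned char *in, size_t in_len,
                           uint16_t *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        uint32_t cp;
        size_t   used = utf8_decode(in + i, in_len - i, &cp);

        if (used == 0 || cp > 0xFFFF || o >= out_cap) return (size_t)-1;
        out[o++] = (uint16_t)cp;
        i += used;
    }
    return o;
}
```

A caller would run the whole input buffer through `utf8_to_ucs2` once and hand the resulting 16-bit buffer to whatever 16-bit input stream it is using; input containing characters outside the Basic Multilingual Plane is rejected rather than silently mangled.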
