Thanks to Bart and Douglas for the hints.
I've discovered a lot of things in this process, including the encoding of my input file. I'm on my way now…

Thanks all.

From: Douglas Godfrey [mailto:[email protected]]
Sent: Thursday, June 2, 2011 04:48
To: Nilo Roberto C Paim
Cc: [email protected]
Subject: Re: [antlr-interest] Accentuated chars in brazilian portuguese

Look up the Latin Unicode code pages on Wikipedia and add the Unicode code points for accented Latin-1 characters to your WORD rule.

    fragment Latin1_Supplement : '\u00A0' .. '\u00FF';
    fragment Latin_ExtendedA : '\u0100' .. '\u017F';
    fragment Latin_ExtendedB : '\u0180' .. '\u024F';

On Wed, Jun 1, 2011 at 4:53 PM, Nilo Roberto C Paim <[email protected]> wrote:

Hi all,

I'm a newbie using ANTLR and I'm facing a problem when trying to parse text that contains accented characters in Brazilian Portuguese.

I've put a word definition in my grammar as follows:

    WORD : ( '\u00c0'..'\u00ff' | 'a'..'z' | 'A'..'Z' | '-' )+ ;

But I have had no success parsing it. Words like "não" ("no" in Portuguese) cause the lexer to throw "Antlr.Runtime.NoViableAltException". I'm trying to use C#.

Any hint?

TIA

Nilo, from Brasil...

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
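On the input-file encoding mentioned at the top of this thread: the sketch below shows one plausible way an encoding mismatch could make the original WORD rule reject "não", even though 'ã' (U+00E3) falls inside '\u00c0'..'\u00ff'. It is written in Python only so that it is easy to run. The helpers `in_word_rule` and `first_rejected` are hypothetical names that mimic the character ranges of the rule, and the assumption that the file was UTF-8 but was read as Latin-1 is a guess; the thread does not say which encodings were actually involved.

```python
# Hypothetical helper mirroring the character ranges of the original rule:
#   WORD : ( '\u00c0'..'\u00ff' | 'a'..'z' | 'A'..'Z' | '-' )+ ;
def in_word_rule(ch: str) -> bool:
    return (
        "\u00c0" <= ch <= "\u00ff"
        or "a" <= ch <= "z"
        or "A" <= ch <= "Z"
        or ch == "-"
    )


def first_rejected(text: str):
    """Return the first character the WORD ranges do not cover, or None."""
    for ch in text:
        if not in_word_rule(ch):
            return ch
    return None


word = "n\u00e3o"  # "não"

# Case 1: the UTF-8 bytes are decoded as UTF-8 (encoding matches).
decoded_ok = word.encode("utf-8").decode("utf-8")

# Case 2 (assumed scenario): the same UTF-8 bytes are decoded as Latin-1.
# The two bytes of 'ã' (0xC3 0xA3) turn into two separate characters.
decoded_wrong = word.encode("utf-8").decode("latin-1")

for label, text in (("matching", decoded_ok), ("mismatched", decoded_wrong)):
    bad = first_rejected(text)
    code = None if bad is None else f"U+{ord(bad):04X}"
    print(label, [f"U+{ord(c):04X}" for c in text], "rejected:", code)
```

Under that assumption, the mismatched case contains U+00A3, which lies below '\u00c0', so a lexer with only the original ranges has no alternative for it. Note that the wider Latin1_Supplement range suggested in the reply would accept U+00A3 and hide the mismatch, so checking how the input stream is decoded is worth doing before widening the rule.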
