Look up the Latin Unicode blocks on Wikipedia and add the code points for accented Latin-1 characters to your WORD rule.
fragment Latin1_Supplement : '\u00A0' .. '\u00FF';
fragment Latin_ExtendedA : '\u0100' .. '\u017F';
fragment Latin_ExtendedB : '\u0180' .. '\u024F';

On Wed, Jun 1, 2011 at 4:53 PM, Nilo Roberto C Paim <[email protected]> wrote:
> Hi all,
>
> I'm a newbie using ANTLR and I'm facing a problem when trying to parse a
> text that contains accented characters in Brazilian Portuguese.
>
> I've put a word definition in my grammar as follows:
>
> WORD : ( '\u00c0'..'\u00ff' | 'a'..'z' | 'A'..'Z' | '-' )+ ;
>
> But I've had no success parsing. Words like "não" ("no" in Portuguese)
> cause the lexer to throw "Antlr.Runtime.NoViableAltException".
>
> I'm trying to use C#.
>
> Any hint?
>
> TIA
>
> Nilo, from Brasil...
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
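
For reference, the three fragment ranges suggested in the reply could be wired into WORD roughly as follows. This is a minimal ANTLR 3 sketch under stated assumptions, not a tested grammar: the grammar name WordLexer is made up for illustration, and the rule keeps the ASCII letters and hyphen from the original WORD definition.

```antlr
lexer grammar WordLexer;  // hypothetical grammar name, not from the thread

// One or more ASCII letters, hyphens, or characters from the Latin blocks below.
WORD
    : ( 'a'..'z'
      | 'A'..'Z'
      | '-'
      | Latin1_Supplement
      | Latin_ExtendedA
      | Latin_ExtendedB
      )+
    ;

// Fragment rules never produce tokens on their own; they can only be
// referenced from other lexer rules such as WORD.
fragment Latin1_Supplement : '\u00A0'..'\u00FF' ;
fragment Latin_ExtendedA   : '\u0100'..'\u017F' ;
fragment Latin_ExtendedB   : '\u0180'..'\u024F' ;
```

Note that the full Latin-1 Supplement block also contains non-letters such as the no-break space (U+00A0), the multiplication sign (U+00D7) and the division sign (U+00F7). If WORD should match letters only, the first fragment could be narrowed to '\u00C0'..'\u00D6' | '\u00D8'..'\u00F6' | '\u00F8'..'\u00FF'. Also, "ã" is U+00E3, which the original '\u00c0'..'\u00ff' range already covers, so if the exception persists it may be worth checking that the input is decoded with the file's actual encoding. That is a guess at the cause, not something the thread confirms.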
