Thanks to Bart and Douglas for the hints.

I’ve discovered a lot of things in this process, including the encoding of
my input file.

I’m on my way now…

Thanks all.

From: Douglas Godfrey [mailto:[email protected]]
Sent: Thursday, June 2, 2011 04:48
To: Nilo Roberto C Paim
Cc: [email protected]
Subject: Re: [antlr-interest] Accentuated chars in brazilian portuguese

 

Look up the Latin Unicode blocks on Wikipedia and add the Unicode code
points for the accented Latin-1 characters to your WORD rule.

fragment
Latin1_Supplement                   :   '\u00A0' .. '\u00FF';
fragment
Latin_ExtendedA                     :   '\u0100' .. '\u017F';
fragment
Latin_ExtendedB                     :   '\u0180' .. '\u024F';
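
As a minimal sketch of how such fragments might be wired into WORD, assuming
ANTLR 3 lexer syntax: the Letter helper rule is a hypothetical name that does
not appear in this thread, and the ranges are the ones given above. The ranges
can only match if the input stream is decoded with the file's actual encoding.

```antlr
// Sketch only: Letter is a hypothetical helper rule, not from the thread.
WORD
    :   ( Letter | '-' )+
    ;

fragment
Letter
    :   'a'..'z'
    |   'A'..'Z'
    |   Latin1_Supplement   // '\u00A0'..'\u00FF', as defined above
    |   Latin_ExtendedA     // '\u0100'..'\u017F'
    |   Latin_ExtendedB     // '\u0180'..'\u024F'
    ;
```

The Latin-1 Supplement block also contains non-letters such as the
no-break space (U+00A0) and the multiplication sign (U+00D7), so a stricter
grammar may want a narrower range.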



On Wed, Jun 1, 2011 at 4:53 PM, Nilo Roberto C Paim <[email protected]>
wrote:

Hi all,

I'm a newbie with ANTLR and I'm facing a problem when trying to parse text
that contains accented characters in Brazilian Portuguese.

I've put a word definition in my grammar as follows:

               WORD : ( '\u00c0'..'\u00ff' | 'a'..'z' | 'A'..'Z' | '-' )+ ;

But I have had no success parsing. Words like "não" ("no" in Portuguese)
cause the lexer to throw "Antlr.Runtime.NoViableAltException".

I'm trying to use C#.

Any hint?

TIA

Nilo, from Brasil...


List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe:
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

 

