I'm having an issue parsing input that contains Unicode characters.

This is the code I'm using to test the parser (messageBytes is an array
created by reading bytes from a binary file):

private static void parseMessage(byte[] messageBytes) throws IOException {
    ByteArrayInputStream input = new ByteArrayInputStream(messageBytes);
    ANTLRInputStream in = new ANTLRInputStream(input);
    UnitedToteLexer lexer = new UnitedToteLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    UnitedToteParser parser = new UnitedToteParser(tokens);

    try {
        parser.message();
        printHexArray(messageBytes);
    } catch (Exception e) {
        // TODO handle unrecognized message formats
        System.out.println("Unrecognized message format");
    }
}

The main problem I have at the moment is that I get a number of errors like
these:

line 1:1 no viable alternative at character ' '
line 1:2 no viable alternative at character '�'
line 1:3 no viable alternative at character '�'
line 1:4 no viable alternative at character 'x'
line 1:5 no viable alternative at character '?'
...

Essentially, I get one for each character that is not explicitly defined as a
token in my grammar. Nonetheless, I do have the following rule:

BYTE_VALUE    :    '\u0000'..'\uFFFE';

If I understand correctly, this rule should match every Unicode character
that can appear in the input.

Now, I understand there was a charVocabulary option in previous versions of
ANTLR to aid with this problem, but it seems it was removed in ANTLR 3.

Was this problem solved in a different way?
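
A possibly related detail, which may or may not be the cause here: the bytes
have to be decoded into characters before the lexer sees them, and the
one-argument ANTLRInputStream constructor uses (as far as I know) the platform
default encoding. The sketch below uses only the JDK, with a made-up sample
array and a hypothetical class name, to show how the same bytes decode under
ISO-8859-1 (one char per byte) versus UTF-8 (invalid bytes become U+FFFD, which
prints as '�'). I believe the decoded String could then be handed to
ANTLRStringStream, or an encoding name passed as the second argument to
ANTLRInputStream, but I have not verified that against this grammar.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the parser code above.
class ByteDecodingDemo {

    // ISO-8859-1 maps every byte 0x00..0xFF to the char with the same value.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // UTF-8 decoding replaces malformed input with U+FFFD.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Render each char as a four-digit hex code unit, separated by spaces.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample: two bytes that are never valid in UTF-8, then 'x'.
        byte[] sample = {0x01, (byte) 0xFF, (byte) 0xFE, 0x78};

        System.out.println(toHex(decodeLatin1(sample))); // 0001 00FF 00FE 0078
        System.out.println(toHex(decodeUtf8(sample)));   // 0001 FFFD FFFD 0078
    }
}
```

With ISO-8859-1 every input byte survives as a distinct character in the range
'\u0000'..'\u00FF', so a lexer rule written in terms of byte values would see
the values it expects.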

[By the way, my grammar is rather large; I'm not sure I should post 400 lines
in this message.]
