date:20110208

[il-antlr-interest: 31385] [antlr-interest] setTokenStream() on grammar with imports + feature request

2011-02-08 Thread Bill Andersen

Folks

Can anyone quickly tell me whether this makes sense? I have grammar A that
imports B. I want to reuse the A parser, so I call

AParser.setTokenStream(input);

The thing is, setTokenStream() is inherited from Parser and by default does not
communicate with delegate parsers.  Should I override it in AParser with

public void setTokenStream(TokenStream input) {
    super.setTokenStream(input);
    gB.setTokenStream(input);
}

where gB is the delegate parser in AParser?

And a feature request...  I use grammar imports quite extensively and have 
already run into an issue where I needed a method in a parser to get its 
delegate parsers.  Is there some fundamental barrier to providing such a method 
on Lexer, Parser, and TreeParser that would return some collection of delegate 
parsers/lexers/tree parsers?
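
A minimal sketch of both ideas (forwarding the stream to delegates, and exposing the delegates through an accessor), written against stand-in classes so that it runs without the ANTLR runtime. StubParser, CompositeParser, addDelegate() and getDelegates() are hypothetical names for illustration only, not ANTLR API; in a real generated parser the delegate would be the gB field mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DelegateSketch {
    /** Stand-in for a parser holding a token stream (NOT org.antlr.runtime.Parser). */
    static class StubParser {
        protected Object input;

        public void setTokenStream(Object input) { this.input = input; }

        public Object getTokenStream() { return input; }
    }

    /** Stand-in for a root parser (like AParser) that owns delegates (like gB). */
    static class CompositeParser extends StubParser {
        private final List<StubParser> delegates = new ArrayList<>();

        public void addDelegate(StubParser delegate) { delegates.add(delegate); }

        /** Hypothetical accessor of the kind the feature request asks for. */
        public List<StubParser> getDelegates() {
            return Collections.unmodifiableList(delegates);
        }

        /** Sets the stream on this parser, then forwards it to every delegate. */
        @Override
        public void setTokenStream(Object input) {
            super.setTokenStream(input);
            for (StubParser delegate : delegates) {
                delegate.setTokenStream(input);
            }
        }
    }

    public static void main(String[] args) {
        CompositeParser a = new CompositeParser();
        StubParser b = new StubParser();
        a.addDelegate(b);

        Object tokens = new Object();
        a.setTokenStream(tokens);

        // The delegate now sees the same stream as the root parser.
        System.out.println(b.getTokenStream() == tokens); // prints true
    }
}
```

With an accessor like this, the override no longer needs to name each delegate field, so adding another import to the grammar would not require touching it.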


Bill Andersen 
Highfleet, Inc. (www.highfleet.com)
3600 O'Donnell Street, Suite 600
Baltimore, MD 21224
Office: +1.410.675.1201
Cell: +1.443.858.6444
Fax: +1.410.675.1204






List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

-- 
You received this message because you are subscribed to the Google Groups 
il-antlr-interest group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to 
il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.

[il-antlr-interest: 31386] [antlr-interest] Unicode input

2011-02-08 Thread Alex Lujan

I'm having an issue parsing input that contains Unicode characters.

This is the code I'm using to test the parser (messageBytes is an array
created by reading bytes from a binary file):

private static void parseMessage(byte[] messageBytes) throws IOException {
    ByteArrayInputStream input = new ByteArrayInputStream(messageBytes);
    ANTLRInputStream in = new ANTLRInputStream(input);
    UnitedToteLexer lexer = new UnitedToteLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    UnitedToteParser parser = new UnitedToteParser(tokens);

    try {
        parser.message();
        printHexArray(messageBytes);
    } catch (Exception e) {
        // TODO handle unrecognized message formats
        System.out.println("Unrecognized message format");
    }
}

The main problem I have at the moment is that I get a number of these guys:

line 1:1 no viable alternative at character ' '
line 1:2 no viable alternative at character '�'
line 1:3 no viable alternative at character '�'
line 1:4 no viable alternative at character 'x'
line 1:5 no viable alternative at character '?'
...

Essentially, one for each character that is not explicitly defined as a
token in my grammar. Nonetheless, I do have the following rule:

BYTE_VALUE:'\u'..'\uFFFE';

Which should, if I understand correctly, include all Unicode characters.

Now, I understand there was a charVocabulary option in previous versions of
ANTLR to aid with this problem, but it seems it was removed in ANTLR 3.

Was this problem solved in a different way?

[By the way, my grammar is rather large; I'm not sure I should post 400 lines
in this message.]


[il-antlr-interest: 31387] Re: [antlr-interest] Unicode input

2011-02-08 Thread Bart Kiers

Hi,

On Tue, Feb 8, 2011 at 11:18 PM, Alex Lujan a...@apption.com wrote:

 I'm having an issue parsing input that contains Unicode characters.

 This is the code I'm using to test the parser (messageBytes is an array
 created by reading bytes from a binary file):

 private static void parseMessage(byte[] messageBytes) throws IOException{

ByteArrayInputStream input = new ByteArrayInputStream(messageBytes);
ANTLRInputStream in = new ANTLRInputStream(input);
  ...


You'll probably want to set the *encoding* of the input using:

ANTLRInputStream(InputStream input, String encoding)
http://www.antlr.org/api/Java/classorg_1_1antlr_1_1runtime_1_1_a_n_t_l_r_input_stream.html#cc37ee52e581d61a2efef0413ae3366f
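
To illustrate why the encoding matters for binary input, here is a small sketch that uses only java.io, not ANTLR. It assumes the encoding argument behaves like the charset name given to an InputStreamReader; EncodingSketch, decode() and the sample bytes are made up for the example. Under ISO-8859-1 every byte becomes exactly one char with the same value, while a UTF-8 decoder replaces byte sequences it cannot interpret with U+FFFD.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

public class EncodingSketch {
    /** Decodes the bytes through a Reader that uses the named charset. */
    static String decode(byte[] bytes, String encoding) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), encoding);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    /** Renders each char as a four-digit hex code so control chars stay visible. */
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("%04X ", (int) s.charAt(i)));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical binary message: control byte, two high bytes, ASCII 'x'.
        byte[] messageBytes = { 0x02, (byte) 0xE9, (byte) 0xFF, 0x78 };

        // One char per byte, values preserved.
        System.out.println("ISO-8859-1: " + toHex(decode(messageBytes, "ISO-8859-1")));

        // High bytes that are not valid UTF-8 come out as U+FFFD.
        System.out.println("UTF-8:      " + toHex(decode(messageBytes, "UTF-8")));
    }
}
```

If the grammar is meant to match raw byte values, passing "ISO-8859-1" as the encoding to the two-argument constructor above is one option to try, since it keeps each byte in the range '\u0000'..'\u00FF'.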


Regards,

Bart.


[il-antlr-interest: 31388] Re: [antlr-interest] [Bulk] Re: Correct way to handle custom errors?

2011-02-08 Thread Victor Giordano

Hello Stephen: I was having problems with custom error handling too.
So far, to catch the errors, I added this snippet of code to my grammar.g
file:


@lexer::members
{
    @Override
    protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow)
            throws RecognitionException
    {
        throw new MismatchedTokenException(ttype, input);
    }

    @Override
    public Object recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow)
            throws RecognitionException
    {
        throw new MismatchedSetException(follow, input);
    }

    @Override
    public void reportError(RecognitionException e)
    {
        Class exceptionClass = e.getClass();
        String msg = "Línea invalida"; // "Invalid line"
        if (exceptionClass == NoViableAltException.class)
        {
            NoViableAltException nvae = (NoViableAltException) e;
            msg = "Símbolo invalido " + nvae.c; // "Invalid symbol"
        }
        throw new RuntimeException(msg);
    }
}

@parser::members
{
    /** Catches errors when valid tokens are repeated. */
    @Override
    protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow)
            throws RecognitionException
    {
        // Disable the default recovery mechanism
        throw new MismatchedTokenException(ttype, input);
    }

    @Override
    public void reportError(RecognitionException e)
    {
        Class exceptionClass = e.getClass();
        String msg = "Línea invalida"; // "Invalid line"
        if (exceptionClass == NoViableAltException.class)
        {
            NoViableAltException nvae = (NoViableAltException) e;
            // "Symbol at position N is invalid: <text>"
            msg = "Símbolo en posición " + (nvae.charPositionInLine + 1)
                    + " invalido : " + nvae.token.getText();
        }
        if (exceptionClass == EarlyExitException.class)
        {
            EarlyExitException eee = (EarlyExitException) e;
            // "The expression is malformed. Check the '<text>'"
            msg = "La expresión esta mal formada. Revisar el '"
                    + eee.token.getText() + "'";
        }
        if (exceptionClass == MismatchedTokenException.class)
        {
            MismatchedTokenException mte = (MismatchedTokenException) e;
            // "The expression is malformed. Check the '<text>'"
            msg = "La expresión esta mal formada. Revisar el '"
                    + mte.token.getText() + "'";
        }
        throw new RuntimeException(msg);
    }
}

And it works perfectly... but I still think it is only a workaround,
because I really don't understand how to catch every kind of error properly.

Hope that helps.
Greetings!

P.S.: The message strings are in Spanish. Basically, they say that you typed
something wrong.


On 07/02/2011 02:48 p.m., Stephen McGruer wrote:
 Jim,

 Thanks for the suggestion - although I don't quite want the level of detail
 you guys go into (and I don't quite understand all of it, possibly because I
 don't know the JavaFX syntax!) the source code was very useful and I think
 I've now got enough to put in some decent error messages.

 There are two things I'm still having trouble with that I think are related
 to this question strongly enough to not warrant a new thread. The first is
 printing out the line that the error occurred on. I wanted to try out
 mimicking the sort of print out that javac does when it encounters an error,
 but I can't work out how to get access to the text of the line that is
 having the problem. I read here -
 http://www.antlr.org/wiki/pages/viewpage.action?pageId=1769 - that if you
 are using the CommonTokenStream you should have access to a token with the
 information, but in my getErrorMessage override that simply returns null for
 most of the errors I've tried (currently MismatchedTokenException):

 public String getErrorMessage(RecognitionException e, String[] tokenNames) {
     // Prints out true!
     System.out.println(e.token == null);
     return super.getErrorMessage(e, tokenNames);
 }

 I then read that you can use the inputstream attached to the
 RecognitionException instead (can't quite remember where... the official
 docs?), but can't figure out how to do this. Currently I have tried casting
 it from an IntStream to a CharStream (I think it will always be a CharStream
 for me... maybe XD) and played with the various methods available, but none
 of them seem to offer a way to just print out the current line. The most
 promising looks like substring(start, stop), but I cannot find a way to get
 the length of the line to give as the stop parameter! I also assume that
 if I do this method I should mark the stream before I begin, and rewind it
 once I'm done?

 Alternatively, if there's a better way to do this when overriding the
 getErrorMessage method please tell me.
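
 One way to get a javac-style printout without touching the stream at all is to keep the full source text in a String and look the line up from the exception's line and charPositionInLine fields. The sketch below assumes exactly that: the source text is available as a String, line is 1-based, and charPositionInLine is 0-based. ErrorLineSketch and formatErrorLine() are hypothetical names, and the code uses no ANTLR classes.

```java
public class ErrorLineSketch {
    /**
     * Returns the offending source line followed by a caret line.
     * line is 1-based; charPositionInLine is 0-based.
     * Returns an empty string if the line number is out of range.
     */
    static String formatErrorLine(String source, int line, int charPositionInLine) {
        String[] lines = source.split("\r?\n", -1);
        if (line < 1 || line > lines.length) {
            return "";
        }
        String text = lines[line - 1];

        // Clamp the column so a position at end of line still gets a caret.
        int col = Math.max(0, Math.min(charPositionInLine, text.length()));

        StringBuilder caret = new StringBuilder();
        for (int i = 0; i < col; i++) {
            // Copy tabs so the caret lines up under tab-indented text.
            caret.append(text.charAt(i) == '\t' ? '\t' : ' ');
        }
        caret.append('^');
        return text + "\n" + caret;
    }

    public static void main(String[] args) {
        String source = "int x = 1;\nint y = ;\n";
        // Error reported at line 2, column 8 (the ';').
        System.out.println(formatErrorLine(source, 2, 8));
    }
}
```

 Inside a getErrorMessage override, the two numbers would come from e.line and e.charPositionInLine, so no mark or rewind of the input stream is needed.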

 The second problem I'm having is with stopping the parser from running if
 the lexer has spat out any errors. As far as I can tell, it's very possible
 and easy to stop the lexer after a single error message is thrown, but I
 cannot find anywhere that shows how to let the lexer continue but not let
 the parser run. Since all error messages are only thrown once you call
 something like

[il-antlr-interest: 31389] [antlr-interest] TreeParser reset()

2011-02-08 Thread Bill Andersen

Folks

I noticed that setTokenStream() in org.antlr.runtime.Parser calls reset(), but
setTreeNodeStream() in org.antlr.runtime.tree.TreeParser does not call reset().
 Is there a reason for this or is it just an oversight?  If the former should I 
call reset() before or after setTreeNodeStream() on a tree parser I want to 
re-use?

Thanks in advance.

   .bill

