[il-antlr-interest: 28209] Re: [antlr-interest] Using previously matched parser rule in decision making

2010-03-09 Thread Kieran Simpson
I agree Ron.

Ron Burk wrote:
 It is an interesting idea for a top-down parser generator
 to just make the parsing stack of non-terminals available
 to user actions. Whether that's easy or hard depends on
 the details of how the tool generates parser code. But
 certainly knowing the context you expect to be in is
 arguably an advantage of top-down over bottom-up
 parsing, so there's an argument to be made for making
 that information available. When I try to think of a
 common, practical use for it, mainly error reporting or
 recovery comes to mind. But if the syntax made it
 easy to ask things like "is X on the stack?", I suppose
 there are a variety of semantic checks that could be
 made clearer and simpler than via flags and such, e.g.
 checking that a 'break' keyword in C occurs within a
 do/for/switch/while.

 I usually try to do things in one pass, so the idea may be
 more interesting to me than to someone who intends
 to build a syntax tree first before doing any actual work.

 Dinking with syntax:

 A: B
 C: B
 B:
 { if($Stack[A])... else if($Stack[C])... else assert(FALSE); }

 or maybe (also?)

 { if($Stack[-1]==$NonTerm[A]) ...; else ...; }

 or

 LoopStmt: Do | For | Switch | While ;
 ...
 BreakStmt: 'break'
 { if(!$Stack[LoopStmt]) SynError("break is not inside do/for/switch/while.\n"); }

 List: http://www.antlr.org/mailman/listinfo/antlr-interest
 Unsubscribe: 
 http://www.antlr.org/mailman/options/antlr-interest/your-email-address
   


-- 
You received this message because you are subscribed to the Google Groups 
il-antlr-interest group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to 
il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.



[il-antlr-interest: 28210] Re: [antlr-interest] Using previously matched parser rule in decision making

2010-03-09 Thread Gavin Lambert
At 15:47 9/03/2010, Ron Burk wrote:
 It is an interesting idea for a top-down parser generator
 to just make the parsing stack of non-terminals available
 to user actions. Whether that's easy or hard depends on
 the details of how the tool generates parser code. But
 certainly knowing the context you expect to be in is
 arguably an advantage of top-down over bottom-up
 parsing, so there's an argument to be made for making
 that information available.

You can use ANTLR's scopes to do that.  There are ways to tell if 
a particular scope has been entered, how many times it has been 
entered, and to retrieve information from any of those levels.
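
To make the scope idea concrete, here is a minimal plain-Java sketch of what a scope stack lets an action ask: whether a scope has been entered, how deeply it is nested, and what data sits at the innermost level. It does not use the ANTLR runtime or ANTLR's scope syntax. ScopeStackSketch, LoopScope and every method name below are invented for illustration, and the check mirrors Ron's 'break' example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a dynamic scope stack; this is not ANTLR API.
final class ScopeStackSketch {
    // One entry per active loop-like construct; 'kind' is the data kept at that level.
    static final class LoopScope {
        final String kind;
        LoopScope(String kind) { this.kind = kind; }
    }

    private final Deque<LoopScope> loopScopes = new ArrayDeque<>();

    // Called when a rule that owns the scope is entered or left.
    void enter(String kind) { loopScopes.push(new LoopScope(kind)); }
    void exit() { loopScopes.pop(); }

    // "Has the scope been entered?" and "how many times?"
    boolean insideLoop() { return !loopScopes.isEmpty(); }
    int depth() { return loopScopes.size(); }

    // Data from the innermost level, or null when no scope is active.
    String innermostKind() {
        LoopScope top = loopScopes.peek();
        return top == null ? null : top.kind;
    }

    // Returns an error message for a 'break' outside any scope, otherwise null.
    String checkBreak() {
        return insideLoop() ? null : "break is not inside do/for/switch/while.";
    }

    public static void main(String[] args) {
        ScopeStackSketch parser = new ScopeStackSketch();
        System.out.println(parser.checkBreak());   // error: no enclosing scope yet
        parser.enter("while");
        parser.enter("switch");
        System.out.println(parser.depth() + " " + parser.innermostKind());
    }
}
```

In a generated parser the enter/exit calls would sit at the start and end of the rules that own the scope, so the check in the 'break' rule needs no extra flags.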





[il-antlr-interest: 28216] Re: [antlr-interest] Unexpected behavior - Error?

2010-03-09 Thread Christoph Schinko
Hi Bart!

Thanks for the quick answer! Adding an EOF to the rule solves the issue 
in the toy example. Unfortunately we are using custom token label types 
and are now getting a ClassCastException. It seems that we now have the 
problem mentioned here:

http://www.antlr.org/pipermail/antlr-interest/2009-November/036712.html

Any thoughts on that?

On 09.03.2010 15:04, Bart Kiers wrote:
 Hi Chris,

 Since the input ' .mine' does not contain any illegal tokens, 
 the parser simply stops, because (statement)* can also match 
 nothing. You'll want to tell your parser to keep parsing all the 
 way to the end of your token stream. Do that by adding an EOF to the 
 end of your entry point (presumably the source parser rule):

 source
   : (statement)* EOF
   ;

 Regards,

 Bart.
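
The effect Bart describes can be reproduced with a toy hand-written parser. This is only a sketch under assumptions: EofSketch and its tiny statement rule (a lowercase word followed by ';') are invented for illustration and have nothing to do with the grammar in this thread or with ANTLR-generated code. It shows that a rule of the form (statement)* succeeds without consuming anything, while the same rule followed by EOF rejects leftover input.

```java
import java.util.Arrays;
import java.util.List;

// Toy recursive-descent parser; all names are invented for this example.
final class EofSketch {
    private final List<String> tokens;
    private int pos = 0;

    EofSketch(List<String> tokens) { this.tokens = tokens; }

    // statement : ID ';'    (an ID is any token made of lowercase letters)
    private boolean statement() {
        if (pos + 1 < tokens.size()
                && tokens.get(pos).matches("[a-z]+")
                && tokens.get(pos + 1).equals(";")) {
            pos += 2;
            return true;
        }
        return false;
    }

    // source : (statement)*       stops silently at the first non-statement
    boolean sourceWithoutEof() {
        while (statement()) { /* keep consuming statements */ }
        return true; // matching zero statements is still a valid match
    }

    // source : (statement)* EOF   leftover input is reported as a failure
    boolean sourceWithEof() {
        while (statement()) { /* keep consuming statements */ }
        return pos == tokens.size();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(".", "mine");
        System.out.println(new EofSketch(input).sourceWithoutEof()); // prints true
        System.out.println(new EofSketch(input).sourceWithEof());    // prints false
    }
}
```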


-- 
Dipl.-Ing. Christoph Schinko   c.schi...@cgv.tugraz.at
Institute of Computer Graphics and Knowledge Visualization
Graz University of Technology  tel: +43 (316) 873-5416
Inffeldgasse 16c, 8010 Graz, Austria





[il-antlr-interest: 28217] Re: [antlr-interest] Unexpected behavior - Error?

2010-03-09 Thread Bart Kiers
Hi Chris, sorry, forgot to send to the list the first time!


On Tue, Mar 9, 2010 at 4:41 PM, Christoph Schinko
c.schi...@cgv.tugraz.at wrote:

  Hi Bart!

 Thanks for the quick answer! Adding an EOF to the rule solves the issue in
 the toy example. Unfortunately we are using custom token label types and are
 now getting a ClassCastException. It seems that we now have the problem
 mentioned here:

 http://www.antlr.org/pipermail/antlr-interest/2009-November/036712.html

 Any thoughts on that?


Unfortunately, I don't. I presume you have read that entire thread; if not, a
(possible) solution is given here:
http://www.antlr.org/pipermail/antlr-interest/2009-November/036719.html

Best of luck!

Regards,

Bart.










[il-antlr-interest: 28218] Re: [antlr-interest] Using previously matched parser rule in decision making

2010-03-09 Thread Jim Idle
From anywhere in the parser:

java.util.List stack = getRuleInvocationStack(e, getParserName());

But this only works for Java and other targets that copy it (I think C# might 
do it). I don't do it in C because I prefer to take the view that the C stuff 
should be as close to the metal as it can be and the programmer will choose to 
add the overheads they need.

In the JavaFX front end, this stack is used to pin down errors a little more 
precisely - as it is open source you can download the code and look at 
AbstractGeneratedParserV4.java
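
For readers wondering why this is tied to Java-like targets: the rule stack can be recovered from a stack trace, because each rule is a method of the generated parser class. The following is a self-contained sketch of that technique, not the ANTLR runtime code. RuleStackSketch, ruleInvocationStack and the three pretend rule methods are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the stack-trace technique; class and method names are invented here.
final class RuleStackSketch {
    // Collects, outermost first, the names of those methods of 'className'
    // that appear in the stack trace of 'e'.
    static List<String> ruleInvocationStack(Throwable e, String className) {
        List<String> rules = new ArrayList<>();
        StackTraceElement[] trace = e.getStackTrace();
        for (int i = trace.length - 1; i >= 0; i--) {
            if (trace[i].getClassName().equals(className)) {
                rules.add(trace[i].getMethodName());
            }
        }
        return rules;
    }

    // Pretend grammar rules: program -> whileStmt -> breakStmt
    List<String> program() { return whileStmt(); }
    List<String> whileStmt() { return breakStmt(); }
    List<String> breakStmt() {
        // A fresh Throwable captures the current call stack.
        return ruleInvocationStack(new Throwable(), getClass().getName());
    }

    public static void main(String[] args) {
        List<String> stack = new RuleStackSketch().program();
        System.out.println(stack);
        System.out.println("inside a while? " + stack.contains("whileStmt"));
    }
}
```

A query such as stack.contains("whileStmt") is roughly the "is X on the stack" test from earlier in the thread. Capturing a stack trace is not free, which fits the point about leaving such overhead to the programmer in the C target.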

Jim








[il-antlr-interest: 28219] [antlr-interest] MismatchedTokenException in simple grammar

2010-03-09 Thread Parker, Joel J. K. (GSFC-5950)
Hi all,

I'm completely new to ANTLR and EBNF grammars to begin with, so this is 
probably a basic issue I'm simply not understanding.

I have a rule such as:

version_line : WS? 'VERS' WS? '=' WS? '1.0' WS? EOL ;
WS : ' '+ ;
EOL : '\r' | '\n' | '\r\n' | '\n\r' ;

that matches a statement in my input file that looks like this (with optional 
whitespace):
VERSION = 1.0

With the rule form above, I get a successful match, but I get an 
exception with this form:

version_line : WS? 'VERS' WS? '=' WS? '1' '.0' WS? EOL ;

or this form:

version_line : WS? 'VERS' WS? '=' WS? DIGIT '.0' WS? EOL ;
DIGIT : '1' ;

Why is this different? I discovered this issue when trying to decompose the 
rule even more, hopefully ending up with something like this:

version_line : WS? 'VERS' WS? '=' WS? DIGIT '.' DIGIT WS? EOL ;
DIGIT : '0'..'9' ;

Thanks in advance,

-- 
Joel Parker




[il-antlr-interest: 28220] Re: [antlr-interest] MismatchedTokenException in simple grammar

2010-03-09 Thread Bart Kiers
FYI:
http://stackoverflow.com/questions/2412440/antlr-mismatchedtokenexception-on-simple-grammar




[il-antlr-interest: 28221] [antlr-interest] AntLRWorks Rule Debugging Error

2010-03-09 Thread Rahul Mehta
I am trying to test the whenDescriptor rule of the following grammar 
in ANTLRWorks. I get the following exception as soon as I start 
debugging. The input text for testing is: when order : OrderBll then

[16:45:07] C:\Documents and Settings\RM\My Documents\My Tools\AntLRWorks\output\__Test__.java:14: cannot find symbol
[16:45:07] symbol  : method whenDescriptor()
[16:45:07] location: class RulesTranslatorParser
[16:45:07] g.whenDescriptor();
[16:45:07]  ^
[16:45:07] 1 error

I am able to test packageDescriptor and declareDescriptor successfully. 
Does anyone know how to resolve this error? I tried various input 
strings, but debugging the rule always fails.

Here is my grammar.

grammar RulesTranslator;

options
{

backtrack=true;
memoize=true;
language=CSharp2;
}

tokens {
AND='and';
OR='or';
NOT='not';

EXISTS='exists';
EVAL='eval';
FORALL='forall';

CONTAINS='contains';
IS='is';
INSTANCEOF='instanceof';
STRSIM='strsim';
SOUNDSLIKE='soundslike';
IN='in';
NEW='new';
WITH='with';
ASSERT='assert';

ISDEF='isdef';
}

packageDescriptor
: 'package' 
qualifiedName
;

declareDescriptorList
: (declareDescriptor)*
;

declareDescriptor
: 'declare' qualifiedName 
(variableDef)+ 'end'
;

whenDescriptor
: 
//'when' ( typeRef | NOT ) (parExpression)+ 'then'
 'when' 
typeRef  'then'
;

typeRef
:  (Identifier | 
variableDef)
;

primitiveType
:   'boolean'

|'char'
|'byte'
|'short'
|'int'
|'long'
|'float'
|'double'
;

qualifiedNameList
: qualifiedName 
(',' qualifiedName)*
;

qualifiedName
: 
Identifier ('.' Identifier)*
;

literal
: 
integerLiteral
| FloatingPointLiteral
| CharacterLiteral
| StringLiteral
| booleanLiteral
| 'null'
;

integerLiteral
:   HexLiteral
|   OctalLiteral
|   DecimalLiteral
;

booleanLiteral
:   'true'
|   'false'
;

elementValuePairs
: elementValuePair (',' elementValuePair)*
;

elementValuePair
: (Identifier '=')? 
conditionalExpression
;

variableDef
: ( Identifier ':' Identifier | Identifier ':' qualifiedName ) 
;

// STATEMENTS / BLOCKS
chunk : (statement (';')?)*
;

block : chunk EOF;

statement
: 'while' parExpression 
statement
| 'do' statement 'while' parExpression ';'

| 'switch' parExpression '{' switchBlockStatementGroups '}'

| 'return' expression? ';'
| 'break' Identifier? ';'
| 'continue' Identifier? ';'
//| 'when' parExpression 'then' (statement)? 'end'
|  
statementExpression
|  Identifier ':' statement
;

switchBlockStatementGroups
:   
(switchBlockStatementGroup)*
;

switchBlockStatementGroup
:   switchLabel statement*
;

switchLabel
:  
'case' constantExpression ':'
|  'default' ':'
;

moreStatementExpressions
:  (',' statementExpression)*
;

fieldseperator 
: (',' | ';')
;

logicalOperator
 : ('&&' | '||' | '~=') 
;

parExpression
:   '(' expression ')'
;

expressionList

:   expression (',' expression)*
;

statementExpression
:   expression
;

constantExpression
:   
expression
;

expression
:   
conditionalExpression (assignmentOperator expression)?
;

assignmentOperator
:   '='
|   '+='
|   '-='
|   '*='
|   '/='
|   '&='
|   '|='
|   '^='
|   '%='
|   '<' '<' '='
|   '>' '>' '='
|   '>' '>' '>' '='
;

conditionalExpression
:   conditionalOrExpression ( '?' expression ':' expression )?
;

conditionalOrExpression
:   conditionalAndExpression ( ( '||' | OR ) conditionalAndExpression 
)*
;

conditionalAndExpression
:   inclusiveOrExpression ( ( '&&' | AND ) inclusiveOrExpression )*
;

inclusiveOrExpression
:   exclusiveOrExpression ( '|' exclusiveOrExpression )*
;

exclusiveOrExpression
:   andExpression ( '^' andExpression )*
;

andExpression
:   equalityExpression ( '&' equalityExpression )*
;

equalityExpression
:   relationalExpression ( ('==' | '!=') relationalExpression )*
;

relationalExpression
:   shiftExpression ( relationalOp shiftExpression )*
;

relationalOp
:   ('<' '=' | '>' '=' | '<' | '>')
;

shiftExpression
:   additiveExpression ( shiftOp additiveExpression )*
;

shiftOp
:   ('<' '<' | '>' '>' '>' | '>' '>')
;

additiveExpression
:   multiplicativeExpression ( ('+' | '-') multiplicativeExpression )*
;

multiplicativeExpression
:   
unaryExpression ( ( '*' | '/' | '%' ) unaryExpression )*
;

unaryExpression
:   '+' unaryExpression
|'-' 
unaryExpression
|   '++' primary
|   '--' primary


[il-antlr-interest: 28223] [antlr-interest] Unicode lexing

2010-03-09 Thread Jonathan S. Shapiro
I know this topic has come up before, and sorry to bring it up again.

Context: I'm bringing up BitC on CLI, and planning to use antlr to do it.
BitC characters cover the full unicode (20 bit) range.

The good news:
Characters above U+ can only appear in character and string literals.




[il-antlr-interest: 28224] [antlr-interest] Stopping parser and lexer at first error

2010-03-09 Thread Corrado Campisano
Hi all,

I need to catch every syntax error: letting the lexer insert or delete
characters, or letting the parser carry on after only printing a message to
System.err, could be very dangerous for my application. So I looked at the
reference (which contains information that is no longer valid) and on the
internet, and found several hints and articles:

Why the generated parser code tolerates illegal expression?
http://www.antlr.org/wiki/pages/viewpage.action?pageId=4554943

How can I make the lexer exit upon first lexical error?
http://www.antlr.org/wiki/pages/viewpage.action?pageId=5341217

http://www.antlr.org/wiki/display/ANTLR3/Custom+Syntax+Error+Recovery

[antlr-interest] I want to throw an exception and stop parse, please!
http://www.antlr.org/pipermail/antlr-interest/2009-May/034605.html

I think I have found a way to do this; once validated, it may be worth
publishing on the wiki.


I just added the following overrides to my grammar (attached):

@parser::members
{
    // Unchecked exception carrying the token the parser was looking at
    // when the mismatch was detected.
    public class ParserException extends RuntimeException {
        Object objCurrentInputSymbol = null;

        public ParserException(Object oCurrentInputSymbol) {
            this.objCurrentInputSymbol = oCurrentInputSymbol;
        }
    }

    // Abort on a mismatched token instead of attempting recovery.
    protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow) throws RecognitionException {
        System.out.println("PARSER : this.getCurrentInputSymbol(input).toString() : " + this.getCurrentInputSymbol(input).toString());
        System.out.println("PARSER : this.failed() : " + this.failed());
        System.out.println("PARSER : this.getNumberOfSyntaxErrors() : " + this.getNumberOfSyntaxErrors());
        throw new ParserException(this.getCurrentInputSymbol(input));
    }
}

@lexer::members
{
    // Unchecked exception wrapping the lexer's RecognitionException together
    // with the formatted error header and message.
    public class LexerException extends RuntimeException {
        RecognitionException recognitionException = null;
        String strErrorHeader = null;
        String strErrorMessage = null;

        public LexerException(RecognitionException recExc, String sHead, String sMsg) {
            this.recognitionException = recExc;
            this.strErrorHeader = sHead;
            this.strErrorMessage = sMsg;

            System.out.println("LEXER : ErrorHeader : " + sHead);
            System.out.println("LEXER : ErrorMessage : " + sMsg);
            System.out.println("LEXER : RecognitionException : " + this.recognitionException.toString());
        }
    }

    // Throw instead of printing the error and carrying on.
    public void reportError(RecognitionException recExc) {
        throw new LexerException(recExc, this.getErrorHeader(recExc), getErrorMessage(recExc, this.getTokenNames()));
    }
}


Then I tested it with a simple class:
public static void main(String[] args) {
    testLexerError();
    testParserError();
}

private static void testLexerError() {
    String strDlToParse = "{CORRADO PIPPO ;feee}";
    System.out.println("TESTING LEXER with : " + strDlToParse);
    testError(strDlToParse);
}

private static void testParserError() {
    String strDlToParse = "{CORRADO PIPPO feee} dhert";
    System.out.println("TESTING PARSER with : " + strDlToParse);
    testError(strDlToParse);
}

private static void testError(String strDlToParse) {
    CommonTree tree = null;
    String strError = null;

    ANTLRStringStream input = new org.antlr.runtime.ANTLRStringStream(strDlToParse);
    Dl2OwlJavaBLexer lexer = new Dl2OwlJavaBLexer(input);
    TokenStream tokens = new org.antlr.runtime.CommonTokenStream(lexer);
    Dl2OwlJavaBParser parser = new Dl2OwlJavaBParser(tokens);

    try {
        // this may raise an exception
        // TODO : check why NO EXCEPTION is raised with error "line 1:9
        // no viable alternative at character ';'" on inputs like {CORRADO ;}
        eu.servicemix.dl2owl.Dl2OwlJavaBParser.axiom_return ret = parser.axiom();

        // TODO : check if this will be executed if no exception is raised
        tree = (CommonTree) ret.getTree();

        printTreeHelper(tree);

    } catch (RecognitionException e) {
        System.out.println(e.toString());
        e.printStackTrace();

    } catch (RuntimeException e) {
        System.out.println(e.toString());
        e.printStackTrace();
    }
}
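
Stripped of the ANTLR specifics, the approach is to override an error hook so that it throws an unchecked exception rather than recording the error and continuing. The sketch below shows that pattern in isolation. It is an assumption-laden toy: FailFastSketch, LenientLexer and StrictLexer are invented names, and the "lexer" merely accepts letters, spaces and braces. It says nothing about whether the overrides above are the recommended way to do this in ANTLR.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of "recover and continue" versus "stop at first error".
// Nothing here is ANTLR API; all names are invented for this example.
final class FailFastSketch {
    static class LenientLexer {
        final List<String> errors = new ArrayList<>();

        // Default policy: record the problem and keep going.
        void reportError(int index, char c) {
            errors.add("no viable alternative at character '" + c + "' at index " + index);
        }

        // Accepts letters, spaces and braces; returns how many characters were accepted.
        int scan(String input) {
            int accepted = 0;
            for (int i = 0; i < input.length(); i++) {
                char c = input.charAt(i);
                if (Character.isLetter(c) || c == ' ' || c == '{' || c == '}') {
                    accepted++;
                } else {
                    reportError(i, c); // if this returns, the character is skipped
                }
            }
            return accepted;
        }
    }

    static class LexerException extends RuntimeException {
        LexerException(String message) { super(message); }
    }

    // Strict variant: the overridden hook throws, so scanning stops immediately.
    static class StrictLexer extends LenientLexer {
        @Override
        void reportError(int index, char c) {
            throw new LexerException("no viable alternative at character '" + c + "' at index " + index);
        }
    }

    public static void main(String[] args) {
        String input = "{CORRADO PIPPO ;feee}";
        LenientLexer lenient = new LenientLexer();
        lenient.scan(input);
        System.out.println("lenient errors: " + lenient.errors.size());
        try {
            new StrictLexer().scan(input);
        } catch (LexerException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```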


The output looks ok, I wonder whether the whole 'trick' is too...

TESTING LEXER with : {CORRADO PIPPO *;*feee}
LEXER : ErrorHeader : line 1:15
LEXER : ErrorMessage : no viable alternative at character ';'
LEXER : RecognitionException : NoViableAltException(';'@[1:1: Tokens : (
T__37 | T__38 | T__39 | T__40 | HAS_VALUE | ALL_VALUES | SOME_VALUES | DOT |
HAS_CARD | MIN_CARD | MAX_CARD | NOT | AND | OR | URI_REF | INT_VALUE | WS |
CTRL_CHAR );])
eu.servicemix.dl2owl.Dl2OwlJavaBLexer$LexerException
eu.servicemix.dl2owl.Dl2OwlJavaBLexer$LexerException
at
eu.servicemix.dl2owl.Dl2OwlJavaBLexer.reportError(Dl2OwlJavaBLexer.java:69)
at 

[il-antlr-interest: 28226] Re: [antlr-interest] Using previously matched parser rule in decision making

2010-03-09 Thread Gokulakannan Somasundaram



 java.util.List stack = getRuleInvocationStack(e, getParserName());

 But this only works for Java and other targets that copy it (I think C#
 might do it). I don't do it in C because I prefer to take the view that the
 C stuff should be as close to the metal as it can be and the programmer will
 choose to add the overheads they need.


That makes a lot of sense, but can you tell me how a C programmer can keep
track of the rules the parser passes through during lookahead? I think this
question is also valid for someone using the Java target. The action you
suggest can be executed during normal parsing, but if it is used in
semantic predicates, then we have to be sure that it also gets executed
during lookahead. Can we create such a facility?

What I mean is that a set of actions, if put inside some special construct,
would get executed while evaluating syntactic and semantic predicates,
possibly with a rollback action. Please let me know if I am
misunderstanding the concept somewhere.
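
One possible reading of this proposal, sketched in plain Java: actions that maintain a rule stack run even while the parser is speculating, and a mark/rewind pair undoes their effects afterwards. This is purely illustrative. SpeculationSketch and its methods are hypothetical names, and the sketch makes no claim about how ANTLR evaluates predicates or whether it offers such a construct.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "actions that run during lookahead, with rollback".
final class SpeculationSketch {
    private final List<String> ruleStack = new ArrayList<>();

    // Actions that should run even while speculating.
    void enterRule(String name) { ruleStack.add(name); }
    void exitRule() { ruleStack.remove(ruleStack.size() - 1); }
    boolean isOnStack(String name) { return ruleStack.contains(name); }

    // Remember the current state before a predicate starts looking ahead.
    int mark() { return ruleStack.size(); }

    // Undo every action performed since the matching mark().
    void rewind(int marker) {
        while (ruleStack.size() > marker) {
            ruleStack.remove(ruleStack.size() - 1);
        }
    }

    // Speculatively "enters" a loop to evaluate a predicate, then rolls back.
    boolean breakAllowedInLookahead() {
        int marker = mark();
        enterRule("whileStmt");               // side effect performed during lookahead
        boolean ok = isOnStack("whileStmt");  // the predicate sees the speculative state
        rewind(marker);                       // rollback: normal parsing starts clean
        return ok;
    }

    public static void main(String[] args) {
        SpeculationSketch parser = new SpeculationSketch();
        parser.enterRule("program");
        System.out.println(parser.breakAllowedInLookahead());
        System.out.println(parser.isOnStack("whileStmt"));
    }
}
```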

Thanks,
Gokul.
