[il-antlr-interest: 32700] [antlr-interest] issue with antlr requiring a whitespace at a specific place

2011-06-08 Thread Olivier Sallou
Hi,
I have an issue with ANTLRWorks (1.4.2): for a specific grammar,
it requires a whitespace in the input.
I upgraded from ANTLRWorks 1.1.7, where the same grammar did not require
the whitespace.

example:
'?' string
| '%' string ':' percentage=INT
| ...

string: '\\' LOWID '\\';
LOWID: ('a'..'z'|'\-')+;
INT :   ('0'..'9')+ ;

If I call my example rule with:
  ?\acgt\
it works fine,
but if I call it with:
 %\acgt\:30

it fails.

If I add a whitespace between % and \acgt\, it works:
 % \acgt\:30

I really can't understand why a whitespace is required here, and only
here.

Thanks for your help

Olivier


-- 
gpg key id: 4096R/326D8438  (pgp.mit.edu)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438



List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

-- 
You received this message because you are subscribed to the Google Groups 
il-antlr-interest group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to 
il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.



[il-antlr-interest: 32703] Re: [antlr-interest] issue with antlr requiring a whitespace at a specific place

2011-06-08 Thread Bart Kiers
Well, your grammar is now quite different from the few rules you posted (more
rules, and not everything is visible, so I can't test it myself).
All I can say is that the interpreter in ANTLRWorks has quite a few
odd quirks, so it is best not to use it. If something seems odd in the interpreter,
either create a little test rig yourself or use ANTLRWorks' debugger to
find out whether the error lies in your grammar or in the interpreter.

Good luck!

Regards,

Bart.


On Wed, Jun 8, 2011 at 1:04 PM, Olivier Sallou olivier.sal...@irisa.fr wrote:

 With the same test, I get a mismatched token. I simplified it as much as
 possible (see the first line of the attached screenshot).

 However, in the interpreter tab of my editor (ANTLRWorks) I see: Ignore
 rules: WHITESPACE.

 I wonder why; I did not ask for this rule to be ignored, and I do not see
 how to remove it.

 Maybe this occurs in the generated code too.

 Olivier

 On 6/8/11 12:58 PM, Bart Kiers wrote:
  Hi Olivier,
 
  I can't reproduce it. I tested with ANTLRWorks 1.4.2 as well.
  See the attached screenshot.
 
  Regards,
 
  Bart.
 
 




[il-antlr-interest: 32705] Re: [antlr-interest] Something in my little grammar throws an Unable to cast CommonTree to type GrammarAST

2011-06-08 Thread Jim Idle
You are looking too far down the error message list. Fix the error at:


 error(100): XMLParser.g:29:11: syntax error: antlr:


First.

Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Arturo Hernandez
 Sent: Wednesday, June 08, 2011 7:21 AM
 To: AN TLR
 Subject: [antlr-interest] Something in my little grammar throws an
 Unable to cast CommonTree to type GrammarAST


 This is a modified version of the xml example from the ANTLR website.
 Eventually I want to extract data from XHTML.
 I probably just need a second to spot a simple mistake. I have the
 reference book, not finished it yet, but did spend plenty of time on
 this and I am stuck.
 error(100): XMLParser.g:29:11: syntax error: antlr:
   NoViableAltException(72@[475:4: ( ( id ( ASSIGN | PLUS_ASSIGN ) ( atom
   | block ) ) (sub= ebnfSuffix[root_0,false] )? |a= atom (sub2=
   ebnfSuffix[$a.tree,false] )? | ebnf | FORCED_ACTION | ACTION |p=
   SEMPRED ( IMPLIES )? |t3= tree_ )])
 error(100): XMLParser.g:29:11: syntax error: antlr:
   NoViableAltException(72@[475:4: ( ( id ( ASSIGN | PLUS_ASSIGN ) ( atom
   | block ) ) (sub= ebnfSuffix[root_0,false] )? |a= atom (sub2=
   ebnfSuffix[$a.tree,false] )? | ebnf | FORCED_ACTION | ACTION |p=
   SEMPRED ( IMPLIES )? |t3= tree_ )])
 error(100): XMLParser.g:0:1: syntax error: assign.types:
   MismatchedTreeNodeException(0!=32)
 error(100): XMLParser.g:0:1: syntax error: assign.types:
   MismatchedTreeNodeException(3!=33)
 error(100): XMLParser.g:0:1: syntax error: assign.types:
   MismatchedTreeNodeException(3!=34)
 error(10):  internal error: XMLParser.g : System.InvalidCastException:
   Unable to cast object of type 'Antlr.Runtime.Tree.CommonTree' to type
   'Antlr3.Tool.GrammarAST'.


 parser grammar XMLParser;
 options { language=CSharp3; tokenVocab=XMLLexer; }
 @header { using System; }
 @namespace { XMLParserN }

 document     : element ;
 element      : startTag (element | PCDATA)* endTag
              | emptyElement ;
 startTag     : TAG_START_OPEN GENERIC_ID { Console.Write(@ + $GENERIC_ID.text); }
                (attribute { if $attribute.cl!=@ then Console.Write(@ class=\ + $attribute.cl + @\); } )*
                TAG_CLOSE { Console.Write(@ + $GENERIC_ID.text); }
 attribute returns [string cl]
              : GENERIC_ID ATTR_EQ ATTR_VALUE
                { if ($GENERIC_ID.text==@class) $cl = $ATTR_VALUE.text else $cl = @; } ;
 endTag       : TAG_END_OPEN GENERIC_ID TAG_CLOSE
                { Console.Write(@/ + $GENERIC_ID.text + @); } ;
 emptyElement : TAG_START_OPEN GENERIC_ID (attribute)* TAG_EMPTY_CLOSE ;




[il-antlr-interest: 32706] Re: [antlr-interest] Something in my little grammar throws an Unable to cast CommonTree to type GrammarAST

2011-06-08 Thread Arturo Hernandez
Thanks Jim,
I looked at XMLParser.g:29:11 before making the first post. That pointed me
to "returns" in this rule. My only idea is that there is something I need to do
at the top of the grammar definition, such as setting the output to AST, before
using "returns". At that point I started to backtrack and looked at the rest of
the stack.
I am still stuck.

attribute returns [string cl]   : GENERIC_ID ATTR_EQ ATTR_VALUE





[il-antlr-interest: 32707] Re: [antlr-interest] Something in my little grammar throws an Unable to cast CommonTree to type GrammarAST

2011-06-08 Thread Arturo Hernandez

Came home and tried again with a clearer head.
And this time I found my missing ';' plus many other syntax errors ;)
The build action for VS2010 MSBuild works great too!!! All compiled and 
executed perfectly!!

parser grammar XMLParser;
options { language=CSharp3; tokenVocab=XMLLexer; }
@header { using System; }
@namespace { XMLParserN }

public document : element ;
element      : startTag (element | PCDATA)* endTag
             | emptyElement ;
startTag     : TAG_START_OPEN GENERIC_ID { Console.Write(@ + $GENERIC_ID.text); }
               (attribute)* TAG_CLOSE { Console.Write(@); } ;
attribute    : GENERIC_ID ATTR_EQ ATTR_VALUE
               { if ($GENERIC_ID.text==class) Console.Write(@ class= + $ATTR_VALUE.text); } ;
endTag       : TAG_END_OPEN GENERIC_ID TAG_CLOSE
               { Console.WriteLine(@/ + $GENERIC_ID.text + @); } ;
emptyElement : TAG_START_OPEN GENERIC_ID (attribute)* TAG_EMPTY_CLOSE ;



[il-antlr-interest: 32708] Re: [antlr-interest] New Guy Question...

2011-06-08 Thread William Clodius
Note that matching in terms of UPPER case is generally a bad idea. There are
languages with characters that never appear at the start of a word. Because upper
case is used primarily to mark the start of words in selected contexts, such
characters need not have a proper mapping to upper case. The German ß is the
best-known such character in languages with Latin-based character sets, but it
is not the only example. However, if a language has a notion of case, there is
always a mapping to lower case, and for simple case folding that mapping is to
be preferred.
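
A small sketch of the asymmetry described above, using Python's built-in string methods. Python is used only for illustration (the thread itself is about ANTLR lexers), and the sample word is an assumption chosen for the demo. Upper-casing ß expands it to two characters, while lower-casing leaves the text unchanged in length.

```python
# Illustration only: how ß behaves under the default Unicode case mappings
# exposed by Python's str methods.
word = "Straße"

upper = word.upper()   # the default upper-case mapping expands ß to "SS"
lower = word.lower()   # ß is already lower case and is left alone

print(upper)                  # STRASSE
print(lower)                  # straße
print(len(word), len(upper))  # 6 7

# Matching in upper case cannot tell the two spellings apart...
print("Straße".upper() == "Strasse".upper())  # True
# ...whereas matching in lower case keeps them distinct.
print("Straße".lower() == "Strasse".lower())  # False
```

The length change is the practical problem for a lexer that matches one input character at a time: a one-to-one lower-case mapping never changes how many characters are consumed.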

In many ways the problem of dealing with case is similar to the problem of
dealing with normalization, where the same character can be represented by more
than one combination of code points. As part of its work on normalization, the
Unicode Consortium recommended a couple of straightforward means of dealing with
identifier uniqueness in programming languages. These are covered in Unicode
Standard Annex #31, Unicode Identifier and Pattern Syntax:
http://www.unicode.org/reports/tr31/
They have a straightforward implementation in terms of the Unicode character
property tables, and it is a small matter of programming to implement their
lexical classes for identifiers.
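
As a rough illustration of the normalization point, the sketch below compares two spellings of the same identifier using Python's standard unicodedata module. The helper names identifier_key and caseless_identifier_key are hypothetical, and applying casefold() after NFKC only approximates the case-insensitive folding described in the annex; it is not a complete implementation of it.

```python
import unicodedata

def identifier_key(name: str) -> str:
    """Hypothetical helper: key for case-sensitive identifier comparison (NFC)."""
    return unicodedata.normalize("NFC", name)

def caseless_identifier_key(name: str) -> str:
    """Hypothetical helper: approximate key for case-insensitive comparison."""
    # NFKC followed by casefold() is an approximation, not the exact
    # NFKC_Casefold mapping defined by Unicode.
    return unicodedata.normalize("NFKC", name).casefold()

precomposed = "caf\u00e9"   # é as a single code point
decomposed = "cafe\u0301"   # e followed by a combining acute accent

# The raw strings differ even though they render identically.
print(precomposed == decomposed)                                  # False
# After normalization they compare equal.
print(identifier_key(precomposed) == identifier_key(decomposed))  # True
# Case-insensitive comparison of differently composed spellings.
print(caseless_identifier_key("CAFE\u0301")
      == caseless_identifier_key("caf\u00e9"))                    # True
```

A lexer or symbol table would apply such a key when looking identifiers up, while keeping the original text for error messages.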

On Jun 6, 2011, at 4:56 PM, Jim Idle wrote:

 No, that is not correct, please look at the WIKI article. The input stream
 merely MATCHES in upper case, it does NOT change the input stream itself,
 hence both the keywords and anything else are case preserved when you ask
 for their text; that is the whole point of doing it that way. Then you
 specify the tokens in the lexer using upper case only and it has the side
 effect of simplifying the lexer rules as well as not creating a method
 call to match every letter of every keyword (which is a bad idea even with
 JIT inlining).
 
 Jim
 
 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Douglas Godfrey
 Sent: Monday, June 06, 2011 12:41 PM
 To: Marco Hunsicker
 Cc: antlr-interest@antlr.org
 Subject: Re: [antlr-interest] New Guy Question...
 
 When you implement case insensitive keywords, you may still want case
 sensitive identifiers.
 If the input stream does case folding, you can't use case sensitive
 identifiers.
 
 On Sun, Jun 5, 2011 at 5:58 PM, Marco Hunsicker de...@hunsicker.de
 wrote:
 
 You have to handle case insensitivity the hard way:
 
 fragment A
 :'A' | 'a';
 
 [...]
 
 I don't think it's a necessity to do it this way. Actually, I think it
 would be better to use a specialized input stream that does any
 necessary transformation. Your mileage may vary ;)
 
 Cheers,
 
 Marco
 



[il-antlr-interest: 32709] Re: [antlr-interest] [CSharp3] rule visibility in composite grammars

2011-06-08 Thread Sam Harwell
I'm also very interested in ways to make ANTLR grammars
target-language-agnostic. I know ANTLR version 4 will provide more
consistent semantics across a number of language features, but I'm not sure
whether specific symbol table support like you mention will be included. The
problem with such a feature is that it increases the complexity of the grammar
language specification, increases the size of the runtime and/or the
complexity of the code generation templates (making things more difficult
for target developers), and only covers the semantic language features needed
by a small number of users.

 

One idea I had is for a target-agnostic call syntax that could be used in
semantic predicates and actions. It could use a form such as the following:

 

@{FunctionName(arg1,arg2,arg3)}

 

Where an argument can be one of the following:

- A reference to a token, rule return value, or label in the rule.
- A reference to an argument passed to the rule.
- A reference to a value in an attribute scope.

The target would then declare a user-definable method with the appropriate
parameter types. The Java and CSharp2 targets could create an abstract
method, the CSharp3 target could create a partial method, and the C/C++
targets could declare the method in a generated header.
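
To make the idea concrete, here is a rough sketch of the kind of code a target might emit for such a call, following the abstract-method variant. It is written in Python purely for brevity. Every name in it (GeneratedParserBase, is_new_symbol, new_variable_name, MyParser) is hypothetical and does not correspond to any existing ANTLR target or runtime API.

```python
from abc import ABC, abstractmethod

class GeneratedParserBase(ABC):
    """Stand-in for generated parser code; not real ANTLR output."""

    @abstractmethod
    def is_new_symbol(self, identifier_text: str) -> bool:
        """Hook the generator would declare for a call such as
        @{is_new_symbol(Identifier)} in the grammar (hypothetical)."""

    def new_variable_name(self, identifier_text: str) -> str:
        # Generated rule body: the target-agnostic call in the grammar
        # becomes a plain method call on the user-implemented hook.
        if not self.is_new_symbol(identifier_text):
            raise ValueError(f"symbol already defined: {identifier_text}")
        return identifier_text

class MyParser(GeneratedParserBase):
    """User-written subclass supplying the target-specific behaviour."""

    def __init__(self) -> None:
        self.symbols: set[str] = set()

    def is_new_symbol(self, identifier_text: str) -> bool:
        return identifier_text not in self.symbols

    def new_variable_name(self, identifier_text: str) -> str:
        name = super().new_variable_name(identifier_text)
        self.symbols.add(name)  # record the symbol once the rule succeeds
        return name

parser = MyParser()
print(parser.new_variable_name("x"))  # x
```

The grammar itself stays free of target code; only the subclass knows how symbols are stored.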

 

Sam

 

From: Douglas Godfrey [mailto:douglasgodf...@gmail.com] 
Sent: Tuesday, June 07, 2011 8:09 AM
To: Ranco Marcus
Cc: Sam Harwell; antlr-interest@antlr.org
Subject: Re: [antlr-interest] [CSharp3] rule visibility in composite
grammars

 

Two months ago I submitted a feature request for an ANTLR built-in symbol
table to support the common requirements of the majority of block-structured
languages. By making the SymbolTable part of the ANTLR grammar language, the
interface can be much cleaner. The implementation of the SymbolTable classes
would be part of the target runtime(s).

i.e. 

new_variable_name:
(Identifier.IsNewSymbol()) = Identifier.AddSymbol();

SymbolTable - NameSpace - SymbolScope - Symbol - Attribute-List {
optional for structs - NameSpace }  

On Tue, Jun 7, 2011 at 5:56 AM, Ranco Marcus ranco.mar...@epirion.nl
wrote:

Yes, that sounds like a good idea. I would definitely be in favour of
merging the grammars before generating the code. My only concern is that
this approach would deviate from the general ANTLR approach.

In general, I have found ANTLR to be a great tool for parser generation, but
never really liked the way target specific actions are mixed with the
grammar definition. Ideally, I would like my grammars to be _completely_
target agnostic (no actions, no visibility modifiers, members, headers,
superClass definitions, etc.) and have an abstract mechanism that we can use
to hookup actions and implementation specific stuff to the generated
grammar.

Do you know if there are plans to redesign the composite grammar feature in
v4?

Best regards,

Ranco


 -Original Message-
 From: Sam Harwell [mailto:sharw...@pixelminegames.com]
 Sent: Sunday, May 29, 2011 11:08 AM
 To: Ranco Marcus; antlr-interest@antlr.org
 Subject: RE: [antlr-interest] [CSharp3] rule visibility in composite
grammars

 I'm not going to be able to address this issue until the second week of
 June.

 That said, it seems the best way to handle all these issues with delegate
 grammars is to inline their rules before code generation. Suppose you have
 grammar C importing A and B, and you also have D importing A and B. The
 code generation will result in classes C, C_A, C_B, D, D_A, and D_B. Clearly
 the independent generation of C_A and D_A during code generation does
 not allow a single instance of the imported A grammar to be shared by C and
 D. If we instead flatten the imported grammar hierarchy and only generate
 classes C and D, then everything behaves like it was written in a single
 grammar. Do you see any immediate problems with this potential approach?

 -Original Message-
 From: Ranco Marcus [mailto:ranco.mar...@epirion.nl]
 Sent: Wednesday, May 25, 2011 4:26 PM
 To: Sam Harwell; antlr-interest@antlr.org
 Subject: RE: [antlr-interest] [CSharp3] rule visibility in composite
grammars

 Hi Sam/all,

 When a (tree) grammar C imports (tree) grammars A and B, where grammar
 A calls a rule R from grammar B, a call is being made from delegate parser C_A
 to a delegate rule R (targeting C_B) in the composite parser C (its parent).

 Now that the visibility of the delegate rules in C matches the visibility of the
 imported grammar, the rule R has to be made public for the above to work.
 In our grammars, we build up internal structures that are subsequently
 processed. In our case, that means that all those internal structures have to
 be made public as well. This could be solved by allowing ANTLR rules to have
 'internal' visibility. Also, imported grammars can probably remain internal as
 well.

 What are your thoughts on this?

 Thanks, Ranco

  -Original Message-
  From: Sam Harwell