[il-antlr-interest: 28890] Re: [antlr-interest] help please

2010-05-19 Thread Bart Kiers
On Wed, May 19, 2010 at 2:10 AM, Ernesto Castillo
hcast...@rocketmail.com wrote:

 Hello everybody, my name is Ernesto and I am asking for help with ANTLR
 programming. I am a newbie at this and in the second semester of my master's,
 12 years after finishing my degree in computer science; because of
 circumstances I never worked in the computer field, but I am planning to get
 into it. This semester I am taking a compilers course, and my first
 programming assignment went really badly because I am not clear on how to
 put Java together with ANTLR. I know how Java works because I took it in my
 first semester and used it with Eclipse. I think I have installed ANTLR 3.2
 properly, but I do not know whether I also have to install ANTLRWorks as the
 IDE. I tried to write the main Java class in Eclipse to invoke ANTLR, but it
 never worked, so I feel lost at sea. I have the ANTLR book, but it looks like
 it covers the old version. My computer is a Mac. I would appreciate the help,
 thanks.


Scott  Stanchfield has written some excellent video tutorials starting from
the very basics (setting up ANTLR with Eclipse). Have a look at them:
http://javadude.com/articles/antlr3xtut/

Kind regards,

Bart Kiers.

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

-- 
You received this message because you are subscribed to the Google Groups 
il-antlr-interest group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to 
il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.



[il-antlr-interest: 28892] [antlr-interest] The Java Method that Generates the Lexer and the Parser

2010-05-19 Thread Sameh W. Zaky
Dear All,

I am a Java developer using ANTLR 1.3.1.
I am working in a dynamic environment, so my grammar changes over
time due to continuous changes in the vocabulary.
So I was thinking of generating my *.g grammar file automatically rather
than writing it by hand.

But now I face the problem that I cannot find the runtime method that takes
the grammar file as input and generates the tokens file, the lexer .java
file, and the parser .java file. In other words, I simply want
the method that does exactly the same task as the Generate Code option in
the Generate menu in ANTLR 1.3.1 :-)

Any help?
Thanks in Advance ;-)

-- 
Sameh W. Zaky
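
One possible approach, as a hedged sketch rather than a confirmed recipe: ANTLR 3's command-line entry point is the class org.antlr.Tool, and launching it as a child process from Java should regenerate the tokens file, lexer and parser whenever the .g file changes. The jar name, grammar name and output directory below are placeholders, and GrammarCodegen is a hypothetical helper, not part of ANTLR.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class GrammarCodegen {
    // Builds the command line "java -cp <jar> org.antlr.Tool -o <dir> <grammar>".
    // All three arguments are caller-supplied placeholders.
    static List<String> buildCommand(String antlrJar, String grammarFile, String outputDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(antlrJar);
        cmd.add("org.antlr.Tool"); // ANTLR 3 command-line main class
        cmd.add("-o");             // output directory for generated files
        cmd.add(outputDir);
        cmd.add(grammarFile);
        return cmd;
    }

    // Runs the tool as a child process and returns its exit code.
    static int generate(String antlrJar, String grammarFile, String outputDir)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(antlrJar, grammarFile, outputDir))
                .inheritIO() // show ANTLR's warnings and errors on our console
                .start();
        return p.waitFor();
    }
}
```

Running the tool in a separate process keeps a failed generation from taking down the calling application; calling the class in-process is another option, assuming the ANTLR jar is on the application's classpath.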




[il-antlr-interest: 28893] Re: [antlr-interest] The Java Method that Generates the Lexer and the Parser

2010-05-19 Thread Sameh W. Zaky
Sorry, I meant ANTLRWorks 1.3.1..

-- 
Sameh W. Zaky




[il-antlr-interest: 28894] Re: [antlr-interest] Skip subtree in tree grammar

2010-05-19 Thread Jan H. van der Ven
Hello list,


Did anyone solve this? I have a similar problem with a grammar I took 
from this list (Eval.g and Simple.g). It concerns the wildcard (.).

ifElse
scope {
   bool expResult;
} :
^(
   IFTHEN b = expression { $ifElse::expResult = b; }
   (
      {$ifElse::expResult == true}?=> actionSequence
      | . // if expResult == false, no action required but eat the token
   )
 )
|
^(
   IFTHENELSE b = expression { $ifElse::expResult = b; }
   (
      // if expResult == true, call the 'then' action and 'eat' the else action
      {$ifElse::expResult == true}? actionSequence .
      // if expResult == false, 'eat' the 'then' action and call the else action
      | . actionSequence
   )
 );
On nested statements this fails to throw away the 'false' part of the tree.
How can I fix that?

Kind regards,


Jan


On 7-5-2009 20:38, Martijn Reuvers wrote:
 Hello!

 I tried it, but neither works. :/ I ran it against a snapshot of the 3.1.4
 runtime that I built with Maven (3.1.3 has the same errors, by the way):

 The skip option says when run:
 * Wildcard invalid as root; wildcard can itself be a tree.

 As for the | * option it still has a similar error as before:
 * node from after line 22:12 no viable alternative at input 'DOWN'.

 This is what I have for the |*
 --
 bool_function_content[Boolean value]
 scope {
   Boolean t;
 }
 @init {
   $bool_function_content::t = $value;
 }
   : {$bool_function_content::t != null &&
 $bool_function_content::t.booleanValue() }?=> function_content*
   | .*
   ;   

 Any thoughts?

 Martijn




[il-antlr-interest: 28895] Re: [antlr-interest] SKIP() vs skip() in 'C' runtime

2010-05-19 Thread Jim Idle
Why?

:%s/skip()/SKIP()/g

However, it is a macro defined in the generated code, so all you need to do is:

#define skip() SKIP()

in an @section that follows the macro definition of SKIP.

Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Alan Condit
 Sent: Tuesday, May 18, 2010 9:42 PM
 To: antlr-interest@antlr.org
 Subject: [antlr-interest] SKIP() vs skip() in 'C' runtime
 
 Where is the code for SKIP() found in the 'C' runtime? I had SKIP() in
 my C code version of the parser then I had to move to Java to find some
 bugs in my grammar. There I had to change SKIP() to skip(). Now I am
 going back to 'C' but I would like to change the 'C' runtime so that it
 will accept the lowercase skip().
 
 Thanks,
 Alan
 ---
 
 Alan Condit
 1085 Tierra Ct.
 Woodburn, OR 97071
 
 Email -- acon...@ipns.com
 Home-Office (503) 982-0906
 
 



[il-antlr-interest: 0] Re: [antlr-interest] another question about custom lexer

2010-05-19 Thread Jim Idle
Well, what language are you talking about? What are you trying to achieve? Why 
do you think you need a custom lexer? 

http://perl.plover.com/Questions.html


Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of ante...@freemail.hu
 Sent: Wednesday, May 19, 2010 1:59 AM
 To: antlr-interest@antlr.org
 Subject: [antlr-interest] another question about custom lexer
 
 Hi,
 
 I have a hand-made lexer that returns tokens.
 Let us say it has a function string getnexttoken(int tokentype);
 
 How would you plug that in the Antlr?
 
 
 Thanks.
 
 Marton Papp
 
 
 
 
 



[il-antlr-interest: 28897] [antlr-interest] Input buffer instead of reading the whole file

2010-05-19 Thread Bob
Hi, a back-breaker question,

 

Is it possible under these circumstances to have the input file read in
blocks (say, 8kb) instead of reading the whole file into memory?

 

I'll be writing actions for every rule (not using Antlr's AST). Once the
actions are processed the input history is not used.

 

Reason: Some source files are 800mb - 1.4gb in size and reading the entire
thing into 32 bit address space doesn't leave much leftover.

 

If it's possible to limit the input buffer size, can you point me in the
right direction?

 

Thanks,

Bob
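
A minimal sketch of the block-wise reading idea in Java, assuming the actions never look back at consumed input. SlidingCharBuffer is a hypothetical class, not part of the ANTLR runtime; plugging it into a generated lexer would still require adapting it to the runtime's character-stream interface, which this sketch does not attempt.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class SlidingCharBuffer {
    static final int EOF = -1;

    private final Reader in;
    private final int blockSize;
    private char[] buf;
    private int start = 0;       // index of the next unconsumed char
    private int end = 0;         // one past the last buffered char
    private boolean eof = false;

    SlidingCharBuffer(Reader in, int blockSize) {
        this.in = in;
        this.blockSize = blockSize;
        this.buf = new char[blockSize];
    }

    // Look ahead i characters (i >= 1) without consuming; EOF past the end.
    int la(int i) {
        fill(i);
        return (start + i - 1 < end) ? buf[start + i - 1] : EOF;
    }

    // Discard the current character; its storage is reused on the next refill.
    void consume() {
        fill(1);
        if (start < end) start++;
    }

    // Buffer at least 'need' unconsumed chars if the input has that many.
    private void fill(int need) {
        try {
            while (end - start < need && !eof) {
                if (start > 0) { // slide unconsumed chars to the front
                    System.arraycopy(buf, start, buf, 0, end - start);
                    end -= start;
                    start = 0;
                }
                if (end == buf.length) { // lookahead exceeds buffer: grow by one block
                    buf = Arrays.copyOf(buf, buf.length + blockSize);
                }
                int n = in.read(buf, end, buf.length - end);
                if (n < 0) eof = true; else end += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a block size of 8192 the memory held is roughly one block plus the deepest lookahead requested, regardless of file size.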

 





[il-antlr-interest: 28898] [antlr-interest] Token lin lexer

2010-05-19 Thread Bob
I'm 1 day into Antlr and hope for an answer to this:

 

With an identifier rule (for example this one):

 

SIMPLE_IDENTIFIER : ( 'a'..'z'|'A'..'Z'|'_' ) (
'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')* ;

 

Is it possible, when the lexer recognizes the input stream to be a
SIMPLE_IDENTIFIER, to add some extra code that would

look-up the SIMPLE_IDENTIFIER and return possibly a different token? - Thus
directing the parser to different grammar rules.

 

Take this expression for example:

 

(  V(n1)/r1 + Func(arg1) )

 

where the semantics of V(n1) are more akin to n1->V rather than a function
call to V with arg n1.

I'd like to capture the V(n1) during parsing and make it an n1->V node
instead of a function call node.

 

Using flex this is easy: Once the identifier string is matched it can be
used in a lookup to determine the token type then fed to bison.

 

So, can ANTLR let me switch the token type at the lexical level before the
parser gets hold of it?

 

Hope this makes sense!
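
The flex-style lookup described above can be sketched in Java as follows. As far as I know, an ANTLR 3 lexer rule action may reassign the token type through `$type`, so a rule could end with something like `{ $type = lookup.typeFor($text); }`; treat that wiring as an assumption to verify. TokenTypeLookup, BRANCH_ACCESS and the numeric values are hypothetical names invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class TokenTypeLookup {
    // Hypothetical token type numbers; real ones come from the generated lexer.
    static final int SIMPLE_IDENTIFIER = 4;
    static final int BRANCH_ACCESS = 5;

    // Identifiers that should be re-typed before the parser sees them.
    private final Map<String, Integer> special = new HashMap<>();

    TokenTypeLookup() {
        special.put("V", BRANCH_ACCESS); // V(n1) is an access, not a function call
    }

    // Returns the special type for known names, else the default identifier type.
    int typeFor(String text) {
        Integer t = special.get(text);
        return t != null ? t : SIMPLE_IDENTIFIER;
    }
}
```

The parser can then offer one rule for BRANCH_ACCESS '(' ... ')' and another for ordinary function calls, since the two arrive as different token types.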

 





[il-antlr-interest: 28900] [antlr-interest] null pointer to ADAPTOR->setTokenBoundaries

2010-05-19 Thread Alan Condit
Help!!!

I am getting a null pointer to setTokenBoundaries in the following line of 
generated code. 
ADAPTOR->setTokenBoundaries(ADAPTOR, retval.tree, retval.start, retval.stop); 

The grammar works under Java.  In moving it back to 'C', I changed the language 
option to 'C', added option ASTLabelType=pANTLR3_BASE_TREE; and added the 
necessary includes to compile and link under Objective-C.

Is there anything obvious that I am doing wrong?

Thanks,
Alan
---

Alan Condit
1085 Tierra Ct.
Woodburn, OR 97071

Email -- acon...@ipns.com
Home-Office (503) 982-0906





[il-antlr-interest: 28901] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Terence Parr

On May 18, 2010, at 2:58 PM, Scott Stanchfield wrote:

 There are several advantages to enums:
 * there is a discrete set of values that can be used (no accidental
 42's passed in when 42 isn't a token type)
 * the enum value can carry extra information
 * the enum values can override methods differently

These are all excellent advantages. I believe that these mostly apply when 
you're writing code, not generating. Just like the compiler generates integers 
underneath, if antlr is generating integers, it's probably okay.

 OH - one of the things that's clouding this is that you really don't
 need the numeric type identifiers anymore. You can just have
 
   public enum TokenType {
     IDENT, INT ...;
   }
 
 then in your match method:
 
   void match(TokenType type) {
     if (LA(1).getType() == type) {
         ...
     }
   }

The only problem is that match() lives up in the superclass in the library but 
the generated parser needs to define the enum.

I also have the problem that I need to merge token types from multiple grammars 
for grammar imports. This gets more complicated with enum types, which have no 
inheritance.

 
 And you can use the types in a switch statement:
 
   switch(type) {
     case INT:
     case IDENT:
     ...
   }
 
 No more magic numbers! Woohoo!

ANTLR already uses the labels when possible, such as INT. If you use a literal 
in your grammar such as ';' and don't label it in the lexer, then I have no 
choice but to generate the integer token type or a weird label like TOKEN34.

Ter




[il-antlr-interest: 28902] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Scott Stanchfield
You can still define the match in the superclass -- just use an
interface like Edgar mentioned and I demonstrated in the
clarification note I sent.

I think the big value here would be that it forces every place that
uses the token types to use the enum names (as there are no integer
values). I think that would help debugging enormously (rather than
seeing '4' as the value in the variables window, you'd see 'IDENT').
-- Scott


Scott Stanchfield
http://javadude.com
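
A rough sketch of how the interface approach might look, offered as a guess since the clarification note itself is not reproduced here. The library superclass knows only the TokenType interface, while each generated parser supplies its own enum; BaseParser, FooTokens and FooParser are hypothetical names.

```java
// Library side: no knowledge of any particular grammar's tokens.
interface TokenType { }

abstract class BaseParser {
    // Supplied by the generated parser: the type of the next token.
    protected abstract TokenType lookahead();

    // Lives in the superclass, yet works with any enum implementing TokenType.
    boolean match(TokenType expected) {
        return lookahead() == expected; // enum constants are singletons, so == is safe
    }
}

// Generated side: the enum is defined per grammar.
enum FooTokens implements TokenType { IDENT, INT }

class FooParser extends BaseParser {
    private final TokenType next;

    FooParser(TokenType next) { this.next = next; }

    @Override
    protected TokenType lookahead() { return next; }
}
```

Because several enums can implement the same interface, imported grammars could each contribute their own enum without needing enum inheritance.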






[il-antlr-interest: 28903] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Terence Parr

On May 19, 2010, at 11:39 AM, Scott Stanchfield wrote:

 You can still define the match in the superclass -- just use an
 interface like Edgar mentioned and I demonstrated in the
 clarification note I sent.

oh right.

 I think the big value here would be that it forces every place that
 uses the token types to use the enum names (as there are no integer
 values). I think that would help debugging enormously (rather than
 seeing '4' as the value in the variables window, you'd see 'IDENT').

What about the ';' token? What's its label?
T




[il-antlr-interest: 28904] [antlr-interest] company looking for 2 ANTLR developers

2010-05-19 Thread Terence Parr
Hi, a recruiter in NYC has 2 positions to fill for a client.  full-time and 
paying anywhere from $100k to $120k. Contact info:

Hamilton Daza
Intrigue Systems, Inc.
7211 Austin Street #259
Forest Hills, NY 11375
800.809.0318 Main
917.699.3376 Mobile
718.841.7091 Fax
hdaza at intriguesys.com

Ter




[il-antlr-interest: 28905] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Scott Stanchfield
Hmmm... that's evil, ya know that ;)  Good to catch that now, though...

Probably LITERAL_1, LITERAL_2, etc. To make it easier for
debugging/printing/reporting you could add a pattern property (hmmm...
the more I think about it the more I like it... if there's a
description it could be printed w/ the error message, otherwise the
pattern. both could be useful for other purposes)

  public enum FooParserTokens implements TokenType {
    IDENT("('a'..'z')('a'..'z'|'A'..'Z')*", "an identifier ..."),
    LITERAL_1(";", null),
    LITERAL_2("+", null);
    private String pattern;
    private String description;
    private FooParserTokens(String pattern, String description) {
      this.pattern = pattern;
      this.description = description;
    }
  }

-- Scott


Scott Stanchfield
http://javadude.com






[il-antlr-interest: 28907] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Jim Idle
I also have doubts about the performance characteristics and the possibility of 
starting to rely on the target language to fill in gaps such as token numbering. 
We could get to the point where code generators cannot be built for more 
primitive languages because the schema relies on the language to automatically 
do things. 

The generated code should be as primitive as possible, with the runtime being 
as maintainable and clear as possible while not sacrificing performance.

Jim




[il-antlr-interest: 28908] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Scott Stanchfield
Interesting point re common code generation approaches, but as far as
performance goes, it's equivalent - all == tests are done using
pointers, which are the same size as ints. If switch is used the
ordinal values of the enums are used, and the java compiler may be
able to better optimize which switch bytecode is used b/c it knows the
exact possible range of values.

I'd much rather use enums where available, though. I'd think any code
generator could generate a simple int equivalent where enums don't
exist, though. The only gotcha would be if we had the
pattern/description properties, which would have to be represented as
separate arrays in most languages. They aren't necessary though (but
I'd love to have them)
-- Scott


Scott Stanchfield
http://javadude.com
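
To illustrate the "separate arrays" fallback for targets without enums, here is a small sketch written in Java for comparison with the enum version. IntTokenTypes and its display helper are hypothetical, and the token names and patterns are simply carried over from the earlier enum example.

```java
class IntTokenTypes {
    // Plain integer token types, usable in any target language.
    static final int IDENT = 0;
    static final int LITERAL_1 = 1;
    static final int LITERAL_2 = 2;

    // Parallel arrays indexed by token type replace the enum's properties.
    static final String[] NAMES = { "IDENT", "LITERAL_1", "LITERAL_2" };
    static final String[] PATTERNS = { "('a'..'z')('a'..'z'|'A'..'Z')*", ";", "+" };
    static final String[] DESCRIPTIONS = { "an identifier", null, null };

    // Text for error messages: the description if present, otherwise the pattern.
    static String display(int type) {
        String d = DESCRIPTIONS[type];
        return d != null ? d : PATTERNS[type];
    }
}
```

The cost relative to an enum is that nothing stops a caller from passing an integer outside the array bounds, which is the accidental-42 problem raised at the start of the thread.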






[il-antlr-interest: 28909] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Kirby Bohling
On Wed, May 19, 2010 at 2:13 PM, Scott Stanchfield sc...@javadude.com wrote:
 Interesting point re common code generation approaches, but as far as
 performance goes, it's equivalent - all == tests are done using
 pointers, which are the same size as ints. If switch is used the
 ordinal values of the enums are used, and the java compiler may be
 able to better optimize which switch bytecode is used b/c it knows the
 exact possible range of values.

That's true of most full scale JVMs with good JIT, but for many
embedded VM's that isn't true.  See the Dalvik VM for Android.

This link for instance:
http://developer.android.com/guide/practices/design/performance.html#avoid_enums

I believe it is becoming less true as time goes along, but from what I
know right now it is true.

If you can't support generating both, I'd agree with Jim Idle: support
the one that will go everywhere.  If, however, you could treat it like
the C target does with using switch vs. if/else, I'd think that'd be
nifty.  Doubly so because the maintenance burden is free when somebody
else is doing the work.  As this affects the external API, I would
assume that it's a non-option to generate one or the other.



 I'd much rather use enums where available, though. I'd think any code
 generator could generate a simple int equivalent where enums don't
 exist, though. The only gotcha would be if we had the
 pattern/description properties, which would have to be represented as
 separate arrays in most languages. They aren't necessary though (but
 I'd love to have them)
 -- Scott

 
 Scott Stanchfield
 http://javadude.com



 On Wed, May 19, 2010 at 3:04 PM, Jim Idle j...@temporal-wave.com wrote:
 I also have doubts about the performance characteristics and the possibility 
 of starting to rely on the target language to fill in gaps such as token 
 numbering - we could get to the point where code generators cannot be built 
 for more primitive languages because the schema is relying on the language to 
 automatically do things.

 The generated code should be as primitive as possible, with the runtime 
 being as maintainable and clear as possible while not sacrificing 
 performance.

 Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Terence Parr
 Sent: Wednesday, May 19, 2010 11:35 AM
 To: antlr-interest interest
 Subject: Re: [antlr-interest] enums in v4 ANTLR Java code generation
 considered useless


 On May 18, 2010, at 2:58 PM, Scott Stanchfield wrote:

  There are several advantages to enums:
  * there is a discrete set of values that can be used (no accidental
  42's passed in when 42 isn't a token type)
  * the enum value can carry extra information
  * the enum values can override methods differently

 These are all excellent advantages. I believe that these mostly apply
 when you're writing code, not generating. Just like the compiler
 generates integers underneath, if antlr is generating integers, it's
 probably okay.

  OH - one of the things that's clouding this is that you really don't
  need the numeric type identifiers anymore. You can just have
 
   public enum TokenType {
     IDENT, INT ...;
   }
 
  then in your match method:
 
   void match(TokenType type) {
     if (LA(1).getType() == type) {
         ...
     }
   }

 The only problem is that match() lives up in the superclass in the
 library but the generated parser needs to define the enum.

 I also have the problem that I need to merge token types from multiple
 grammars for grammar imports. This gets more complicated with enum
 types without inheritance.

 
  And you can use the types in a switch statement:
 
   switch(type) {
     case INT:
     case IDENT:
     ...
   }
 
  No more magic numbers! Woohoo!

 ANTLR already uses the labels when possible, such as INT. If you use a
 literal in your grammar such as ';' and don't label it in the lexer,
 then I have no choice but to generate the integer token type or a weird
 label like TOKEN34.

 Ter
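
One possible way around the "match() lives in the superclass but the generated parser defines the enum" problem is to make the library class generic over the enum type. The sketch below assumes that approach; BaseParserSketch and DeclParserSketch are hypothetical stand-ins, not ANTLR runtime classes, and the sketch does nothing for the separate problem of merging token types across imported grammars.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the runtime library's parser superclass.
abstract class BaseParserSketch<T extends Enum<T>> {
    private final Iterator<T> input;
    private T lookahead;

    BaseParserSketch(List<T> tokens) {
        this.input = tokens.iterator();
        advance();
    }

    private void advance() {
        lookahead = input.hasNext() ? input.next() : null;
    }

    protected T lookahead() {
        return lookahead;
    }

    // Lives in the library, yet is typed against whatever enum the subclass supplies.
    protected void match(T expected) {
        if (lookahead != expected) {
            throw new IllegalStateException("expected " + expected + " but found " + lookahead);
        }
        advance();
    }
}

// Hypothetical stand-in for a generated parser that defines its own token enum.
class DeclParserSketch extends BaseParserSketch<DeclParserSketch.TokenType> {
    enum TokenType { IDENT, INT, SEMI }

    DeclParserSketch(List<TokenType> tokens) {
        super(tokens);
    }

    // decl : INT IDENT SEMI ;
    boolean decl() {
        match(TokenType.INT);
        match(TokenType.IDENT);
        match(TokenType.SEMI);
        return lookahead() == null; // true when all input was consumed
    }

    public static void main(String[] args) {
        List<TokenType> tokens = Arrays.asList(TokenType.INT, TokenType.IDENT, TokenType.SEMI);
        System.out.println(new DeclParserSketch(tokens).decl());
    }
}
```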


[il-antlr-interest: 28911] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Scott Stanchfield
Don't pre-optimize for things like this. Profile, then optimize. This
won't even show up as an issue.

I think whoever wrote that page was daydreaming about any minor way
performance might be increased - note that they don't talk at all on
that page about the big performance issues (I/O, networking, etc),
though I do like that they talk about limiting object creation.

With the example they show on that android dev page, you'll never
see/feel the difference. And their example on grabbing the ordinal
value so you don't need to lookup a static field is really silly. If
they just want to avoid looking up the static field every time through
the loop, don't do:

    int valX = MyEnum.VAL_X.ordinal();
    int valY = MyEnum.VAL_Y.ordinal();
    int count = list.size();
    MyItem[] items = list.items();
    for (int n = 0; n < count; n++) {
        int valItem = items[n].e.ordinal();
        if (valItem == valX)
            // do stuff 1
        else if (valItem == valY)
            // do stuff 2
    }

instead do

    MyEnum valX = MyEnum.VAL_X;
    MyEnum valY = MyEnum.VAL_Y;
    int count = list.size();
    MyItem[] items = list.items();
    for (int n = 0; n < count; n++) {
        MyEnum valItem = items[n].e;
        if (valItem == valX)
            // do stuff 1
        else if (valItem == valY)
            // do stuff 2
    }

Stuff like that makes me think whoever wrote that really didn't think
it through all the way. The pointer comparison is the same expense as
the int comparison and avoids n+2 calls to ordinal() in their example
code.

Moreover, the suggestion to use constants that the compiler will inline
is truly evil. Compiler constant inlining can very easily lead to
incorrect constant values when a library (that provides a constant)
changes (new jar dropped in with a new value for the constant) but the
code using that library isn't recompiled. Safety issue.
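
The inlining hazard comes from how Java treats compile-time constants: a static final field initialized with a constant expression is copied into the classes that read it. A small sketch of the distinction follows; the class and field names are made up for illustration, and the stale-value effect itself only shows up when the library and its caller are compiled separately, which a single file cannot demonstrate.

```java
class LibraryConstantsSketch {
    // Compile-time constant: javac copies the literal 3 into every class that
    // reads this field, so callers keep the old value until they are recompiled.
    static final int INLINED_VERSION = 3;

    // Not a compile-time constant, because the initializer is a method call.
    // Callers read the field at run time and see the value from the loaded jar.
    static final int RUNTIME_VERSION = Integer.parseInt("3");

    public static void main(String[] args) {
        System.out.println(INLINED_VERSION + " " + RUNTIME_VERSION);
    }
}
```

Enum values fall in the second group: they are ordinary object references resolved at run time, so dropping in a new jar cannot leave a caller holding a stale number.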

If this becomes an issue (which I doubt it will), someone can always
extend the code generator to tweak it.
-- Scott


Scott Stanchfield
http://javadude.com



On Wed, May 19, 2010 at 3:59 PM, Kirby Bohling kirby.bohl...@gmail.com wrote:
 On Wed, May 19, 2010 at 2:13 PM, Scott Stanchfield sc...@javadude.com wrote:
 Interesting point re common code generation approaches, but as far as
 performance goes, it's equivalent - all == tests are done using
 pointers, which are the same size as ints. If switch is used the
 ordinal values of the enums are used, and the java compiler may be
 able to better optimize which switch bytecode is used b/c it knows the
 exact possible range of values.

 That's true of most full scale JVMs with good JIT, but for many
 embedded VM's that isn't true.  See the Dalvik VM for Android.

 This link for instance:
 http://developer.android.com/guide/practices/design/performance.html#avoid_enums

 I believe it is becoming less true as time goes along, but from what I
 know right now it is true.

 If you can't support generating both, I'd agree with Jim Idle support
 the one that will go everywhere.  If however you could treat it like
 the C target does with using switch vs. if/else, I'd think that'd be
 nifty.  Doubly so because maintenance burden is free when somebody
 else is doing the work.  As this affects the external API, I would
 assume that it's a non-option to generate one or the other.



 I'd much rather use enums where available, though. I'd think any code
 generator could generate a simple int equivalent where enums don't
 exist, though. The only gotcha would be if we had the
 pattern/description properties, which would have to be represented as
 separate arrays in most languages. They aren't necessary though (but
 I'd love to have them)
 -- Scott

 
 Scott Stanchfield
 http://javadude.com



 On Wed, May 19, 2010 at 3:04 PM, Jim Idle j...@temporal-wave.com wrote:
 I also have doubts about the performance characteristics and the 
 possibility of starting to rely on the target language to fill in gaps such 
 as token numbering - we could get to the point where code generators cannot 
 be built for more primitive languages because the schema is relying on the 
 language to automatically do things.

 The generated code should be as primitive as possible, with the runtime 
 being as maintainable and clear as possible while not sacrificing 
 performance.

 Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Terence Parr
 Sent: Wednesday, May 19, 2010 11:35 AM
 To: antlr-interest interest
 Subject: Re: [antlr-interest] enums in v4 ANTLR Java code generation
 considered useless


 On May 18, 2010, at 2:58 PM, Scott Stanchfield wrote:

  There are several advantages to enums:
  * there is a discrete set of values that can be used (no accidental
  42's passed in when 42 isn't a token type)
  * the enum value can carry extra information
  * the enum values can override 

[il-antlr-interest: 28912] Re: [antlr-interest] null pointer to ADAPTOR->setTokenBoundaries

2010-05-19 Thread Alan Condit
Jim,

Here is what I have set in options:
options {
backtrack   =   true;
memoize =   true;
language=   C;
output  =   AST;
ASTLabelType=   pANTLR3_BASE_TREE;
}

The null is inside 'ctx' inside 'adaptor' at 'setTokenBoundaries'.

It is inside a function 
/** 
* $ANTLR start line
* /Users/acondit/source/GCCnv/LatheBranch/trunk/Parser/RS274ngc.g:184:1: line : 
( ( line_number )? ( segment )+ K_NEWLINE -> ^( STMT ( segment )+ ) | ( 
line_number )? K_NEWLINE -> | oword_stmt -> ^( STMT oword_stmt ) );
*/
static RS274ngcParser_line_return
line(pRS274ngcParser ctx)
{
...
}

which I assume, based on the comment, is generated from this rule:
line:   line_number? segment+ K_NEWLINE
            -> ^(STMT segment+)
    |   line_number? K_NEWLINE
            ->
    |   oword_stmt
            -> ^(STMT oword_stmt)
    ;

The grammar is for parsing an existing language not one of my invention, and 
grammatically the newlines delineate a semantic block therefore must be known 
by the parser, but empty lines are discarded and therefore should not be in the 
tree.

Alan
---

Alan's MachineWorks
1085 Tierra Ct.
Woodburn, OR 97071

Email -- acon...@alansmachineworks.com
www.alansmachineworks.com

Jim wrote--
Please post more information about your grammar, what the null pointer is, etc. 
It is hard to interpolate, but the common mistake is not adding output=AST; to 
the options, so you do not get a tree adaptor created.

Jim
 -Original Message-
 From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-
 bounces at antlr.org
] On Behalf Of Alan Condit

 Sent: Wednesday, May 19, 2010 11:25 AM
 To: antlr-interest at antlr.org
 Subject: [antlr-interest] null pointer to ADAPTOR-setTokenBoundaries
 
 Help!!!
 
 I am getting a null pointer to setTokenBoundaries in the following line
 of generated code.
 ADAPTOR->setTokenBoundaries(ADAPTOR, retval.tree, retval.start,
 retval.stop);
 
 The grammar works under Java.  In moving it back to 'C', I changed the
 language option to 'C', added option ASTLabelType=pANTLR3_BASE_TREE;
 and added the necessary includes to compile and link under Objective-C.
 
 Is there anything obvious that I am doing wrong?
 
 Thanks,
 Alan
 





[il-antlr-interest: 28914] [antlr-interest] Referencing attributes

2010-05-19 Thread Junkman
Greetings,

I'm an Antlr noob, and have a question regarding accessing attributes.

Where, outside of an action, can you reference attributes?  One place seems
to be as a parameter to a rule invocation, like this:

decl: type declarator[ $type.text ] ';' ;
 
This is from The Definitive Antlr Reference,  page 119.

Is that true in general?  Are there other locations outside of actions
where attributes can be accessed?

As noted, I am a noob to Antlr and just joined this list.  Please let me
know if this email's question/topic is not appropriate to the list.

Thanks.





[il-antlr-interest: 28915] Re: [antlr-interest] null pointer to ADAPTOR->setTokenBoundaries

2010-05-19 Thread John B. Brodie
Pardon me for butting in.

And I have never used the C code generator, but.

On Wed, 2010-05-19 at 14:06 -0700, Alan Condit wrote:

 which I assume, based on the comment, is generated from this rule:
 line:   line_number? segment+ K_NEWLINE
             -> ^(STMT segment+)
     |   line_number? K_NEWLINE
             ->
     |   oword_stmt
             -> ^(STMT oword_stmt)
     ;
 
 The grammar is for parsing an existing language not one of my invention,
 and grammatically the newlines delineate a semantic block therefore must
 be known by the parser, but empty lines are discarded and therefore
 should not be in the tree.

having an empty RHS of the -> rewrite operator feels, well, unusual.

i am not sure that ANTLR permits a rule which produces no tree when
output=AST is present

Maybe try (untested):

line : line_number? ( segment+ -> ^(STMT segment+) )? K_NEWLINE
     | oword_stmt -> ^(STMT oword_stmt)
 ;

but i do not know what would happen when no segment is present for the
above rule

have you considered building a dummy tree node for the empty case and
then your tree walker can just ignore it?

not sure that i have really helped any, sorry.
   -jbb





[il-antlr-interest: 28916] Re: [antlr-interest] Question about building code generation target

2010-05-19 Thread Naveen
On Jan 16 2009, 4:51 pm, Jim Idle j...@temporal-wave.com wrote:
 When you change your template or codegen target java file, you just type:
 mvn
 And it rebuilds just what has changed in a second or two (depends on your 
 machine speed of course).
On my slow machine, this takes 33 seconds after changing 1 template
file.
However, once it's built, I can unjar to /path/to/antlr_unjarred
export CLASSPATH=/path/to/antlr_unjarred:$CLASSPATH
and edit the templates without having to rebuild anything.

by the way, are there plans to integrate the build of the other
runtimes into maven ?



[il-antlr-interest: 28918] Re: [antlr-interest] null pointer to ADAPTOR->setTokenBoundaries

2010-05-19 Thread Alan Condit
On page 164 of The Definitive Antlr Reference, under the heading "Omitting 
Input Elements", Terence shows using an empty rewrite rule to allow omitting 
unneeded symbols from the output AST tree.

This does not say that it could not be causing a problem with the generated 'C' 
code.

Jim, is there a possibility that this is a problem?

Alan
---

Alan Condit
1085 Tierra Ct.
Woodburn, OR 97071

Email -- acon...@ipns.com
Home-Office (503) 982-0906

On May 19, 2010, at 3:36 PM, John B. Brodie wrote:

 Pardon me for butting in.
 
 And I have never used the C code generator, but.
 
 On Wed, 2010-05-19 at 14:06 -0700, Alan Condit wrote:
 
 which I assume, based on the comment, is generated from this rule:
 line:   line_number? segment+ K_NEWLINE
             -> ^(STMT segment+)
     |   line_number? K_NEWLINE
             ->
     |   oword_stmt
             -> ^(STMT oword_stmt)
     ;
 
 The grammar is for parsing an existing language not one of my invention,
 and grammatically the newlines delineate a semantic block therefore must
 be known by the parser, but empty lines are discarded and therefore
 should not be in the tree.
 
 having an empty RHS of the -> rewrite operator feels, well, unusual.
 
 i am not sure that ANTLR permits a rule which produces no tree when
 output=AST is present
 
 Maybe try (untested):
 
 line : line_number? ( segment+ -> ^(STMT segment+) )? K_NEWLINE
      | oword_stmt -> ^(STMT oword_stmt)
 ;
 
 but i do not know what would happen when no segment is present for the
 above rule
 
 have you considered building a dummy tree node for the empty case and
 then your tree walker can just ignore it?
 
 not sure that i have really helped any, sorry.
   -jbb
 
 




[il-antlr-interest: 28919] Re: [antlr-interest] null pointer to ADAPTOR->setTokenBoundaries

2010-05-19 Thread Jim Idle
I think you will have to put those three productions in separate rules, but I 
will look into it more.

Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Alan Condit
 Sent: Wednesday, May 19, 2010 2:06 PM
 To: antlr-interest@antlr.org
 Subject: Re: [antlr-interest] null pointer to ADAPTOR-
 setTokenBoundaries
 
 Jim,
 
 Here is what I have set in options:
 options {
   backtrack   =   true;
   memoize =   true;
   language=   C;
   output  =   AST;
   ASTLabelType=   pANTLR3_BASE_TREE;
   }
 
 The null is inside 'ctx' inside 'adaptor' at 'setTokenBoundaries'.
 
 It is inside a function
 /**
 * $ANTLR start line
 *
 /Users/acondit/source/GCCnv/LatheBranch/trunk/Parser/RS274ngc.g:184:1:
 line : ( ( line_number )? ( segment )+ K_NEWLINE -> ^( STMT ( segment
 )+ ) | ( line_number )? K_NEWLINE -> | oword_stmt -> ^( STMT oword_stmt
 ) );
 */
 static RS274ngcParser_line_return
 line(pRS274ngcParser ctx)
 {
 ...
 }
 
 which I assume, based on the comment, is generated from this rule:
 line:   line_number? segment+ K_NEWLINE
             -> ^(STMT segment+)
     |   line_number? K_NEWLINE
             ->
     |   oword_stmt
             -> ^(STMT oword_stmt)
     ;
 
 The grammar is for parsing an existing language not one of my
 invention, and grammatically the newlines delineate a semantic block
 therefore must be known by the parser, but empty lines are discarded
 and therefore should not be in the tree.
 
 Alan
 ---
 
 Alan's MachineWorks
 1085 Tierra Ct.
 Woodburn, OR 97071
 
 Email -- acon...@alansmachineworks.com
 www.alansmachineworks.com
 
 Jim wrote--
 Please post more information about your grammar, what the null pointer
 is, etc. It is hard to interpolate, but the common mistake is not
 adding output=AST; to the options, so you do not get a tree adaptor
 created.
 
 Jim
  -Original Message-
  From: antlr-interest-bounces at antlr.org
 [mailto:antlr-interest-
  bounces at antlr.org
 ] On Behalf Of Alan Condit
 
  Sent: Wednesday, May 19, 2010 11:25 AM
  To: antlr-interest at antlr.org
  Subject: [antlr-interest] null pointer to ADAPTOR-setTokenBoundaries
 
  Help!!!
 
  I am getting a null pointer to setTokenBoundaries in the following
 line
  of generated code.
  ADAPTOR->setTokenBoundaries(ADAPTOR, retval.tree, retval.start,
  retval.stop);
 
  The grammar works under Java.  In moving it back to 'C', I changed
 the
  language option to 'C', added option ASTLabelType=pANTLR3_BASE_TREE;
  and added the necessary includes to compile and link under Objective-
 C.
 
  Is there anything obvious that I am doing wrong?
 
  Thanks,
  Alan
 
 
 
 


[il-antlr-interest: 28920] Re: [antlr-interest] C target - initialization of return/scope structures

2010-05-19 Thread Jim Idle
Why would you try to use a return value that you have not set? If it is set to 
NULL then you will core dump unless you check for NULL so it would not help 
you. The values are not initialized because I don't know what they are, they 
might be object references or something that cannot be set to NULL. I changed 
from assuming a nullable target because everyone complained ;-)

But I assure you that you can initialize all your values in the @init{} 
section. Where is it that you are having problems? I think that your question 
might not be the one you are asking.

Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Cristian Târşoagă
 Sent: Wednesday, May 19, 2010 2:08 PM
 To: antlr-interest@antlr.org
 Subject: [antlr-interest] C target - initialization of return/scope
 structures
 
 Hi All,
 
 My name is Chris, I started to use antlr and I like it a lot!
 I use C++ and I have successfully used it to generate some sourcecode.
 
 I need to use C++: I want std::string, std::vector and more things like
 this. But since I use the C target, it didn't take long to run into
 some quirks.
 
 One of the problems I had/have is this: structures used for return
 values and those used for scope values are NOT initialized.
 
 Since I tried to use a std::string as a scoped value, I quickly got a
 nice crash, since my string was created using malloc.
 
 These are (well) known problems, I know that. I found some posts from
 other guys having the same problems.
 I also found some recommendations on how to avoid initialization
 problems. E.g.:
 http://www.mail-archive.com/il-antlr-inter...@googlegroups.com/msg02614.html
 
 The hint there was to use pointers, and:
 
 1. define ANTLR3_MALLOC / ANTLR3_FREE to override antlr's allocators
 
 or
 
 2. manually allocate/deallocate those pointers, probably inside @init
 and @after
 
 
 I'd like to have a clean solution to this, but I can't see how either
 of these two options can properly work.
 
 
 Option 1: I can't override antlr's allocator as suggested
 #define ANTLR3_MALLOC(request) new request()
 because ANTLR3_MALLOC is actually called with an argument which is
 the SIZE of the type that will be allocated and not the TYPE itself.
 I think a simple change inside antlr can fix this, but until then I
 tried the other way...
 
 
 Option 2: I can't use @init and @after because this will create memory
 leaks.
 Imagine that I have a scoped value x. I would do @init {x = new X();}
 and @after{delete x;}
 When the rule is fully matched, this works perfectly.
 But when the parser fails, the code that pops the scoped value from the
 stack is called (and my piece of code inside @after is skipped), so I
 will get a memory leak!!
 I noticed that the scoped values also have a free function pointer
 inside (member) that can take care of deallocation in those situations,
 but I couldn't find a way to set it. (?)
 
 
 So:
 - my suggestion: change the ANTLR3_MALLOC macro (change the name to
 ANTLR_ALLOC and change the impl to take the type itself as its arg, so
 that a C++ impl could override it with 'new')
 - my suggestion: generate a properly initialized structure (I know,
 it's C code, but still... once you have such a smart StringTemplate lib,
 this shouldn't be a problem)
 - my question: what would be a clean way to allocate/deallocate
 pointers (without leaks)?
 
 
 THANKS a lot for ANTLR and for your help!
 
Chris
 
 
 PS: I have some other problems too with the C target: I wasn't able to
 use composite grammars with C++. I will get back on this later :-)
 


[il-antlr-interest: 28921] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Jim Idle
I suspect that your benchmark runs afoul of clock granularity issues for the 
JIT. If you run it a few times you will likely get different results. Also you 
say 10% better for enums but look at your results again. 

Take the client JIT, your first run gives:

Enum Time: 25707993
Int Time : 28520406

So enum is slightly better, but your second run gives:

Enum Time: 34060167
Int Time : 24820249


And Int time in this run is superior to your enum time by a far greater margin 
than the reverse in the first run. Your server shows a similar disparity. You 
have to run for much longer times and repeat many times, then average out 
because the JIT does not always make the same decision. 

Unless there is something about your print outs that I am missing?

Finally, I would not trust 64 bit openjdk as far as I can throw my house :-)

Finally, finally, you really need to look at switch() performance, as ANTLR 
will use switches (and already does if you set the -X options to the same 
values as I use in the C generator). There tend to be a fair number of switch 
cases, with some further embedded switches. 

The C optimizer will murder those but the Java JIT has some opportunity to 
reorder the case at runtime and theoretically it could do better than the C 
compiler for some use cases. It rarely does though because of other overheads 
and the fact that most real world applications don't exhibit a polarization to 
one or two oft used cases out of many. You can see that ANTLR generated code 
would only do this if out of many alts, just one or two were taken a lot (which 
would depend on the language being parsed).
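
A rough sketch of the "run longer, repeat, then average" advice, reusing the Parity example from the quoted benchmark. The class name, loop count and round counts are arbitrary assumptions, and this is still a naive hand-rolled timer, so treat any numbers it prints as indicative at best.

```java
class EnumVsIntBench {
    enum Parity { EVEN, ODD }

    static final int ITERATIONS = 1_000_000; // arbitrary loop count per timed run
    static final int WARMUP_ROUNDS = 5;      // untimed rounds so the JIT can compile first
    static final int MEASURED_ROUNDS = 20;   // timed rounds that get averaged

    static int sink; // results are stored here so the loops are harder to optimize away

    static long timeInts() {
        int val = 0;
        int flips = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            if (val == 0) {
                val = 1;
                flips++;
            } else if (val == 1) {
                val = 0;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink += flips;
        return elapsed;
    }

    static long timeEnums() {
        Parity val = Parity.EVEN;
        int flips = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            if (val == Parity.EVEN) {
                val = Parity.ODD;
                flips++;
            } else if (val == Parity.ODD) {
                val = Parity.EVEN;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink += flips;
        return elapsed;
    }

    // Averages several timed runs instead of trusting a single measurement.
    static double averageNanos(boolean useEnums) {
        long total = 0;
        for (int r = 0; r < MEASURED_ROUNDS; r++) {
            total += useEnums ? timeEnums() : timeInts();
        }
        return (double) total / MEASURED_ROUNDS;
    }

    public static void main(String[] args) {
        for (int r = 0; r < WARMUP_ROUNDS; r++) {
            timeEnums();
            timeInts();
        }
        System.out.println("Enum avg ns: " + averageNanos(true));
        System.out.println("Int avg ns : " + averageNanos(false));
    }
}
```

Running it under -Xint, -client and -server, as in the quoted message, would show whether the ordering between the two styles is stable or just noise.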

Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Kirby Bohling
 Sent: Wednesday, May 19, 2010 4:29 PM
 To: Scott Stanchfield
 Cc: antlr-interest interest
 Subject: Re: [antlr-interest] enums in v4 ANTLR Java code generation
 considered useless
 
 On topic, I think the only important decision to make is from an API
 perspective: while one can go tweak the generator, going from ints
 to enums would change the API.  I'd suggest just deciding which one
 you want to support.  Enums are definitely nicer from that
 perspective.  Given the below performance benchmarks, and just how
 much of ANTLR's output is really just a series of if/else or switch
 blocks buried inside of a huge number of loops, I actually do think
 you'd spot the difference.
 
 Moving well off-topic, but since you said to, I did just what you
 suggested:
 
 Using my personal laptop running Fedora 11 using x86_64 for the kernel
 and JVM:
 $ java -version
 java version "1.6.0_18"
 OpenJDK Runtime Environment (IcedTea6 1.8) (fedora-35.b18.fc11-x86_64)
 OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
 
 Both CPU's are Intel(R) Core(TM)2 Duo CPU P8600  @ 2.40GHz w/ 3MB
 cache.
 
 These aren't spectacular benchmarks from an accuracy perspective, but
 illustrate that assuming ints and enums have identical performance
 characteristics in all cases is an invalid assumption:
 
 Using java -Xint Foo:
 Enum Time: 516121334
 Int Time : 424748884
 Enum Time: 514078841
 Int Time : 423574161
 
 ~21% performance hit to use enums with HotSpot disabled (similar to
 the Dalvik VM, because it has minimal JIT as of right now, which I'm
 guessing is why the original article suggested you stay away from them
 near performance-critical areas).
 
 Using: java -client Foo
 Enum Time: 25707993
 Int Time : 28520406
 Enum Time: 34060167
 Int Time : 24820249
 
 ~10% speed up for using enums.
 
 Using: java -server Foo
 Enum Time: 25543589
 Int Time : 28637110
 Enum Time: 32887612
 Int Time : 28968574
 
 Again ~10% speed up for using enums.
 
 So there might actually be a reason to support enums internally from
 a speed/performance perspective if the non-JIT case is considered
 negligible.  I thought they'd match your claim in this case.  Didn't
 have any reason to actually think enums would be faster than ints.
 
 -- Sample code:
 
 public class Foo {
 
     private static long MAX = 1000;
 
     public static void main(String[] args) {
         doEnums();
         doInts();
         doEnums();
         doInts();
     }
 
     public static void doInts() {
         int val = 0;
         long start = System.nanoTime();
         for (long iii = 0; iii < MAX; ++iii) {
             if (0 == val) {
                 val = 1;
             } else if (1 == val) {
                 val = 0;
             }
         }
         long end = System.nanoTime();
         System.out.println("Int Time : " + (end - start));
     }
 
     enum Parity { EVEN, ODD };
     public static void doEnums() {
         Parity val = Parity.EVEN;
         long start = System.nanoTime();
         for (long iii = 0; iii < MAX; ++iii) {
             if (Parity.EVEN == val) {
                 val = Parity.ODD;
             } else if (Parity.ODD == val) {
                 val = Parity.EVEN;
             }
         }
         long end = 

[il-antlr-interest: 28922] Re: [antlr-interest] null pointer to ADAPTOR->setTokenBoundaries

2010-05-19 Thread Jim Idle
Possibly, though I suspect your easy work around is to make each alt a subrule. 
I will look tomorrow.

Jim

 -Original Message-
 From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
 boun...@antlr.org] On Behalf Of Alan Condit
 Sent: Wednesday, May 19, 2010 5:01 PM
 To: antlr-interest@antlr.org
 Subject: Re: [antlr-interest] null pointer to ADAPTOR-
 setTokenBoundaries
 
 On page 164 of The Definitive Antlr Reference, under the heading
 "Omitting Input Elements", Terence shows using an empty rewrite rule to
 allow omitting unneeded symbols from the output AST tree.
 
 This does not say that it could not be causing a problem with the
 generated 'C' code.
 
 Jim, is there a possibility that this is a problem?
 
 Alan
 ---
 
 Alan Condit
 1085 Tierra Ct.
 Woodburn, OR 97071
 
 Email -- acon...@ipns.com
 Home-Office (503) 982-0906
 
 On May 19, 2010, at 3:36 PM, John B. Brodie wrote:
 
  Pardon me for butting in.
 
  And I have never used the C code generator, but.
 
  On Wed, 2010-05-19 at 14:06 -0700, Alan Condit wrote:
 
  which I assume, based on the comment, is generated from this rule:
  line:   line_number? segment+ K_NEWLINE
              -> ^(STMT segment+)
      |   line_number? K_NEWLINE
              ->
      |   oword_stmt
              -> ^(STMT oword_stmt)
      ;
 
  The grammar is for parsing an existing language not one of my
 invention,
  and grammatically the newlines delineate a semantic block therefore
 must
  be known by the parser, but empty lines are discarded and therefore
  should not be in the tree.
 
  having an empty RHS of the -> rewrite operator feels, well, unusual.
 
  i am not sure that ANTLR permits a rule which produces no tree when
  output=AST is present
 
  Maybe try (untested):
 
  line : line_number? ( segment+ -> ^(STMT segment+) )? K_NEWLINE
  | oword_stmt -> ^(STMT oword_stmt)
  ;
 
  but i do not know what would happen when no segment is present for
 the
  above rule
 
  have you considered building a dummy tree node for the empty case and
  then your tree walker can just ignore it?
 
  not sure that i have really helped any, sorry.
-jbb
 
 
 
 







[il-antlr-interest: 28923] Re: [antlr-interest] enums in v4 ANTLR Java code generation considered useless

2010-05-19 Thread Scott Stanchfield
I just ran that code with it looping through doEnums/doInts 1000
times. The difference was ~5% for -client and -Xbatch, and ~10% for
-server. (I tried -Xint and it took way too long.) In every case the
enum version took longer, which sounds reasonable (as there are static
field lookups being done).
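
For anyone who wants to repeat this, here is a minimal sketch of the kind of repeated-run harness described above. It is not the exact code that was run: the class name RepeatBench, the method names, the `sink` variable, and the iteration count are all assumptions made for illustration (the MAX value from the original sample is not something I would rely on here). Summing over many rounds reduces the influence of JIT warm-up on any single measurement.

```java
public class RepeatBench {

    enum Parity { EVEN, ODD }

    // Hypothetical iteration count per round; pick whatever suits your machine.
    static final long MAX = 1_000_000L;

    // Toggles an int between 0 and 1 and returns the final value.
    static int toggleInts(long iterations) {
        int val = 0;
        for (long i = 0; i < iterations; ++i) {
            val = (val == 0) ? 1 : 0;
        }
        return val;
    }

    // Toggles an enum between EVEN and ODD and returns the final value.
    static Parity toggleEnums(long iterations) {
        Parity val = Parity.EVEN;
        for (long i = 0; i < iterations; ++i) {
            val = (val == Parity.EVEN) ? Parity.ODD : Parity.EVEN;
        }
        return val;
    }

    // Runs both variants `rounds` times and returns
    // { total enum nanoseconds, total int nanoseconds }.
    static long[] measure(int rounds, long iterations) {
        long enumNanos = 0;
        long intNanos = 0;
        int sink = 0; // consumes results so the loops are less likely to be optimized away
        for (int r = 0; r < rounds; ++r) {
            long t0 = System.nanoTime();
            sink += toggleEnums(iterations).ordinal();
            long t1 = System.nanoTime();
            sink += toggleInts(iterations);
            long t2 = System.nanoTime();
            enumNanos += t1 - t0;
            intNanos += t2 - t1;
        }
        if (sink < 0) {
            System.out.println(sink); // never true; only keeps `sink` live
        }
        return new long[] { enumNanos, intNanos };
    }

    public static void main(String[] args) {
        long[] totals = measure(1000, MAX);
        System.out.println("Enum total: " + totals[0]);
        System.out.println("Int total : " + totals[1]);
        System.out.printf("Enum/Int ratio: %.3f%n", (double) totals[0] / totals[1]);
    }
}
```

Run it under each JVM mode (for example java -client RepeatBench and java -server RepeatBench) and compare the ratios rather than the absolute numbers. It is still a micro-benchmark, so treat the results as indicative only.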

My main point here is that while we're seeing 5-10% or so differences,
that's 5-10% difference in part of the program that goes incredibly
fast (so a 5-10% hit is unnoticeable), whereas a 5-10% hit in I/O
could be a very big deal.

We're measuring a performance difference of millions of calls. In a
typical parse, you may have a few thousand tokens, each of which may
be tested a few dozen times.
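
To put rough numbers on that, here is a back-of-envelope sketch. Every input is an assumption chosen only for illustration: 5,000 tokens, 36 tests per token, 1 nanosecond per test, and a 10% overhead. None of these is a measurement from this thread, and the class and method names are made up.

```java
public class ParseCostEstimate {

    // Extra nanoseconds per parse caused by a relative overhead on each token test.
    static double extraNanos(long tokens, long testsPerToken,
                             double nanosPerTest, double overhead) {
        return tokens * testsPerToken * nanosPerTest * overhead;
    }

    public static void main(String[] args) {
        // Assumed inputs: 5,000 tokens, 36 tests each, 1 ns per test, 10% overhead.
        double extra = extraNanos(5_000, 36, 1.0, 0.10);
        System.out.printf("Extra time per parse: %.1f microseconds%n", extra / 1_000.0);
    }
}
```

With these assumed inputs the estimate comes to about 18 microseconds per parse, which is far below anything a user would notice and tiny next to the cost of reading the input file.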

-- Scott


Scott Stanchfield
http://javadude.com



On Wed, May 19, 2010 at 7:29 PM, Kirby Bohling kirby.bohl...@gmail.com wrote:
 On topic, I think the only important decision to make is from an API
 perspective: while one can go tweak the generator, going from ints
 to enums would change the API.  I'd suggest just deciding which one
 you want to support.  Enums are definitely nicer from that
 perspective.  Given the below performance benchmarks, and just how
 much of ANTLR's output is really just a series of if/else or switch
 blocks buried inside of a huge number of loops, I actually do think
 you'd spot the difference.

 Moving well off-topic, but since you said to, I did just what you suggested:

 Using my personal laptop running Fedora 11 using x86_64 for the kernel and 
 JVM:
 $ java -version
 java version "1.6.0_18"
 OpenJDK Runtime Environment (IcedTea6 1.8) (fedora-35.b18.fc11-x86_64)
 OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)

 Both CPUs are Intel(R) Core(TM)2 Duo CPU     P8600  @ 2.40GHz w/ 3MB cache.

 These aren't spectacular benchmarks from an accuracy perspective, but
 illustrate that assuming ints and enums have identical performance
 characteristics in all cases is an invalid assumption:

 Using java -Xint Foo:
 Enum Time: 516121334
 Int Time : 424748884
 Enum Time: 514078841
 Int Time : 423574161

 ~21% performance hit to use enums with HotSpot disabled (similar to
 the Dalvik VM, because it has minimal JIT as of right now, which I'm
 guessing is why the original article suggested you stay away from them
 near performance-critical areas).

 Using: java -client Foo
 Enum Time: 25707993
 Int Time : 28520406
 Enum Time: 34060167
 Int Time : 24820249

 ~10% speed up for using enums.

 Using: java -server Foo
 Enum Time: 25543589
 Int Time : 28637110
 Enum Time: 32887612
 Int Time : 28968574

 Again ~10% speed up for using enums.

 So there might actually be a reason to support enums internally from
 a speed/performance perspective if the non-JIT case is considered
 negligible.  I thought they'd match your claim in this case.  Didn't
 have any reason to actually think enums would be faster than ints.

 -- Sample code:

 public class Foo {

    private static long MAX = 1000;

    public static void main(String[] args) {
        doEnums();
        doInts();
        doEnums();
        doInts();
    }

    public static void doInts() {
        int val = 0;
        long start = System.nanoTime();
         for (long iii = 0; iii < MAX; ++iii) {
            if (0 == val) {
                val = 1;
            } else if (1 == val) {
                val = 0;
            }
        }
        long end = System.nanoTime();
         System.out.println("Int Time : " + (end - start));
    }

    enum Parity { EVEN, ODD };
    public static void doEnums() {
        Parity val = Parity.EVEN;
        long start = System.nanoTime();
         for (long iii = 0; iii < MAX; ++iii) {
            if (Parity.EVEN == val) {
                val = Parity.ODD;
            } else if (Parity.ODD == val) {
                val = Parity.EVEN;
            }
        }
        long end = System.nanoTime();
         System.out.println("Enum Time: " + (end - start));
    }

 }


 On Wed, May 19, 2010 at 3:30 PM, Scott Stanchfield sc...@javadude.com wrote:
 Don't pre-optimize for things like this. Profile, then optimize. This
 won't even show up as an issue.

 I think whoever wrote that page was daydreaming about any minor way
 performance might be increased - note that they don't talk at all on
 that page about the big performance issues (I/O, networking, etc),
 though I do like that they talk about limiting object creation.

 With the example they show on that android dev page, you'll never
 see/feel the difference. And their example on grabbing the ordinal
 value so you don't need to look up a static field is really silly. If
 they just want to avoid looking up the static field every time through
 the loop, don't do:

     int valX = MyEnum.VAL_X.ordinal();
    int valY = MyEnum.VAL_Y.ordinal();
    int count = list.size();
    MyItem items = list.items();
     for (int  n = 0; n < count; n++)   {
        int  valItem = items[n].e.ordinal();
        if (valItem == valX)
            // do stuff 1
        else if (valItem