The library file may be that big, but look at the load size, i.e. the code and data that actually get loaded. It may not be as big as the file size suggests.
Jim

> -----Original Message-----
> From: Todor Dimitrov [mailto:[email protected]]
> Sent: Saturday, August 20, 2011 10:04 AM
> To: Jim Idle
> Cc: [email protected]
> Subject: Re: [antlr-interest] Sparql Grammar & Huge C Files
>
> I followed your instructions and successfully compiled the lexer to a
> static library. The file size of the library is 82M, which is still
> quite large for my needs. I will try to rewrite/simplify the grammar.
>
> Thank you very much for your support!
>
> Todor
>
> On Aug 20, 2011, at 6:13 PM, Jim Idle wrote:
>
> > The lexer rules:
> >
> > BLANK_NODE_LABEL : '_:' t=PN_LOCAL { setText($t.text); };
> >
> > VAR1 : QUESTION_MARK v=VARNAME { setText($v.text); };
> >
> > VAR2 : '$' v=VARNAME { setText($v.text); };
> >
> > are coded for Java and not C; you cannot simply change the target
> > language when there is embedded Java code.
> >
> > All the lexer rules are specified as ('E'|'e') etc., which will
> > generate bigger tables than the other ways to implement case
> > insensitivity, as explained on the wiki. Also, it has a lot of rules
> > that it has just left ANTLR to sort out, which is fair enough, but it
> > is much better to left-factor the rules and change the $type once you
> > know what the token is. For instance, all the numeric rules.
> >
> > The parser grammar will just work, but it is just naturally a big one.
> > You might contact the authors about it. There are probably a lot of
> > ways it could be made more efficient, but as the tables are all
> > static, it does not matter that much in C. Look at the size of the
> > data segment once it is compiled, as this is a better indicator than
> > the size of the source code, which has lots of annotations.
> >
> > Finally, look at the code that is output, find the decisions that are
> > generating large decision trees, and look at the corresponding rules
> > for any optimizations. However, fix up the SETTEXT and it will just
> > work.
> >
> > To fix the SETTEXT I would just not do what they are doing, but merely
> > advance the start pointer in the token by 1 or 2 when/if you use it
> > (or within the lexer code if you must). That is trivial and gives
> > better performance. In other words, just take the setText() actions
> > out altogether.
> >
> > Don't forget to use antlr.markmail.org
> >
> > Jim
> >
> >> -----Original Message-----
> >> From: Todor Dimitrov [mailto:[email protected]]
> >> Sent: Saturday, August 20, 2011 8:53 AM
> >> To: Jim Idle
> >> Subject: Re: [antlr-interest] Sparql Grammar & Huge C Files
> >>
> >> Hi Jim,
> >>
> >> this is an open source grammar for the Sparql language that has not
> >> been developed by me. I have run the ANTLR tool like this:
> >>
> >> java -Xms1024m -Xmx1024m -cp antlr-3.4-complete.jar org.antlr.Tool Sparql.g
> >>
> >> No warnings were output, and looking at the ANTLR tool options, I
> >> don't see any switches that would enable/disable warning generation.
> >> I'm not using the SETTEXT macro and I'm not quite sure where to use
> >> it. Are there any examples for it? In addition, the Sparql grammar
> >> contains only rewriting rules, so I'm not sure whether I have to use
> >> the SETTEXT macro. I've attached the grammar file for reference.
> >>
> >> Todor
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
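
The advice quoted above (drop the setText() actions and advance the start pointer in the token by 1 or 2 instead) might look roughly like the sketch below. It is plain C and deliberately does not use the ANTLR C runtime: sketch_token, sketch_skip_prefix and sketch_token_text are hypothetical names invented for illustration, and a real lexer token carries more state than this. It only aims to show why the approach is cheap: nothing is copied or allocated when the prefix is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a lexer token: it only records where the
 * matched text starts and how long it is inside the input buffer.
 * This is NOT the ANTLR C runtime's token type. */
typedef struct {
    const char *start;  /* first character of the matched text */
    size_t      length; /* number of characters matched */
} sketch_token;

/* Skip the first n characters of the token text (for example the "?"
 * of VAR1 or the "_:" of BLANK_NODE_LABEL) by moving the start
 * pointer. Nothing is copied or allocated. */
void sketch_skip_prefix(sketch_token *tok, size_t n)
{
    assert(tok != NULL);
    if (n > tok->length) {
        n = tok->length; /* never step past the end of the token */
    }
    tok->start  += n;
    tok->length -= n;
}

/* Copy the (possibly trimmed) token text into a caller-supplied buffer
 * and NUL-terminate it. Returns the number of characters copied. */
size_t sketch_token_text(const sketch_token *tok, char *out, size_t out_size)
{
    assert(tok != NULL && out != NULL && out_size > 0);
    size_t n = tok->length < out_size - 1 ? tok->length : out_size - 1;
    memcpy(out, tok->start, n);
    out[n] = '\0';
    return n;
}
```

With a token covering "?name", calling sketch_skip_prefix(&tok, 1) leaves it covering "name"; for "_:b0" the count would be 2. The skip can be done at the point where the token text is consumed, which is the "when/if you use it" case from the quoted message.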
