Hi

I got this info from javacc mailing lists. This may prove helpful:

----------------------------------------------------------------

-----Original Message-----
From: Ken Beesley [mailto:ken....@xrce.xerox.com]
Sent: Wednesday, August 18, 2004 2:56 PM
To: javacc
Subject: [JavaCC] Alternatives to JavaCC (was Hello All)

Vicas wrote:

> Hello All
>
> Kindly let me know other parsers available which does the same job as javacc.
> It would be very nice of you if you can send me some documentation related to this.
>
> Thanks
> Vikas

(Correction and clarifications to the following would be _very_ welcome. I'm very likely out of date.)

Of course, no two software tools are likely to do _exactly_ the same job. Someone already pointed you to ANTLR, which is probably the best-known alternative to JavaCC. Another possibility is SableCC.

http://sablecc.org

The criteria include stability, documentation, language of the parser generated, and abstract-syntax-tree building. When I last looked (a couple of years ago) at ANTLR, SableCC and JavaCC, I chose JavaCC for the following reasons:

1. ANTLR could not handle Unicode input. Things change, of course, so ANTLR might now be more Unicode-friendly. Unicode was important to me, so this was a big factor in my decision.

   On the plus side for ANTLR, it has better abstract-syntax-tree building capabilities (in my opinion) than JJTree/JavaCC. You can learn to use JJTree commands, but it's not easy for most people. And ANTLR can generate either a Java or a C++ parser. JavaCC generates only Java parsers.

   Another concern about ANTLR was that it was reputed to change a lot as the guru, Terence Parr, experimented with new syntax and functionality. JavaCC, at least at the time, was reputed to be more stable, perhaps stable to a fault. I wanted stability and reliability.

2. SableCC is much like JavaCC; it generates a Java parser from a grammar description; but it had, in my opinion, less flexible abstract-syntax-tree building than JJTree/JavaCC. In SableCC (when I looked at it), the AST it built was always a direct reflection of your grammar, generating one tree node for each grammar expansion involved in a parse, much like using JavaCC with Java Tree Builder (JTB http://www.cs.purdue.edu/jtb/). When using JavaCC, JTB is the alternative to using JJTree.

Using SableCC, or the combination JavaCC/JTB, should be _very_ similar indeed.

In my opinion, SableCC and JavaCC/JTB have made a conscious choice to simplify AST building--you get trees that reflect the expansions in your grammar. Period. But often these default trees will be big, full of extraneous nodes that reflect precedence hierarchies in the recursive-descent parsing. If you want to have more control over AST building, to get more compact and tailored ASTs, you need to pay the price of learning JJTree.

Assuming that you need to build ASTs, with JavaCC you have the choice between JJTree and JTB. With SableCC, when I last looked at it, you only get the JTB-like option.

*******

(Again, corrections and expansions would be much appreciated.)

Ken Beesley

----------------------------------------------------------------

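To make the tree-size point in Ken's message concrete, here is a minimal hand-written sketch in Java. It is not output from JavaCC, JJTree, JTB or SableCC, and every name in it (TreeShapeDemo, Node, the compact flag) is made up for illustration. It parses a tiny arithmetic grammar by recursive descent and builds either a grammar-shaped tree with one node per expansion (roughly the JTB/SableCC style described above) or a tailored tree that keeps only operators and operands (roughly the kind of result one might aim for with JJTree).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: nothing here is JavaCC, JJTree, JTB or SableCC API.
// Grammar:  Expr := Term ('+' Term)*    Term := Factor ('*' Factor)*
//           Factor := NUMBER | '(' Expr ')'
final class TreeShapeDemo {

    /** Minimal tree node: a label and an ordered list of children. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        Node add(Node child) { children.add(child); return this; }

        /** Number of nodes in this subtree, including this one. */
        int size() {
            int n = 1;
            for (Node c : children) n += c.size();
            return n;
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    private final String src;
    private final boolean compact; // false: one node per expansion; true: tailored AST
    private int pos = 0;

    private TreeShapeDemo(String src, boolean compact) {
        this.src = src.replace(" ", "");
        this.compact = compact;
    }

    /** Parses src and returns the tree in the requested shape. */
    static Node parse(String src, boolean compact) {
        TreeShapeDemo p = new TreeShapeDemo(src, compact);
        Node tree = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return tree;
    }

    /** Consumes c if it is the next character. */
    private boolean accept(char c) {
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private Node expr() {
        Node result = compact ? term() : new Node("Expr").add(term());
        while (accept('+')) {
            Node right = term();
            result = compact ? new Node("+").add(result).add(right)
                             : result.add(new Node("+")).add(right);
        }
        return result;
    }

    private Node term() {
        Node result = compact ? factor() : new Node("Term").add(factor());
        while (accept('*')) {
            Node right = factor();
            result = compact ? new Node("*").add(result).add(right)
                             : result.add(new Node("*")).add(right);
        }
        return result;
    }

    private Node factor() {
        if (accept('(')) {
            Node inner = expr();
            if (!accept(')')) throw new IllegalArgumentException("')' expected at " + pos);
            return compact ? inner
                           : new Node("Factor").add(new Node("(")).add(inner).add(new Node(")"));
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
        Node number = new Node(src.substring(start, pos));
        return compact ? number : new Node("Factor").add(number);
    }

    public static void main(String[] args) {
        for (boolean compact : new boolean[] {false, true}) {
            Node tree = parse("1+2*3", compact);
            System.out.println(tree + "  [" + tree.size() + " nodes]");
        }
    }
}
```

For the input 1+2*3 the grammar-shaped tree is (Expr (Term (Factor 1)) + (Term (Factor 2) * (Factor 3))) with 11 nodes, while the tailored tree is (+ 1 (* 2 3)) with 5 nodes. The Expr/Term/Factor wrappers exist only because the grammar encodes precedence through nesting, which is the overhead the message refers to.
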
----------

On Mon, Feb 23, 2009 at 10:06 PM, Alan Gates <ga...@yahoo-inc.com> wrote:

> We looked into antlr. It appears to be very similar to javacc, with the
> added feature that the java code it generates is humanly readable. That
> isn't why we want to switch off of javacc. Olga listed the 3 things we want
> out of a parser that javacc isn't giving us (lack of docs, no easy
> customization of error handle, decoupling of scanning and parsing). So
> antlr doesn't look viable.
>
> In response to Pi's suggestion that we could use the logical plan, I hope we
> could use something close to it.
> Whatever we choose we want it to be
> flexible enough to represent richer language constructs (like branch and
> loop). I'm not sure our current logical plan can do that. At the same
> time, we don't need another layer of translation (we already have logical ->
> physical -> mapreduce). I would like to find a representation that could
> handle expressing the syntax and what is currently the logical plan.
>
> Alan.
>
> On Feb 20, 2009, at 5:15 PM, pi song wrote:
>
>> Should be pretty close but we may need to cleanup the interface a bit. Then
>> the new parser module can be switched in easily.
>> BTW, have we already got the solution for the new parser generator?
>>
>> Pi
>>
>> On Fri, Feb 20, 2009 at 9:03 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>
>>> Probably nearly the same effect as you suggest. Are the concepts at the
>>> logical plan layer similar to those expressed in pig latin? Or has a
>>> significant transformation occurred by then?
>>>
>>> On Fri, Feb 20, 2009 at 1:59 AM, pi song <pi.so...@gmail.com> wrote:
>>>
>>>> Sounds good but how about exposing the logical plan layer instead?
>>>> Wouldn't that yield the same effect? From python for example you still
>>>> can construct a logical plan and give to Pig to execute.
>>>
>>> --
>>> Ted Dunning, CTO
>>> DeepDyve

--
Nitesh Bhatia
Dhirubhai Ambani Institute of Information & Communication Technology
Gandhinagar
Gujarat

"Life is never perfect. It just depends where you draw the line."

visit:
http://www.awaaaz.com - connecting through music
http://www.volstreet.com - lets volunteer for better tomorrow
http://www.instibuzz.com - Voice opinions, Transact easily, Have fun