I'm finally catching up to our Lexer/Parser thread after a year and a half.
About 20 years ago I wrote a SQL interpreter with lex and yacc, which was
used in a real product. My immediate goal is to figure out which tools we
are most likely to use and to learn them. Yesterday I reviewed the available
open source scanners and parsers after reading what Campbell and others had
to say. Here are my assessments of the suitability of the available products
for our use. In most cases, I don't mention products that are no longer
actively used, that are intentionally limited (i.e., unsuitable for complex
languages), or that have non-Java requirements for build or run-time.
Knowing the other important goals for HSQLDB in the short and long term, I
don't know if we'll get enough people to commit to seeing through a rewrite
of the primary HSQLDB interpreter/parser. I'm going to learn the tools
anyway, though, in part because I intend to make a PL/SQL engine for HSQLDB,
and I don't think that anything but a trivial subset would be possible
without using a real scanner and/or parser.
Scanners
JFlex. GPL. I like it. Input looks very intuitive... at least for somebody
who used to use lex. Documentation and examples look great. Lots of people
think very highly of it. Change log shows that they took the effort to
follow Java conventions (e.g. output class naming, Ant builds).
JLex. GPL-compatible. Not seeing much activity in the past couple of years.
JFlex is a "rewrite" of JLex and can be run in JLex-compatibility mode.
According to the JFlex authors, JFlex can do everything that JLex could do,
but faster and better. Not wanting to take the time to run my own
comparisons, I'm inclined to believe that until I hear somebody disagree.
SableCC. Looks like a good product. Supports EBNF. Not as widely used as
JFlex, but still actively used. Input looks similar to lex input. Compares
well to Antlr, but I can't find a comparison with JLex. Looks like I'd have
to download the distro to find out what kind of license it has.
Coco/R. "Slightly extended" GPL. I'm not sure what the input specification
files look like for the Coco/R scanner, but if they are ATG files, they look
much more difficult to maintain and much less intuitive than
lex/flex/jlex/jflex input files. Looks like docs are only available in PDF
(I love HTML docs!), but maybe the HTML docs just aren't advertised as well
as the PDFs. The documentation about the Java distro gives me the feeling
that good Java design is not their top priority.
Antlr. BSD license. Poor performance.
Parsers
JavaCC (formerly "Jack"). BSD. Poor docs. I haven't seen anybody compare CUP
and JavaCC and favor JavaCC. I saw somebody complain that JavaCC was
commercial, so it may have been commercial at one time.
Jacc. BSD. Pure Java. I don't see much use of it in the past few years. I'm
concerned about support, fixes for newly found bugs, etc.
CUP. GPL-compatible. Very popular. Highly regarded. Definitely works with
JFlex.
Beaver. BSD. Not as popular as CUP, but still actively used. Definitely
works with JFlex. Allegedly the fastest performance possible for the class
of parsers that we're looking at. Input spec looks intuitive yet powerful.
Good Java design.
SableCC. Not as widely used as CUP. I don't know about compatibility with
JFlex. See SableCC listing in Scanner list.
Coco/R. See Coco/R listing in Scanner list.
Please reply with recommendations against any of these. I don't want to
waste my time learning products that I won't ever use. I've already started
learning JFlex. CUP and Beaver are next on my to-learn list, depending on
future discussion and findings.
_______________________________________________
hsqldb-developers mailing list
hsqldb-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers