Re: master plan for jsr 283 query implementation

Thomas Mueller Wed, 12 Sep 2007 01:24:53 -0700

Hi,

Two more advantages of a hand-written parser:

- You can actually debug the parser. No chance with JavaCC or ANTLR.
- Better tool support (refactoring, autocomplete).

> sorry for my somewhat ironic statement about you being the only one
> wanting a hand-written parser,

To my surprise, it turns out I was wrong!

> Just curious, don't you use a separate tokenizing step in your
> hand-written parsers (I'm asking because of the literal "AND" above)?

Lexing (tokenizing, scanning) is done at a lower level. It can be
hand-written or done with a tool (for example StringTokenizer or JFlex).
The boundary between tokenizing, lexing, and parsing is soft. In my
example, tokenizing is done in 'read(): a token'.
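
To illustrate what a separate, hand-written tokenizing step could look
like, here is a minimal sketch. The class name SimpleTokenizer and the
helper tokenize() are made up for this example (they are not taken from
H2 or Jackrabbit), and the token rules are deliberately simplistic:
names, runs of digits, and one-character operators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a tiny hand-written tokenizer that can be tested
// on its own, independently of any parser built on top of it.
final class SimpleTokenizer {
    private final String text;
    private int pos;

    SimpleTokenizer(String text) {
        this.text = text;
    }

    // Returns the next token, or null at the end of the input.
    String read() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        if (pos >= text.length()) {
            return null;
        }
        int start = pos;
        char c = text.charAt(pos);
        if (Character.isLetter(c) || c == '_') {
            // a name: letters, digits and underscores
            while (pos < text.length()
                    && (Character.isLetterOrDigit(text.charAt(pos)) || text.charAt(pos) == '_')) {
                pos++;
            }
        } else if (Character.isDigit(c)) {
            // an integer value: a run of digits
            while (pos < text.length() && Character.isDigit(text.charAt(pos))) {
                pos++;
            }
        } else {
            // assumption: any other character is a one-character operator
            pos++;
        }
        return text.substring(start, pos);
    }

    // Convenience method for tests: reads all tokens into a list.
    static List<String> tokenize(String text) {
        SimpleTokenizer tokenizer = new SimpleTokenizer(text);
        List<String> tokens = new ArrayList<>();
        for (String token = tokenizer.read(); token != null; token = tokenizer.read()) {
            tokens.add(token);
        }
        return tokens;
    }
}
```

A unit test can then call SimpleTokenizer.tokenize("a = 1 AND b>20") and
compare the resulting list of strings, without involving the parser.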

> I usually prefer a separate tokenizing step, if only to make testing
> easier.

Sure! I'm not sure how to do that in JavaCC or ANTLR, but it is probably
possible there as well.

> context-sensitive tokenizing

I'm not sure what you are referring to. Keywords versus identifiers?
Example token types are: 'integer value', 'decimal value', 'text value',
'operator', 'quoted identifier', 'name'. The keywords are well defined
in Java, but for SQL, I wouldn't decide whether it's a keyword or an
identifier while tokenizing. Comments are usually silently eaten by the
tokenizer (except for @deprecated in javac).
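
A rough sketch of this idea: the tokenizer only classifies a token by
its shape, so AND comes out as a plain 'name', and the parser decides
whether it is a keyword at the place where a keyword is allowed. All
names here (KeywordSketch, classify, readIf, parseCondition) are
hypothetical, and the sketch assumes tokens are separated by whitespace,
which real SQL does not require.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keyword-versus-identifier is decided by the
// parser, not by the tokenizer.
final class KeywordSketch {
    enum TokenType {
        INTEGER_VALUE, DECIMAL_VALUE, TEXT_VALUE, OPERATOR, QUOTED_IDENTIFIER, NAME
    }

    // Classifies a token by its shape only; AND is just a NAME here.
    static TokenType classify(String token) {
        if (token.matches("[0-9]+")) return TokenType.INTEGER_VALUE;
        if (token.matches("[0-9]+\\.[0-9]+")) return TokenType.DECIMAL_VALUE;
        if (token.length() >= 2 && token.startsWith("'") && token.endsWith("'")) {
            return TokenType.TEXT_VALUE;
        }
        if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
            return TokenType.QUOTED_IDENTIFIER;
        }
        if (token.matches("[A-Za-z_][A-Za-z0-9_]*")) return TokenType.NAME;
        return TokenType.OPERATOR;
    }

    private final String[] tokens;
    private int index;

    KeywordSketch(String text) {
        // assumption: tokens are separated by whitespace
        this.tokens = text.trim().split("\\s+");
    }

    // Consumes the current token if it is an unquoted name equal to the keyword.
    boolean readIf(String keyword) {
        if (index < tokens.length
                && classify(tokens[index]) == TokenType.NAME
                && tokens[index].equalsIgnoreCase(keyword)) {
            index++;
            return true;
        }
        return false;
    }

    // condition := comparison { AND comparison }
    List<String> parseCondition() {
        List<String> comparisons = new ArrayList<>();
        do {
            comparisons.add(readComparison());
        } while (readIf("AND"));
        if (index < tokens.length) {
            throw new IllegalArgumentException("Unexpected token: " + tokens[index]);
        }
        return comparisons;
    }

    // comparison := operand operator operand
    private String readComparison() {
        String left = readOperand();
        if (index >= tokens.length || classify(tokens[index]) != TokenType.OPERATOR) {
            throw new IllegalArgumentException("Operator expected");
        }
        String operator = tokens[index++];
        return left + operator + readOperand();
    }

    private String readOperand() {
        if (index >= tokens.length || classify(tokens[index]) == TokenType.OPERATOR) {
            throw new IllegalArgumentException("Operand expected");
        }
        return tokens[index++];
    }
}
```

With this approach the quoted identifier "AND" in the input
a = 1 AND "AND" = 'x' is read as an operand, while the unquoted AND
between the two comparisons is consumed by readIf("AND").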

> The final answer to this question is probably "whoever implements it
> gets to decide". For me, the easiest way to understand a parser would
> be the unit tests which demonstrate its functionality, anyway.

I fully agree.

Some example parser code:

Derby JavaCC source file (313 KB):
http://svn.apache.org/repos/asf/db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj
(the generated .java files are 691 + 314 + 20 + 5 = 1030 KB)

H2 hand-written parser (161 KB):
http://h2database.googlecode.com/svn/trunk/h2/src/main/org/h2/command/Parser.java

Thomas
