Global LOOKAHEAD of the SQL parser

Hongze Zhang Tue, 12 Feb 2019 08:33:54 -0800

Hi all,


Recently I have spent some time playing with Calcite's built-in SQL parsers, 
and I am now curious why the global LOOKAHEAD is set to 2[1] by default.
I ran a comparative benchmark using the ParserBenchmark utility class, and the 
output log shows a visible parsing performance improvement after setting the 
global LOOKAHEAD to 1. For example, the metric "ParserBenchmark.parseCached" 
dropped from 1693.821 ± 1921.925 us/op[2] to 655.452 ± 181.100 us/op[3].


JavaCC always generates Java methods like jj_2_**(...) for an LL(k) (k > 1) 
grammar[4], and I am almost sure this is less efficient than the code it 
generates for an LL(1) grammar. It would be great if we could somehow take 
advantage of JavaCC's LL(1) mode. 
Of course, I assume there was some trade-off behind not using LL(1) by 
default. I searched the dev list and JIRA cases but found nothing. Does anyone 
have information about that?
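
To illustrate the difference I have in mind, here is a minimal toy sketch. It is not JavaCC's generated code; the class LookaheadSketch, the Kind enum and the production names are all made up for illustration. With one token of lookahead a choice point can be a plain switch on the next token kind, while with two tokens the parser has to scan ahead speculatively for each alternative, which is roughly the job of the jj_2_** methods.

```java
import java.util.Arrays;
import java.util.List;

/** Toy illustration only; all names here are hypothetical, not JavaCC output. */
final class LookaheadSketch {
    enum Kind { IDENT, LPAREN, DOT, EOF }

    /** LL(1)-style choice: a single switch on the next token kind. */
    static String chooseLl1(List<Kind> tokens, int pos) {
        switch (tokens.get(pos)) {
            case IDENT:
                return "identifier";
            case LPAREN:
                return "parenthesized";
            default:
                return "error";
        }
    }

    /** LL(2)-style choice: try each alternative with a speculative scan. */
    static String chooseLl2(List<Kind> tokens, int pos) {
        if (scan(tokens, pos, Kind.IDENT, Kind.LPAREN)) {
            return "functionCall";
        }
        if (scan(tokens, pos, Kind.IDENT, Kind.DOT)) {
            return "qualifiedName";
        }
        if (scan(tokens, pos, Kind.IDENT)) {
            return "identifier";
        }
        return "error";
    }

    /** Checks whether the tokens at pos match the expected kinds, consuming nothing. */
    static boolean scan(List<Kind> tokens, int pos, Kind... expected) {
        for (int i = 0; i < expected.length; i++) {
            if (pos + i >= tokens.size() || tokens.get(pos + i) != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Kind> tokens = Arrays.asList(Kind.IDENT, Kind.DOT, Kind.IDENT, Kind.EOF);
        System.out.println(chooseLl1(tokens, 0));
        System.out.println(chooseLl2(tokens, 0));
    }
}
```

In this sketch the LL(2) path may run several scans before it picks an alternative, whereas the LL(1) path makes exactly one decision per choice point. The flip side is that only the LL(2) path can tell a function call from a qualified name here, which is the kind of trade-off I am asking about.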


Best,
Hongze


[1] https://github.com/apache/calcite/blob/883666929478aabe07ee5b9e572c43a6f1a703e2/core/pom.xml#L304
[2] https://www.dropbox.com/s/il6nodc44dzo0rz/bench_la2.log?dl=0
[3] https://www.dropbox.com/s/4rrou71siskdhhm/bench_la1.log?dl=0
[4] http://www.cs.tau.ac.il/~msagiv/courses/lab/Shai2/tools/javacc/examples/JavaGrammars/OPTIMIZING
