Hey Mihai and Julian, Thanks for looking at this. The performance hit depends on the size of the stack. Here are the numbers I have for the moment: (*not run in a good benchmark setup* but gives an idea)
Benchmark Mode Cnt Score Error Units BabelParserBenchmark.instantiateBabelParser avgt 7 14.128 ± 0.041 us/op BabelParserBenchmark.instantiateBabelParserErrorFixed avgt 7 12.054 ± 0.039 us/op BabelParserBenchmark.instantiateBabelParserBigCallStack avgt 7 20.576 ± 0.069 us/op BabelParserBenchmark.instantiateBabelParserBigCallStackErrorFixed avgt 7 12.479 ± 0.068 us/op For the smallest error stack possible with JMH, the current version is ~15% slower than with the fix. With a bigger stack - instantiation that happens on depth ~100 - the current version gets even slower. With the fix the time is stable. Note I'm not sure my method to make the stack bigger is realistic. I just do some recursive calls. Also to fix the issue I just copied the SqlBabelParserImpl and fixed the stack trace call, so the results above may not match exactly the change that will be observed by upgrading to 7.0.13. I can create a PR that adds the benchmark (without the hotfix *ErrorFixed one) if it's okay for you. On Tue, Jul 9, 2024 at 4:07 AM Julian Hyde <jh...@apache.org> wrote: > I've started work on > https://issues.apache.org/jira/browse/CALCITE-5541, to upgrade to > 7.0.13. There are still a few errors, but the inefficiency in the > subclass of Error seems to have been solved. > > Can someone work on the microbenchmark? I would like to see numbers > before and after this fix. > > Julian > > > On Mon, Jul 8, 2024 at 1:51 PM Mihai Budiu <mbu...@gmail.com> wrote: > > > > There is an issue to upgrade JavaCC: > https://issues.apache.org/jira/browse/CALCITE-5541 > > > > It may be a worthwhile effort to do it. > > > > Mihai > > ________________________________ > > From: Cyril DE CATHEU <cy...@startree.ai.INVALID> > > Sent: Friday, July 5, 2024 8:55 AM > > To: dev@calcite.apache.org <dev@calcite.apache.org> > > Subject: Re: SqlBabelParserImpl instantiation is slow because it > instantiates a java.lang.Error > > > > Me again, > > > > So this LookaheadSuccess comes from JavaCC. > > It seems this issue was solved in JavaCC some time ago, from JavaCC > 7.0.5: > > https://github.com/javacc/javacc/pull/99 > > > https://github.com/javacc/javacc/blob/46aa917fda0d0726690e384632e5cefae46bab68/docs/release-notes.md?plain=1#L163 > > > > > > > > On Fri, Jul 5, 2024 at 5:44 PM Cyril DE CATHEU <cy...@startree.ai > <mailto:cy...@startree.ai>> wrote: > > Had a deeper look, it seems the LookaheadSuccess instance is exploited > for conditional branching > > > > Eg: > > > > try { return !jj_3_2(); } > > catch(LookaheadSuccess ls) { return true; } > > finally { jj_save(1, xla); } > > > > So I guess lazy instantiation will not help because the LookaheadSuccess > will be instantiated anyway. > > If we know this error is always caught internally in the class, could we > override the constructor to not do a call to Throwable.fillInStackTrace(); > ? which is what makes this slow. > > > > On Fri, Jul 5, 2024 at 10:53 AM Cyril DE CATHEU <cy...@startree.ai > <mailto:cy...@startree.ai>> wrote: > > Hey, > > > > I'm using Calcite to parse snippets of SQL. > > My (simplified) function looks like this: > > > > public static SqlNode expressionToNode(final String sqlExpression, > > final SqlParser.Config config) { > > SqlParser sqlParser = SqlParser.create(sqlExpression, config); > > try { > > return sqlParser.parseExpression(); > > } catch (SqlParseException e) { > > //do something > > } > > } > > > > and it is called many times. > > The parser set in the config is Babel, so the parser instantiated is > SqlBabelParserImpl. > > A SqlBabelParserImpl is instantiated every time this function is called. > From what I understand this parser cannot be re-used easily to parse > different expressions but let me know if I'm incorrect. > > > > The issue I'm encountering is the instantiation of this class is pretty > slow because one of the instance field jj_ls extends java.lang.Error and > is instantiated when the SqlBabelParserImpl is instantiated. > > > > [Screenshot 2024-07-05 at 10.37.17.png] > > > > here is the extract of the generated code in SqlBabelParserImpl: > > > > static private final class LookaheadSuccess extends java.lang.Error { } > > > > final private LookaheadSuccess jj_ls = new LookaheadSuccess(); > > > > final private boolean jj_scan_token(int kind) { > > if (jj_scanpos == jj_lastpos) { > > jj_la--; > > if (jj_scanpos.next == null) { > > jj_lastpos = jj_scanpos = jj_scanpos.next = > token_source.getNextToken(); > > } else { > > jj_lastpos = jj_scanpos = jj_scanpos.next; > > } > > } else { > > jj_scanpos = jj_scanpos.next; > > } > > if (jj_rescan) { > > int i = 0; Token tok = token; > > while (tok != null && tok != jj_scanpos) { i++; tok = tok.next; } > > if (tok != null) jj_add_error_token(kind, i); > > } > > if (jj_scanpos.kind != kind) return true; > > if (jj_la == 0 && jj_scanpos == jj_lastpos) throw jj_ls; > > return false; > > } > > > > > > jj_ls is only used in an error case. (before last line of jj_scan_token) > > I'm assuming the re-use of a single error instance is done on purpose, > > but could we consider instantiate jj_ls lazily? > > > > I can try to do this but I'm a bit lost in the code generation codebase > so I would need some pointers. > > > > Have a nice day. > > -- > > > > [StarTree] <https://startree.ai> > > Cyril de Catheu > > Software Engineer > > +33 (684) 829-908<tel:+33+(684)+829-908> > > Follow us: [RSS] <https://www.startree.ai/blogs> [LinkedIn] < > https://www.linkedin.com/company/startreedata/> [Twitter] < > https://twitter.com/startreedata> [Slack] <https://stree.ai/slack> > [YouTube] <https://youtube.com/StarTreeData> > > > > [Try StarTree Cloud Today]< > https://get.startree.ai/startree-cloud?utm_campaign=byoc-edition-of-startree-cloud&utm_source=email&utm_content=startree-employee-email-signatures > > >