Hey Mihai and Julian,

Thanks for looking at this.
The performance hit depends on the size of the stack.
Here are the numbers I have for the moment: (*not run in a good benchmark
setup* but gives an idea)

Benchmark                                                          Mode
 Cnt   Score   Error  Units
BabelParserBenchmark.instantiateBabelParser                        avgt
 7  14.128 ± 0.041  us/op
BabelParserBenchmark.instantiateBabelParserErrorFixed              avgt
 7  12.054 ± 0.039  us/op

BabelParserBenchmark.instantiateBabelParserBigCallStack            avgt
 7  20.576 ± 0.069  us/op
BabelParserBenchmark.instantiateBabelParserBigCallStackErrorFixed  avgt
 7  12.479 ± 0.068  us/op

For the smallest error stack possible with JMH, the current version is ~15%
slower than with the fix.
With a bigger stack - instantiation that happens on depth ~100 - the
current version gets even slower. With the fix the time is stable.

Note I'm not sure my method to make the stack bigger is realistic. I just
do some recursive calls.
Also to fix the issue I just copied the SqlBabelParserImpl and fixed the
stack trace call, so the results above may not match exactly the change
that will be observed by upgrading to 7.0.13.

I can create a PR that adds the benchmark (without the hotfix *ErrorFixed
one) if it's okay for you.

On Tue, Jul 9, 2024 at 4:07 AM Julian Hyde <jh...@apache.org> wrote:

> I've started work on
> https://issues.apache.org/jira/browse/CALCITE-5541, to upgrade to
> 7.0.13. There are still a few errors, but the inefficiency in the
> subclass of Error seems to have been solved.
>
> Can someone work on the microbenchmark? I would like to see numbers
> before and after this fix.
>
> Julian
>
>
> On Mon, Jul 8, 2024 at 1:51 PM Mihai Budiu <mbu...@gmail.com> wrote:
> >
> > There is an issue to upgrade JavaCC:
> https://issues.apache.org/jira/browse/CALCITE-5541
> >
> > It may be a worthwhile effort to do it.
> >
> > Mihai
> > ________________________________
> > From: Cyril DE CATHEU <cy...@startree.ai.INVALID>
> > Sent: Friday, July 5, 2024 8:55 AM
> > To: dev@calcite.apache.org <dev@calcite.apache.org>
> > Subject: Re: SqlBabelParserImpl instantiation is slow because it
> instantiates a java.lang.Error
> >
> > Me again,
> >
> > So this LookaheadSuccess comes from JavaCC.
> > It seems this issue was solved in JavaCC some time ago, from JavaCC
> 7.0.5:
> > https://github.com/javacc/javacc/pull/99
> >
> https://github.com/javacc/javacc/blob/46aa917fda0d0726690e384632e5cefae46bab68/docs/release-notes.md?plain=1#L163
> >
> >
> >
> > On Fri, Jul 5, 2024 at 5:44 PM Cyril DE CATHEU <cy...@startree.ai
> <mailto:cy...@startree.ai>> wrote:
> > Had a deeper look, it seems the LookaheadSuccess instance is exploited
> for conditional branching
> >
> > Eg:
> >
> > try { return !jj_3_2(); }
> > catch(LookaheadSuccess ls) { return true; }
> > finally { jj_save(1, xla); }
> >
> > So I guess lazy instantiation will not help because the LookaheadSuccess
> will be instantiated anyway.
> > If we know this error is always caught internally in the class, could we
> override the constructor to not do a call to Throwable.fillInStackTrace();
> ? which is what makes this slow.
> >
> > On Fri, Jul 5, 2024 at 10:53 AM Cyril DE CATHEU <cy...@startree.ai
> <mailto:cy...@startree.ai>> wrote:
> > Hey,
> >
> > I'm using Calcite to parse snippets of SQL.
> > My (simplified) function looks like this:
> >
> > public static SqlNode expressionToNode(final String sqlExpression,
> >     final SqlParser.Config config) {
> >   SqlParser sqlParser = SqlParser.create(sqlExpression, config);
> >   try {
> >     return sqlParser.parseExpression();
> >   } catch (SqlParseException e) {
> >      //do something
> >   }
> > }
> >
> > and it is called many times.
> > The parser set in the config is Babel, so the parser instantiated is
> SqlBabelParserImpl.
> > A SqlBabelParserImpl is instantiated every time this function is called.
> From what I understand this parser cannot be re-used easily to parse
> different expressions but let me know if I'm incorrect.
> >
> > The issue I'm encountering is the instantiation of this class is pretty
> slow because one of the instance field  jj_ls extends java.lang.Error and
> is instantiated when the SqlBabelParserImpl is instantiated.
> >
> > [Screenshot 2024-07-05 at 10.37.17.png]
> >
> > here is the extract of the generated code in SqlBabelParserImpl:
> >
> > static private final class LookaheadSuccess extends java.lang.Error { }
> >
> > final private LookaheadSuccess jj_ls = new LookaheadSuccess();
> >
> > final private boolean jj_scan_token(int kind) {
> >   if (jj_scanpos == jj_lastpos) {
> >     jj_la--;
> >     if (jj_scanpos.next == null) {
> >       jj_lastpos = jj_scanpos = jj_scanpos.next =
> token_source.getNextToken();
> >     } else {
> >       jj_lastpos = jj_scanpos = jj_scanpos.next;
> >     }
> >   } else {
> >     jj_scanpos = jj_scanpos.next;
> >   }
> >   if (jj_rescan) {
> >     int i = 0; Token tok = token;
> >     while (tok != null && tok != jj_scanpos) { i++; tok = tok.next; }
> >     if (tok != null) jj_add_error_token(kind, i);
> >   }
> >   if (jj_scanpos.kind != kind) return true;
> >   if (jj_la == 0 && jj_scanpos == jj_lastpos) throw jj_ls;
> >   return false;
> > }
> >
> >
> > jj_ls is only used in an error case. (before last line of jj_scan_token)
> > I'm assuming the re-use of a single error instance is done on purpose,
> > but could we consider instantiate jj_ls lazily?
> >
> > I can try to do this but I'm a bit lost in the code generation codebase
> so I would need some pointers.
> >
> > Have a nice day.
> > --
> >
> >  [StarTree] <https://startree.ai>
> > Cyril de Catheu
> > Software Engineer
> > +33 (684) 829-908<tel:+33+(684)+829-908>
> > Follow us: [RSS] <https://www.startree.ai/blogs> [LinkedIn] <
> https://www.linkedin.com/company/startreedata/> [Twitter] <
> https://twitter.com/startreedata> [Slack] <https://stree.ai/slack>
> [YouTube] <https://youtube.com/StarTreeData>
> >
> > [Try StarTree Cloud Today]<
> https://get.startree.ai/startree-cloud?utm_campaign=byoc-edition-of-startree-cloud&utm_source=email&utm_content=startree-employee-email-signatures
> >
>

Reply via email to