[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779291#comment-16779291 ]

Stamatis Zampetakis commented on CALCITE-2703:

Below, I outline a few alternative names (along with a short description) for the property "calcite.bindable.cache.maxSize". Among them, I still prefer the initial one, but if you (mostly referring to [~vladimirsitnikov]) have a different opinion, let me know and I will use that one.

+Reminder+
The interface defines Bindable as a "Statement that can be bound to a DataContext and then executed".

+Alternatives+
* *calcite.bindable.cache.maxSize*: The maximum size of the cache used for storing Bindable objects, instantiated via dynamically generated Java classes.
* *calcite.codegen.plan.cache.maxSize*: The maximum size of the cache used for storing execution plans, instantiated via dynamically generated Java classes.
* *calcite.executable.cache.maxSize*, *calcite.runtime.plan.cache.maxSize*, *calcite.execution.plan.cache.maxSize*: The maximum size of the cache used for storing ready-to-execute query plan instances.
* *calcite.statement.cache.maxSize*: The maximum size of the cache used for storing ready-to-execute statements.
* *calcite.interpreter.plan.cache.maxSize*
* *calcite.interpreter.query.cache.maxSize*

FYI: I plan to merge the PR tomorrow.

> Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.17.0
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: performance, pull-request-available
> Time Spent: 3h
> Remaining Estimate: 0h
>
> Queries using Calcite's EnumerableConvention always end up generating new
> Java classes at runtime (using Janino) that are then instantiated using
> reflection. This combination of class generation and class loading introduces
> a large overhead in query response time.
> Quick profiling of our company's internal test suite, consisting of 4000
> tests with roughly 43 SQL queries passing through Calcite, showed
> that a large amount of time is spent on code generation and class loading,
> making the EnumerableInterpretable#toBindable method a performance
> bottleneck.
> Among the 43 SQL queries there are many duplicates, which lead to the
> generation of exactly the same Java code. Introducing a small
> cache at the level of the EnumerableInterpretable class could avoid generating
> and loading the same code over and over again.
> A simple implementation based on Guava improved the overall execution time of
> the aforementioned test suite by more than 50%.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
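The caching idea in the issue description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: it swaps the Guava cache mentioned in the issue for a JDK `LinkedHashMap` in access order, and the names `PlanCache` and `CompiledPlan` as well as the `compiler` function are hypothetical stand-ins for the Janino compilation and reflective instantiation step.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a compiled, ready-to-execute plan instance. */
final class CompiledPlan {
  final String source;

  CompiledPlan(String source) {
    this.source = source;
  }
}

/** Small LRU cache keyed by the generated Java source text (assumed key choice). */
final class PlanCache {
  private final Function<String, CompiledPlan> compiler;
  private final Map<String, CompiledPlan> entries;
  private int compilations;

  PlanCache(int maxSize, Function<String, CompiledPlan> compiler) {
    this.compiler = compiler;
    // accessOrder = true keeps the least recently used entry first,
    // so removeEldestEntry evicts it once the size bound is exceeded.
    this.entries = new LinkedHashMap<String, CompiledPlan>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(
          Map.Entry<String, CompiledPlan> eldest) {
        return size() > maxSize;
      }
    };
  }

  /** Returns the cached plan for this source, compiling it only on a miss. */
  synchronized CompiledPlan get(String generatedSource) {
    CompiledPlan plan = entries.get(generatedSource);
    if (plan == null) {
      plan = compiler.apply(generatedSource);
      compilations++;
      entries.put(generatedSource, plan);
    }
    return plan;
  }

  /** Number of times the (expensive) compiler function was invoked. */
  synchronized int compilations() {
    return compilations;
  }
}
```

With many duplicate queries, as in the test suite described above, repeated lookups for the same generated source hit the map and skip compilation; how much that saves depends on the workload.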
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768751#comment-16768751 ] Stamatis Zampetakis commented on CALCITE-2703: -- Thanks [~julianhyde], [~hhlai1990] for your comments. I agree about keeping the properties centralised. I will take care of it!
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767551#comment-16767551 ] Julian Hyde commented on CALCITE-2703: -- There is not currently a global config holder. (Let's have that discussion elsewhere... this case is about caching, not config.) But I do think that we should centralize the code that reads system properties, so that each property we use has a definition. That is sufficiently easy that it could be done as part of this case.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766897#comment-16766897 ] Lai Zhou commented on CALCITE-2703: --- [~julianhyde], I think this cache will be shared by different connections, so it would be better to provide a global property. Is there a global config holder? I don't like using Util.getXXProperty to read the system property.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766414#comment-16766414 ] Julian Hyde commented on CALCITE-2703: -- The change looks fine. I'd like some documentation around the property. I can't easily tell whether the cache is enabled or disabled by default. How about a class, similar to CalciteResource or CalciteConnectionConfig, that holds all of the system properties that we use, throughout the code: their paths ("calcite.bindable.cache.maxSize"), types, default values, and a description of their use.
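A holder class along the lines suggested in this comment might look roughly like the sketch below. The class names `IntProperty` and `SystemProperties`, the field layout, and the default value of 0 (taken here to mean "cache disabled") are assumptions made for illustration; only the property path "calcite.bindable.cache.maxSize" comes from the thread.

```java
import java.util.Properties;

/** Hypothetical definition of one integer-valued system property. */
final class IntProperty {
  final String key;
  final int defaultValue;
  final String description;

  IntProperty(String key, int defaultValue, String description) {
    this.key = key;
    this.defaultValue = defaultValue;
    this.description = description;
  }

  /** Reads the value from the given properties, falling back to the default
   * when the property is missing or is not a valid integer. */
  int value(Properties props) {
    String raw = props.getProperty(key);
    if (raw == null) {
      return defaultValue;
    }
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }
}

/** Hypothetical central registry: one documented definition per property. */
final class SystemProperties {
  private SystemProperties() {}

  /** Assumed default of 0, interpreted here as "cache disabled". */
  static final IntProperty BINDABLE_CACHE_MAX_SIZE =
      new IntProperty(
          "calcite.bindable.cache.maxSize",
          0,
          "Maximum number of Bindable objects kept in the cache; 0 disables it.");
}
```

A caller would then read `SystemProperties.BINDABLE_CACHE_MAX_SIZE.value(System.getProperties())` instead of repeating the property path, which keeps the path, type, default, and description in one place.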
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766097#comment-16766097 ] Stamatis Zampetakis commented on CALCITE-2703: -- Thanks [~vladimirsitnikov]! Meanwhile, I will try to think of a better name.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766095#comment-16766095 ] Vladimir Sitnikov commented on CALCITE-2703: [~zabetak], thanks for the ping, I'll review it later provided I have time (== don't count on me if I don't come back in a week). By the way, I don't quite like the property name here: Util.getIntProperty("calcite.bindable.cache.maxSize" Property names are public API, and I think {{calcite.bindable.cache.maxSize}} is quite an obscure property name.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766080#comment-16766080 ] Stamatis Zampetakis commented on CALCITE-2703: -- Hi [~vladimirsitnikov], [~julianhyde], [~hhlai1990], I think the PR for this Jira is in good shape. If you don't have further comments, I can proceed with merging it to master.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720057#comment-16720057 ] Stamatis Zampetakis commented on CALCITE-2703: -- [~hhlai1990], I think it is doable, but I believe it will be complementary to this cache, so I would suggest opening a new Jira about it.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720050#comment-16720050 ] Lai Zhou commented on CALCITE-2703: --- [~zabetak], in fact, we use Calcite to execute a lot of 'hive etl sql', which are big queries for online machine learning. We extended the native enumerable implementation of Calcite to support HiveOperator. I think it would be great if we could cache more things (maybe the whole query plan) for the same SQL. I think it's a common and typical use case.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720036#comment-16720036 ] Stamatis Zampetakis commented on CALCITE-2703: -- [~hhlai1990], as is the case for this optimization, it can be beneficial depending on the query workload. If you have big queries with many joins that happen to appear very often, it can be beneficial. On the other hand, if the queries are very simple or if they don't appear often, it will not bring much benefit, rather the opposite.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719993#comment-16719993 ] Lai Zhou commented on CALCITE-2703: --- [~julianhyde] [~zabetak] I use Calcite (1.17.0) for real-time computation in a production environment. Would it be beneficial for low latency to cache the query plan, not just the instance?
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708539#comment-16708539 ] Stamatis Zampetakis commented on CALCITE-2703: -- I placed the cache inside EnumerableInterpretable#getBindable method as suggested by [~vladimirsitnikov] and made it configurable using runtime properties as suggested by [~julianhyde].
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706052#comment-16706052 ] Julian Hyde commented on CALCITE-2703: -- {quote}I opened this JIRA case because I believe a cache will be beneficial for the majority of users.{quote} I don't believe it will be beneficial to a majority (taking into account the hidden cost of extra complexity). However, I believe it could be beneficial for some users with particular work-loads. I think it is the kind of feature that should be enabled using a runtime property, disabled by default. Could you do that, and without significantly increasing the complexity of the default code path? I will accept this PR if you can do that.
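One way to read the request above (a runtime property, disabled by default, with a short default code path) is sketched below. This is a hedged illustration, not the merged implementation: `BindableProvider` and its `compiler` function are hypothetical, the default of 0 is an assumption, and a plain `ConcurrentHashMap` with a simple "stop caching when full" rule stands in for a real eviction policy. `Integer.getInteger` is the standard JDK call for reading an integer system property with a default.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical provider that caches compiled objects only when enabled. */
final class BindableProvider {
  static final String MAX_SIZE_PROPERTY = "calcite.bindable.cache.maxSize";

  private final int maxSize;
  private final Function<String, Object> compiler;
  private final Map<String, Object> cache = new ConcurrentHashMap<>();

  BindableProvider(int maxSize, Function<String, Object> compiler) {
    this.maxSize = maxSize;
    this.compiler = compiler;
  }

  /** Reads the size from the system property; 0 (assumed default) disables caching. */
  static BindableProvider fromSystemProperty(Function<String, Object> compiler) {
    return new BindableProvider(Integer.getInteger(MAX_SIZE_PROPERTY, 0), compiler);
  }

  Object get(String source) {
    if (maxSize <= 0) {
      // Default path: no cache involved, compile every time.
      return compiler.apply(source);
    }
    Object cached = cache.get(source);
    if (cached != null) {
      return cached;
    }
    Object compiled = compiler.apply(source);
    // Simplification: once the (approximate) bound is reached, new entries
    // are not stored; a real cache would evict old entries instead.
    if (cache.size() < maxSize) {
      Object previous = cache.putIfAbsent(source, compiled);
      if (previous != null) {
        return previous;
      }
    }
    return compiled;
  }
}
```

With the property unset, `get` is a single branch followed by the existing compilation step, so the default behaviour stays as it was before the cache existed.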
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705864#comment-16705864 ]

Stamatis Zampetakis commented on CALCITE-2703:

The first reason to skip dynamic functions is the restriction imposed by the Javadoc shown below:

{code:java}
/**
 * @return true iff it is unsafe to cache query plans referencing this
 * operator; false is assumed by default
 */
public boolean isDynamicFunction() {
  return false;
}
{code}

but the real reason is that some dynamic functions (e.g., RAND) have the @Deterministic annotation, which ends up creating static fields in generated classes, and this does not work well with the cache. Maybe we should reconsider which functions we mark as deterministic.

{quote} What do you mean by "server" here? {quote}

I meant through the connection interface. Typically, a connection indicates a connection to a server, so that's how "server" came into the discussion. I know that you don't have to start a server, sorry for the confusion.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705823#comment-16705823 ]

Vladimir Sitnikov commented on CALCITE-2703:

{quote}Calcite as a server.{quote}
I don't understand what you mean by "server" here. One can access in-process Calcite via {{DriverManager.getConnection}}. You don't have to "start a server" for that.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705819#comment-16705819 ]

Vladimir Sitnikov commented on CALCITE-2703:

[~zabetak], what is the reason to skip the class cache when the plan contains dynamic functions?
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705762#comment-16705762 ]

Stamatis Zampetakis commented on CALCITE-2703:

That's right! I am using many parts of it but not Calcite as a server.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705750#comment-16705750 ]

Vladimir Sitnikov commented on CALCITE-2703:

{quote} thus does not pass from the PreparedStatement{quote}
Do you mean you are not using PreparedStatement to access Calcite?
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705738#comment-16705738 ]

Stamatis Zampetakis commented on CALCITE-2703:

{quote} In other words, what if we add a caching layer in-between PreparedStatement and EnumerableInterpretable#toBindable? {quote}
[~vladimirsitnikov] That was kind of my original idea, but the problem is that whoever does not use Calcite as a server (e.g., me) does not pass through the PreparedStatement. This means that anyone in that situation who uses the EnumerableConvention will potentially need to add another cache at the application layer.

[~julianhyde] I opened this JIRA case because I believe a cache will be beneficial for the majority of users. If you believe this is not the case, I will close this issue and we will add the cache outside of Calcite. Other than that, using Calcite's interpreter is not an alternative, since it also goes through Janino and thus suffers from the same compilation/class loading overhead.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705567#comment-16705567 ]

Julian Hyde commented on CALCITE-2703:

This always happens when someone proposes a cache. They have a use case where the cache makes things a lot better; for everyone else, it makes things slightly worse. If your use case is low-latency queries, a cache isn't the answer; you should be using the interpreter.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705089#comment-16705089 ]

Vladimir Sitnikov commented on CALCITE-2703:

{quote}Calcite's implementation of PreparedStatement ends up calling EnumerableInterpretable#toBindable{quote}
[~zabetak], what if Calcite's PreparedStatement avoided calling #toBindable when the execution plan can be reused? In other words, what if we add a caching layer in-between PreparedStatement and EnumerableInterpretable#toBindable?
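As a rough illustration of what such a caching layer could look like, here is a minimal sketch using only the JDK. It is not the code from the PR (which, according to the issue description, is based on Guava); {{BoundedPlanCache}}, its loader function and the string keys are hypothetical, and the choice of key (SQL text, plan digest or generated source) is left open on purpose.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical caching layer; none of these names exist in Calcite.
// It wraps an expensive "compile" step and keeps at most maxSize results,
// evicting the least recently used entry first.
final class BoundedPlanCache<K, V> {
  private final Map<K, V> entries;
  private final Function<K, V> loader;
  private int loads;

  BoundedPlanCache(final int maxSize, Function<K, V> loader) {
    this.loader = loader;
    // accessOrder = true keeps entries ordered from least to most recently used.
    this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  // Returns the cached value, or runs the expensive loader on a miss.
  synchronized V get(K key) {
    V value = entries.get(key);
    if (value == null) {
      value = loader.apply(key);
      loads++;
      entries.put(key, value);
    }
    return value;
  }

  // Number of times the expensive step actually ran.
  synchronized int loads() {
    return loads;
  }
}

final class BoundedPlanCacheDemo {
  public static void main(String[] args) {
    // The lambda stands in for code generation plus class loading.
    BoundedPlanCache<String, String> cache =
        new BoundedPlanCache<>(2, key -> "compiled(" + key + ")");
    cache.get("q1");
    cache.get("q1"); // served from the cache, no second load
    System.out.println("loads after repeating q1: " + cache.loads());
  }
}
```

The {{synchronized}} methods keep the sketch simple; a production cache would more likely rely on a concurrent cache implementation such as the Guava one mentioned in the issue.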
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704984#comment-16704984 ]

Stamatis Zampetakis commented on CALCITE-2703:

[~vladimirsitnikov] Calcite's implementation of PreparedStatement ends up calling EnumerableInterpretable#toBindable through CalcitePrepareImpl.CalcitePreparingStmt#implement. All the code generation magic happens there, and to be more precise inside the EnumerableInterpretable#getBindable method.

[~julianhyde] I understand your concern, but I still believe it is worth the effort. I agree that the test query patterns are not like production patterns, but this doesn't change the fact that the same query patterns appear multiple times. It is very common for production workloads to repeat the same queries multiple times. I would dare to say that production workloads do not really introduce new queries very often. I am well aware that the production system may have a performance bottleneck elsewhere, but this doesn't mean that code compilation/class loading comes for free. The fact that the test suite needs *2h* *instead of 1h* because of that certainly raises a warning flag.

To better showcase the advantages/disadvantages of using a cache, I did some micro benchmarks (code included in the PR) using jmh. The full result report on my local machine can be found [here|https://docs.google.com/spreadsheets/d/1yVIjan8Nw-aCQOmvYLgj1PdlkrjeblNHihvfhnc5yf4/edit?usp=sharing].
For convenience, I provide below a small extract with the average response times per operation:

||Benchmark||(cacheSize)||(queries)||avt||Units||
|CodeGenerationBenchmark.getBindableNoCache|N/A|1|11289368.49|ns/op|
| | |10|11826950.78|ns/op|
| | |100|12633361.83|ns/op|
| | |1000|13941583.80|ns/op|
|CodeGenerationBenchmark.getBindableWithCacheAndCandidateDetector|10|1|4864.99|ns/op|
| | |10|5916.71|ns/op|
| | |100|11749056.09|ns/op|
| | |1000|12479969.01|ns/op|
| |100|1|4102.78|ns/op|
| | |10|5377.52|ns/op|
| | |100|9507081.47|ns/op|
| | |1000|11093308.52|ns/op|
| |1000|1|5297.09|ns/op|
| | |10|7506.15|ns/op|
| | |100|10994651.69|ns/op|
| | |1000|15484723.50|ns/op|

+Note+: when the query workload above contains 100 or more queries, most of the operations lead to cache misses (since the measurement time is set to 1 sec).

*Bindables without cache*

The existing code has a few drawbacks.

_1. Compilation/Class loading overhead_: I would like to point out that the current implementation (getBindableNoCache) takes on average 12ms (with a hot VM this can go lower, but still), which I consider a lot. For simple, highly selective queries this time can easily exceed the time needed to actually execute the query. This is a general problem of compiled vs. interpreted programs. Someone could argue that a few ms is not really a big deal, which brings me to the second drawback.

_2. Metaspace pollution/Class unloading overhead_: No cache means that a new class is loaded for every query, consuming both heap and Metaspace memory. This translates to higher memory requirements and increased activity of the garbage collector, affecting performance negatively. Although the cost of loading classes roughly appears in the previous benchmark, the cost of unloading (performed by the GC) is more difficult to measure in a few benchmark iterations.

_3. Increased JIT activity_: Since new classes are generated for every query, the JIT compiler has to step in to optimise the new bytecode so that the queries run faster. Part of the high response times reported above is attributable to JIT. JIT runs in separate threads, each one occupying more than 50% of available cores processing capacity. In applications with few threads, this can easily pass unnoticed since running threads can always find an available core, but as the number of threads grows, JIT threads have to compete with application threads, affecting the overall performance of the system.

*Bindables with cache*

Adding a cache layer also has a few drawbacks.

_1. Cache access overhead_: An additional cache layer means additional overhead in every query for accessing the cache. If we never hit the cache, we are paying an extra cost for nothing. In the previous benchmark this penalty is negligible (i.e., ~2ms extra in the worst case) and hopefully such scenarios should be rather rare.

_2. Heap space overhead_: The approach of using the Java code as the key for the cache is not memory efficient either, but this can be mitigated by choosing a relatively small size for the cache (e.g., 100, 1000).

_3. Code complexity_: An additional cache layer brings complexity issues in terms of implementation and code maintenance.

*Conclusion*

Overall, I am rather positive about the idea of the cache for the various reasons outlined above so I will be happy to
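The figures quoted in this comment can be cross-checked against the benchmark extract with a little arithmetic. The sketch below only re-uses four values copied from the table; the class and method names are hypothetical, and the comparison chosen (one repeated query for the hit path, 1000 queries with cacheSize 1000 for the overhead) is an assumption about which rows the "~2ms extra in the worst case" remark refers to.

```java
// Hypothetical helper that re-derives two ratios from the benchmark extract.
final class BenchmarkArithmetic {
  // Values copied from the table (ns/op).
  static final double NO_CACHE_1_QUERY = 11289368.49;
  static final double CACHE_10_1_QUERY = 4864.99;
  static final double NO_CACHE_1000_QUERIES = 13941583.80;
  static final double CACHE_1000_1000_QUERIES = 15484723.50;

  // How many times faster a single repeated query is when served from the cache.
  static double hitSpeedup() {
    return NO_CACHE_1_QUERY / CACHE_10_1_QUERY;
  }

  // Extra time per operation, in milliseconds, in the only row of the
  // extract where the cached variant is slower than the uncached one.
  static double worstCaseOverheadMs() {
    return (CACHE_1000_1000_QUERIES - NO_CACHE_1000_QUERIES) / 1_000_000.0;
  }

  public static void main(String[] args) {
    System.out.printf("speedup on cache hit: %.0fx%n", hitSpeedup());
    System.out.printf("worst-case overhead: %.2f ms%n", worstCaseOverheadMs());
  }
}
```

With these numbers the hit path is faster by more than three orders of magnitude, while the worst measured penalty stays around one and a half milliseconds, in line with the rounded figure given in the comment.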
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697373#comment-16697373 ]

Julian Hyde commented on CALCITE-2703:

Let’s be careful. You’re adding a layer of caching in production code path to make the test suite run faster. Test query patterns are not like production patterns. Are you optimizing the wrong thing?
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697316#comment-16697316 ]

Vladimir Sitnikov commented on CALCITE-2703:

I mean a cache in Calcite's implementation of PreparedStatement.

PS. EnumerableInterpretable should probably just interpret things without code generation. Am I missing something?
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697297#comment-16697297 ]

Stamatis Zampetakis commented on CALCITE-2703:

I am not using Calcite as a server, so I am not passing through the PreparedStatement infrastructure at all. It is totally feasible to add a cache at a higher level (without even touching Calcite code), but I would rather do it in the EnumerableInterpretable class so that everybody can benefit from it.
[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention
[ https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697287#comment-16697287 ]

Vladimir Sitnikov commented on CALCITE-2703:

[~zabetak], have you checked the original SQL statements? I wonder if caching at the level of "PreparedStatement" would be better.