[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-27 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779291#comment-16779291
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

Below, I outline a few alternative names (along with a short description) for 
property "calcite.bindable.cache.maxSize". Among them, I still prefer the 
initial one but if you (mostly referring to [~vladimirsitnikov]) have a 
different opinion let me know and I will use that one.

+Reminder+

The interface defines Bindable as a "Statement that can be bound to a 
DataContext and then executed".

+Alternatives+
 * *calcite.bindable.cache.maxSize*: The maximum size of the cache used for 
storing Bindable objects, instantiated via dynamically generated Java classes.
 * *calcite.codegen.plan.cache.maxSize*: The maximum size of the cache used for 
storing execution plans, instantiated via dynamically generated Java classes. 
 * *calcite.executable.cache.maxSize*, *calcite.runtime.plan.cache.maxSize*, 
*calcite.execution.plan.cache.maxSize*: The maximum size of the cache used for 
storing ready-to-execute query plan instances.
 * *calcite.statement.cache.maxSize*: The maximum size of the cache used for 
storing ready-to-execute statements. 
 * *calcite.interpreter.plan.cache.maxSize*
 * *calcite.interpreter.query.cache.maxSize*
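
For reference, whichever name is chosen, reading it could look roughly like the sketch below. The class {{BindableCacheConfig}} and the default of 0 (cache disabled) are assumptions made for illustration; this is not the code in the PR.

```java
/** Hypothetical helper for illustration only; not Calcite's actual code. */
final class BindableCacheConfig {
  /** Property name discussed in this thread. */
  static final String MAX_SIZE_PROPERTY = "calcite.bindable.cache.maxSize";

  private BindableCacheConfig() {}

  /**
   * Returns the configured maximum cache size.
   * Assumption: 0 is the default and means "cache disabled".
   */
  static int maxSize() {
    // Integer.getInteger reads a system property and falls back to the
    // given default when the property is absent or not a valid integer.
    int value = Integer.getInteger(MAX_SIZE_PROPERTY, 0);
    // Treat negative values as "disabled" as well.
    return Math.max(value, 0);
  }

  public static void main(String[] args) {
    System.out.println(MAX_SIZE_PROPERTY + " = " + maxSize());
  }
}
```

With this sketch, starting the JVM with {{-Dcalcite.bindable.cache.maxSize=100}} would make {{maxSize()}} return 100.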

FYI: I plan to merge the PR tomorrow.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Queries using Calcite's EnumerableConvention always end up generating new 
> Java classes at runtime (using Janino) that are then instantiated using 
> reflection. This combination of class generation and class loading introduces 
> a big overhead in query response time.
> Quick profiling of our company's internal test suite, consisting of 4000 
> tests with roughly 43 SQL queries passing through Calcite, showed 
> that a large amount of time is spent on code generation and class loading, 
> making the EnumerableInterpretable#toBindable method a performance 
> bottleneck. 
> Among the 43 SQL queries there are many duplicates, which lead to the 
> generation of exactly the same Java code. Introducing a small 
> cache at the level of the EnumerableInterpretable class could avoid generating 
> and loading the same code over and over again.
> A simple implementation based on Guava improved the overall execution time of 
> the aforementioned test suite by more than 50%.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-14 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768751#comment-16768751
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

Thanks [~julianhyde], [~hhlai1990] for your comments. I agree about keeping 
the properties centralised. I will take care of it!

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-13 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767551#comment-16767551
 ] 

Julian Hyde commented on CALCITE-2703:
--

There is not currently a global config holder. (Let's have that discussion 
elsewhere... this case is about caching, not config.) But I do think that we 
should centralize the code that reads system properties, so that each property 
we use has a definition. That is sufficiently easy that it could be done as 
part of this case. 

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-12 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766897#comment-16766897
 ] 

Lai Zhou commented on CALCITE-2703:
---

[~julianhyde], I think this cache will be shared by different connections, 
so it would be better to provide a global property. Is there a global config holder? 
I don't like using Util.getXXProperty to get the system property.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-12 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766414#comment-16766414
 ] 

Julian Hyde commented on CALCITE-2703:
--

The change looks fine.

I'd like some documentation around the property. I can't easily tell whether 
the cache is enabled or disabled by default.

How about a class, similar to CalciteResource or CalciteConnectionConfig, that 
holds all of the system properties that we use, throughout the code: their 
paths ("calcite.bindable.cache.maxSize"), types, default values, and a 
description of their use.
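
A rough sketch of what such a class could look like follows. Everything in it ({{SystemPropertyDefs}}, {{Def}}, the default value of 0) is hypothetical and only illustrates the idea of keeping path, type, default value and description together in one place.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical central registry of system properties; all names are illustrative. */
final class SystemPropertyDefs {
  /** One property: its path, default value, description, and how to parse it. */
  static final class Def<T> {
    final String path;
    final T defaultValue;
    final String description;
    private final Function<String, T> parser;

    Def(String path, T defaultValue, String description,
        Function<String, T> parser) {
      this.path = path;
      this.defaultValue = defaultValue;
      this.description = description;
      this.parser = parser;
    }

    /** Current value, or the default if the property is unset or cannot be parsed. */
    T value() {
      String raw = System.getProperty(path);
      if (raw == null) {
        return defaultValue;
      }
      try {
        return parser.apply(raw.trim());
      } catch (RuntimeException e) {
        return defaultValue;
      }
    }
  }

  // Declared before the definitions below so it is initialized first.
  private static final List<Def<?>> ALL = new ArrayList<>();

  /** Assumption: a default of 0 means the cache is disabled. */
  static final Def<Integer> BINDABLE_CACHE_MAX_SIZE =
      define("calcite.bindable.cache.maxSize", 0,
          "Maximum size of the cache of Bindable objects; 0 disables it",
          Integer::valueOf);

  private SystemPropertyDefs() {}

  private static <T> Def<T> define(String path, T defaultValue,
      String description, Function<String, T> parser) {
    Def<T> def = new Def<>(path, defaultValue, description, parser);
    ALL.add(def);
    return def;
  }

  /** All registered definitions, e.g. for generating documentation. */
  static List<Def<?>> all() {
    return Collections.unmodifiableList(ALL);
  }
}
```

Callers would then write {{SystemPropertyDefs.BINDABLE_CACHE_MAX_SIZE.value()}} instead of reading the system property directly, and the default is visible next to the description.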

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-12 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766097#comment-16766097
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

Thanks [~vladimirsitnikov]! Meanwhile, I will try to think of a better name.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-12 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766095#comment-16766095
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


[~zabetak], thanks for the ping. I'll review it later, provided I have time (== 
don't count on me if I don't come back within a week).

By the way, I don't quite like the property name here: 
Util.getIntProperty("calcite.bindable.cache.maxSize"

Property names are public API, and I think {{calcite.bindable.cache.maxSize}} 
is quite an obscure property name.


> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2019-02-12 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766080#comment-16766080
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

Hi [~vladimirsitnikov], [~julianhyde], [~hhlai1990], I think the PR for this 
Jira is in good shape. If you don't have further comments, I can proceed with 
merging it to master.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance, pull-request-available
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-13 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720057#comment-16720057
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

[~hhlai1990], I think it is doable, but I believe it would be complementary to 
this cache, so I would suggest opening a new Jira for it.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-13 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720050#comment-16720050
 ] 

Lai Zhou commented on CALCITE-2703:
---

[~zabetak], in fact, we use Calcite to execute a lot of 'hive etl sql', which 
are big queries for online machine learning. We extended the native enumerable 
implementation of Calcite to support HiveOperator.

I think it would be great if we could cache more things (maybe the whole query 
plan) for the same SQL.

I think it's a common and typical use case.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-13 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720036#comment-16720036
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

[~hhlai1990], as is the case for this optimization, whether it is beneficial 
depends on the query workload. If you have big queries with many joins that 
appear very often, it can be beneficial. On the other hand, if the 
queries are very simple or do not appear often, it will not bring much 
benefit; rather the opposite. 

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-13 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719993#comment-16719993
 ] 

Lai Zhou commented on CALCITE-2703:
---

[~julianhyde] [~zabetak] I use Calcite (1.17.0) for real-time computation in a 
production environment. Would it be beneficial for low latency to cache the 
query plan, not just the instance?

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-04 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708539#comment-16708539
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

I placed the cache inside EnumerableInterpretable#getBindable method as 
suggested by [~vladimirsitnikov] and made it configurable using runtime 
properties as suggested by [~julianhyde].
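
To give an idea of the approach, here is a minimal sketch of a size-bounded cache keyed by the generated source. It is a stand-in and not the code from the PR: it uses a plain LinkedHashMap rather than Guava so that it is self-contained, and the class name {{CompiledPlanCache}}, the key choice and the "0 disables the cache" convention are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative stand-in for the cache discussed here; not the code from the PR. */
final class CompiledPlanCache<V> {
  private final int maxSize;
  private final Map<String, V> entries;

  CompiledPlanCache(final int maxSize) {
    this.maxSize = maxSize;
    // Access-ordered map that evicts the least recently used entry
    // once the number of entries exceeds maxSize.
    this.entries = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  /**
   * Returns the compiled object for the given source, compiling it only on a
   * cache miss. Assumption: a maxSize of 0 or less bypasses the cache, so the
   * default code path stays "compile every time".
   */
  synchronized V get(String generatedSource, Function<String, V> compiler) {
    if (maxSize <= 0) {
      return compiler.apply(generatedSource);
    }
    V cached = entries.get(generatedSource);
    if (cached == null) {
      // For simplicity the compilation runs while holding the lock.
      cached = compiler.apply(generatedSource);
      entries.put(generatedSource, cached);
    }
    return cached;
  }
}
```

Two queries that generate exactly the same source would then share one compiled object, which is where the saving in code generation and class loading comes from.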

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706052#comment-16706052
 ] 

Julian Hyde commented on CALCITE-2703:
--

{quote}I opened this JIRA case because I believe a cache will be beneficial for 
the majority of users.{quote}

I don't believe it will be beneficial to a majority (taking into account the 
hidden cost of extra complexity). However, I believe it could be beneficial for 
some users with particular workloads. I think it is the kind of feature that 
should be enabled using a runtime property, disabled by default. Could you do 
that, and without significantly increasing the complexity of the default code 
path? I will accept this PR if you can do that.

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>


[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705864#comment-16705864
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

The first reason to skip dynamic functions is the restriction imposed by the 
Javadoc shown below:
{code:java}
/**
 * @return true iff it is unsafe to cache query plans referencing this
 * operator; false is assumed by default
 */
public boolean isDynamicFunction() {
  return false;
}
{code}
but the real reason is that some dynamic functions (e.g., RAND) have the 
@Deterministic annotation, which ends up creating static fields in generated 
classes, and this does not work well with the cache. Maybe we should reconsider 
which functions we mark as deterministic. 
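
To illustrate the static field problem, here is a hand-written sketch of how I read it. The class {{GeneratedPlan}} is a hypothetical stand-in (the classes Calcite really generates look different): the result of a call treated as deterministic is hoisted into a static field, so it is computed only once per class.

```java
import java.util.Random;

/** Hand-written stand-in for a generated class; real generated code looks different. */
final class GeneratedPlan {
  // A call treated as deterministic is evaluated once, when the class is
  // initialized, and its result is kept in a static field.
  private static final double HOISTED_RAND = new Random().nextDouble();

  /** "Executes" the plan by returning the hoisted value. */
  double execute() {
    return HOISTED_RAND;
  }
}

final class StaticFieldDemo {
  public static void main(String[] args) {
    // Imagine this instance is served from the cache for every execution.
    GeneratedPlan cached = new GeneratedPlan();
    System.out.println(cached.execute());
    // Prints the same value again: the static field is never re-evaluated.
    System.out.println(cached.execute());
  }
}
```

Without the cache, each query gets a freshly generated class, so the static field is initialized again for every query. With the cache, the class is reused and the value stays the same across executions.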

{quote}
What do you mean by "server" here?
{quote}
I meant through the connection interface. Typically, connection indicates 
connection to a server so that's how server came into the discussion. I know 
that you don't have to start a server, sorry for the confusion.


 

> Reduce code generation and class loading overhead when executing queries in 
> the EnumerableConvention
> 
>
> Key: CALCITE-2703
> URL: https://issues.apache.org/jira/browse/CALCITE-2703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: Stamatis Zampetakis
>Assignee: Julian Hyde
>Priority: Major
>  Labels: performance
> Fix For: 1.18.0
>
>
> Queries using Calcite's EnumerableConvention always end up generating new 
> Java classes at runtime (using Janino) that are then instantiated using 
> reflection. This combination of class generation and class loading introduces 
> a big overhead in query response time.
> During a quick profiling of our company's internal test suite, consisting of 4000 
> tests with roughly 43 SQL queries passing through Calcite, we observed 
> that a large amount of time is spent on code generation and class loading, 
> making the EnumerableInterpretable#toBindable method a performance 
> bottleneck. 
> Among the 43 SQL queries there are many duplicates, which lead to the 
> generation of exactly the same Java code. Introducing a small 
> cache at the level of the EnumerableInterpretable class could avoid generating 
> and loading the same code over and over again.
> A simple implementation based on Guava improved the overall execution time of 
> the aforementioned test suite by more than 50%.
>  
>  





[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705823#comment-16705823
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


{quote}Calcite as a server.{quote}
I don't get what you mean by "server" here.
One can access in-process Calcite via {{DriverManager.getConnection}}. You 
don't have to "start server" for that.




[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705819#comment-16705819
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


[~zabetak], what is the reason to skip the class cache when the plan contains 
dynamic functions?



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705762#comment-16705762
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

That's right! I am using many parts of it but not Calcite as a server. 



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705750#comment-16705750
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


{quote} thus does not pass from the PreparedStatement{quote}
Do you mean you are not using PreparedStatement to access Calcite?




[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-12-01 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705738#comment-16705738
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

{quote}
In other words, what if we add a caching layer in-between PreparedStatement and 
EnumerableInterpretable#toBindable?
{quote}
[~vladimirsitnikov] That was kind of my original idea, but the problem is that 
anyone who does not use Calcite as a server (e.g., me) does not pass 
through the PreparedStatement. This means that anyone who uses the 
EnumerableConvention in this way will potentially need to add another cache at 
the application layer.

[~julianhyde] I opened this JIRA case because I believe a cache will be 
beneficial for the majority of users. If you believe that is not the case, I will 
close this issue and we will add the cache outside of Calcite. Other than that, 
using Calcite's interpreter is not an alternative, since it also goes through 
Janino and thus suffers from the same compilation/class loading overhead.  



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-30 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705567#comment-16705567
 ] 

Julian Hyde commented on CALCITE-2703:
--

This always happens when someone proposes a cache. They have a use case where 
the cache makes things a lot better; for everyone else, it makes things 
slightly worse.

If your use case is low-latency queries, a cache isn't the answer; you should 
be using the interpreter.



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-30 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705089#comment-16705089
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


{quote}Calcite's implementation of PreparedStatement ends up calling 
EnumerableInterpretable#toBindable{quote}
[~zabetak], what if Calcite's PreparedStatement avoided calling #toBindable 
when the execution plan can be reused?
In other words, what if we add a caching layer in-between PreparedStatement and 
EnumerableInterpretable#toBindable?



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-30 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704984#comment-16704984
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

[~vladimirsitnikov] Calcite's implementation of PreparedStatement ends up 
calling EnumerableInterpretable#toBindable through 
CalcitePrepareImpl.CalcitePreparingStmt#implement. All the code generation 
magic happens there, more precisely inside the 
EnumerableInterpretable#getBindable method.
  
 [~julianhyde] I understand your concern, but I still believe it is worth the 
effort. I agree that the test query patterns are not like production patterns, 
but this doesn't change the fact that the same query patterns appear multiple 
times. It is very common for production workloads to repeat the same queries 
multiple times. I would dare to say that production workloads do not really 
introduce new queries very often. I am well aware of the fact that the 
production system may have a performance bottleneck elsewhere, but this doesn't 
mean that code compilation/class loading comes for free. The fact that the test 
suite needs *2h* *instead of 1h* because of that certainly raises a warning 
flag.
  
 To better showcase the advantages/disadvantages of using a cache, I ran some 
microbenchmarks (code included in the PR) using JMH. The full result report on 
my local machine can be found 
[here|https://docs.google.com/spreadsheets/d/1yVIjan8Nw-aCQOmvYLgj1PdlkrjeblNHihvfhnc5yf4/edit?usp=sharing].
 For convenience, I provide below a small extract with the average response 
times per operation:
||Benchmark||(cacheSize)||(queries)||Average time||Units||
|CodeGenerationBenchmark.getBindableNoCache|N/A|1|11289368.49|ns/op|
| | |10|11826950.78|ns/op|
| | |100|12633361.83|ns/op|
| | |1000|13941583.80|ns/op|
|CodeGenerationBenchmark.getBindableWithCacheAndCandidateDetector|10|1|4864.99|ns/op|
| | |10|5916.71|ns/op|
| | |100|11749056.09|ns/op|
| | |1000|12479969.01|ns/op|
| |100|1|4102.78|ns/op|
| | |10|5377.52|ns/op|
| | |100|9507081.47|ns/op|
| | |1000|11093308.52|ns/op|
| |1000|1|5297.09|ns/op|
| | |10|7506.15|ns/op|
| | |100|10994651.69|ns/op|
| | |1000|15484723.50|ns/op|

+Note+: when the query workload above contains 100 or more queries, most of the 
operations lead to cache misses (since the measurement time is set to 1 sec).
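
A rough back-of-the-envelope check of this note, as a sketch only. It assumes that operations run sequentially and cycle through the distinct queries, and it takes the per-operation cost from the cacheSize=10, queries=100 row of the table. The class and method names are made up for the illustration.

```java
public class MeasurementWindowCheck {
  /** Number of whole operations that fit into a measurement window. */
  static long opsPerWindow(double windowNanos, double nanosPerOp) {
    return (long) Math.floor(windowNanos / nanosPerOp);
  }

  public static void main(String[] args) {
    double windowNanos = 1_000_000_000.0;  // 1 sec measurement time
    double missCostNanos = 11_749_056.09;  // ns/op reported for cacheSize=10, queries=100
    long ops = opsPerWindow(windowNanos, missCostNanos);
    System.out.println(ops + " operations fit in one measurement window");
    System.out.println("fewer operations than distinct queries: " + (ops < 100));
  }
}
```

Under these assumptions about 85 operations fit in one window, so with 100 distinct queries hardly any query repeats before the window ends, and almost every lookup is a miss.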

*Bindables without cache*
  
 The existing code has a few drawbacks.
  
 _1. Compilation/Class loading overhead_: I would like to comment on the fact 
that the current implementation (getBindableNoCache) takes on average 12ms 
(with a hot VM this can go lower, but still), which I consider a lot. For simple, 
highly selective queries this time can easily exceed the time spent actually 
executing the query. This is a general problem of compiled vs. interpreted 
programs. Someone could argue that a few ms is not really a big deal, which 
brings me to the second drawback.
  
 _2. Metaspace pollution/Class unloading overhead:_ No cache means that a new 
class will be loaded for every query, consuming both Heap and Metaspace memory. 
This translates to higher memory requirements and increased activity of the 
garbage collector, affecting performance negatively. Although the cost of 
loading classes roughly appears in the previous benchmark, the cost of unloading 
(performed by the GC) is more difficult to measure in a few benchmark 
iterations.

_3. Increased JIT activity:_ Since new classes are generated for every query, 
the JIT compiler has to step in and optimise the new bytecode 
so that the queries run faster. Part of the high response times 
reported above is attributable to JIT. JIT runs in separate threads, each one 
occupying more than 50% of the processing capacity of an available core. In 
applications with few threads this can easily pass unnoticed, since running 
threads can always find an available core, but as the number of threads grows 
JIT threads will have to compete with application threads, affecting the 
overall performance of the system. 
  
 *Bindables with cache*
  
 Adding a cache layer has also a few drawbacks.
  
 _1. Cache access overhead_: An additional cache layer means additional 
overhead in every query for accessing the cache. In the case that we never hit 
the cache we are paying an extra cost for nothing. In the previous benchmark 
this penalty is negligible (i.e., ~2ms extra in the worst case) and hopefully 
such scenarios should be rather rare. 
  
 _2. Heap space overhead_: The approach of using the Java code as the key for 
the cache is not memory efficient either, but this can be mitigated by choosing 
a relatively small size for the cache (e.g., 100, 1000).

_3. Code complexity_: An additional cache layer brings complexity issues in 
terms of implementation and code maintenance.
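
The trade-offs above can be sketched with a small size-bounded cache keyed by the generated Java source. This is only an illustration under stated assumptions: it uses an access-ordered LinkedHashMap from the JDK instead of Guava so that it stays self-contained, and the class name SourceKeyedCache and the "compiler" function are hypothetical stand-ins for the real compile and class loading step.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class SourceKeyedCache<V> {
  private final Map<String, V> cache;
  private final Function<String, V> compiler;
  private int compilations = 0;

  public SourceKeyedCache(final int maxSize, Function<String, V> compiler) {
    this.compiler = compiler;
    // Access-ordered map that evicts the least recently used entry
    // once the size bound is exceeded.
    this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  /** Returns the cached value for this source, compiling only on a miss. */
  public synchronized V get(String javaSource) {
    V value = cache.get(javaSource);
    if (value == null) {
      value = compiler.apply(javaSource);  // the expensive step we want to avoid
      compilations++;
      cache.put(javaSource, value);
    }
    return value;
  }

  /** Number of times the expensive step actually ran. */
  public synchronized int compilations() {
    return compilations;
  }

  public static void main(String[] args) {
    SourceKeyedCache<Object> cache = new SourceKeyedCache<>(2, src -> new Object());
    Object first = cache.get("class A {}");
    Object second = cache.get("class A {}");
    System.out.println(first == second);       // same instance reused on a hit
    System.out.println(cache.compilations());  // compile step count so far
  }
}
```

The key is the full source string, which is what makes the heap overhead in point 2 proportional to the cache size; the size bound is the knob that keeps it in check.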

*Conclusion*
 Overall, I am rather positive about the idea of the cache for the various reasons 
outlined above so I will be happy to 

[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-23 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697373#comment-16697373
 ] 

Julian Hyde commented on CALCITE-2703:
--

Let’s be careful. You’re adding a layer of caching in production code path to 
make the test suite run faster. Test query patterns are not like production 
patterns. Are you optimizing the wrong thing?



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-23 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697316#comment-16697316
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


I mean cache in Calcite's implementation of PreparedStatement.

PS. EnumerableInterpretable should probably just interpret things without code 
generation. Am I missing something?



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-23 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697297#comment-16697297
 ] 

Stamatis Zampetakis commented on CALCITE-2703:
--

I am not using Calcite as a server so I am not passing at all through the 
PreparedStatement infrastructure. It is totally feasible to add a cache at a 
higher level (without even touching Calcite code) but I would rather do it in 
the EnumerableInterpretable class so that everybody can benefit from it.  



[jira] [Commented] (CALCITE-2703) Reduce code generation and class loading overhead when executing queries in the EnumerableConvention

2018-11-23 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697287#comment-16697287
 ] 

Vladimir Sitnikov commented on CALCITE-2703:


[~zabetak], have you checked the original SQL statements?
I wonder if caching at the level of "PreparedStatement" would be better.
