[
https://issues.apache.org/jira/browse/CALCITE-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404705#comment-17404705
]
Vladimir Polukeev commented on CALCITE-3071:
--------------------------------------------
Hi, there!
I faced with the same issue.
I have one sql query at application that could be executed about 500 000 times
per hour. Sql query has more than 120 parameters, a lot of subqueries and
joins. I performed research about query execution performance and figure out
that parsing, validation and building query plan consume a lot of time. When I
localy added caching whole sql plan to Calcite than sql query execution time
reduced from 1500 ms to 12 ms.
Therefore, I performed some simple tests about caching whole sql plan and here
tests results:
*1. Without cache*
+Query:+ select t.* from test.sales_fact_1997 t order by cust_id
+Results:+
{code:java}
test execution : 9058 ms
test execution : 252 ms
test execution : 91 ms
test execution : 86 ms
test execution : 129 ms
test execution : 84 ms
test execution : 83 ms
test execution : 92 ms
test execution : 84 ms
test execution : 83 ms
total execution time : 1004 ms
test execution : 8909 ms
test execution : 253 ms
test execution : 91 ms
test execution : 86 ms
test execution : 90 ms
test execution : 83 ms
test execution : 84 ms
test execution : 102 ms
test execution : 80 ms
test execution : 80 ms
total execution time : 985 ms
test execution : 9070 ms
test execution : 264 ms
test execution : 98 ms
test execution : 93 ms
test execution : 90 ms
test execution : 79 ms
test execution : 87 ms
test execution : 99 ms
test execution : 81 ms
test execution : 77 ms
total execution time : 1003 ms
{code}
*2. With cache*
+Query:+ select t.* from test.sales_fact_1997 t order by cust_id
+Results:
+
{code:java}
test execution : 8924 ms
test execution : 7 ms
test execution : 7 ms
test execution : 6 ms
test execution : 6 ms
test execution : 5 ms
test execution : 6 ms
test execution : 6 ms
test execution : 5 ms
test execution : 6 ms
total execution time : 897 ms
test execution : 8938 ms
test execution : 9 ms
test execution : 8 ms
test execution : 7 ms
test execution : 7 ms
test execution : 7 ms
test execution : 7 ms
test execution : 7 ms
test execution : 12 ms
test execution : 6 ms
total execution time : 900 ms
test execution : 8618 ms
test execution : 9 ms
test execution : 7 ms
test execution : 7 ms
test execution : 8 ms
test execution : 6 ms
test execution : 6 ms
test execution : 6 ms
test execution : 7 ms
test execution : 7 ms
total execution time : 868 ms
{code}
+
+*3. Without cache (where expression is added)*+
Query+: select t.* from test.sales_fact_1997 t where t.cust_id = 100 and
t.prod_id = 10 order by cust_id+
+Results:
++
{code:java}
test execution : 9906 ms
test execution : 393 ms
test execution : 165 ms
test execution : 150 ms
test execution : 143 ms
test execution : 141 ms
test execution : 131 ms
test execution : 154 ms
test execution : 145 ms
test execution : 148 ms
total execution time : 1147 ms
test execution : 9461 ms
test execution : 358 ms
test execution : 161 ms
test execution : 153 ms
test execution : 140 ms
test execution : 146 ms
test execution : 126 ms
test execution : 144 ms
test execution : 146 ms
test execution : 162 ms
total execution time : 1099 ms
test execution : 9563 ms
test execution : 420 ms
test execution : 145 ms
test execution : 152 ms
test execution : 138 ms
test execution : 145 ms
test execution : 136 ms
test execution : 148 ms
test execution : 124 ms
test execution : 143 ms
total execution time : 1111 ms
{code}
*4. With cache (where expression is added)*+
Query+: select t.* from test.sales_fact_1997 t where t.cust_id = 100 and
t.prod_id = 10 order by cust_id+
+Results:+
+
{code:java}
test execution : 9959 ms
test execution : 8 ms
test execution : 7 ms
test execution : 6 ms
test execution : 6 ms
test execution : 5 ms
test execution : 5 ms
test execution : 5 ms
test execution : 5 ms
test execution : 5 ms
total execution time : 1001 ms
test execution : 9546 ms
test execution : 9 ms
test execution : 8 ms
test execution : 7 ms
test execution : 6 ms
test execution : 6 ms
test execution : 6 ms
test execution : 6 ms
test execution : 6 ms
test execution : 5 ms
total execution time : 960 ms
test execution : 9318 ms
test execution : 7 ms
test execution : 9 ms
test execution : 8 ms
test execution : 6 ms
test execution : 6 ms
test execution : 5 ms
test execution : 5 ms
test execution : 6 ms
test execution : 6 ms
total execution time : 937 ms
{code}
*Conclusion:*
I add Guava Cache to CalcitePrepareImpl class. Cache is used at "prepare2_"
method. I am new at Calcite project. So there is no guarantee that I found out
the right place for caching.
> Cache the whole sql plan to reduce the latency and improve the performance
> --------------------------------------------------------------------------
>
> Key: CALCITE-3071
> URL: https://issues.apache.org/jira/browse/CALCITE-3071
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.19.0
> Reporter: Lai Zhou
> Priority: Major
>
> In real business, when sql queries become complex, the overhead of sql plan
> will increase quickly , and many of the sql queries are duplicates.
> We already have some caching issue about improving the performance, such as
> the issue
> https://issues.apache.org/jira/browse/CALCITE-2703,
> which reduce code generation and class loading overhead when executing
> queries in the EnumerableConvention, but I think it's not enough.
> I propose to cache the whole sql plan to reduce the latency ,for the same sql
> , ignoring the cost optimizing based on statistics here, we can cache the
> generated code for it.
> I use the FrameworkConfig API to execute sql queries, in this way I can
> easily do this job .
> but it's not easy to make a whole sql execution plan(that says code-gen)
> cache in the sql processing flow based on JDBC Connection, because there're
> many intermediate state in this processing flow.
>
> Let's discuss this feature and the probable solutions.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)