[jira] [Commented] (CALCITE-5284) JDBC rules create inefficient plan

Stamatis Zampetakis (Jira) Tue, 22 Nov 2022 01:04:04 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637095#comment-17637095
 ]


Stamatis Zampetakis commented on CALCITE-5284:
----------------------------------------------

[~kramerul] [~libenchao], the multipliers are usually there to "trick" the 
Volcano planner into choosing the JDBC expression over the 
Enumerable/Bindable one. This becomes more evident if you check the history 
of the respective changes or the code itself. 

Examples:
* For JdbcProject, check https://issues.apache.org/jira/browse/CALCITE-557
* For JdbcSort, check 
https://github.com/apache/calcite/pull/1840/files#r389205194
* For JdbcTableModify, check 
https://issues.apache.org/jira/browse/CALCITE-1527?focusedCommentId=15721810&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15721810
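
To make the "trick" concrete, here is a toy sketch in plain Java. It is not 
Calcite code: the class name, the method name, and the numbers are all 
invented for illustration. It only shows how scaling one alternative's cost 
by a factor below 1 lets a cheapest-cost comparison prefer it when the two 
alternatives would otherwise tie.

```java
/** Toy illustration only; none of these names exist in Calcite. */
public class MultiplierSketch {

  /**
   * Returns the convention a simple cheapest-cost comparison would pick.
   * The JDBC cost is scaled by a hypothetical multiplier before comparing;
   * on a tie the Enumerable alternative is kept.
   */
  static String cheaper(double enumerableCost, double jdbcBaseCost,
      double jdbcMultiplier) {
    double jdbcCost = jdbcBaseCost * jdbcMultiplier;
    return jdbcCost < enumerableCost ? "JDBC" : "ENUMERABLE";
  }

  public static void main(String[] args) {
    double base = 100.0; // assumed cost that both alternatives report

    // Without a multiplier the costs tie and the JDBC plan is not preferred.
    System.out.println(cheaper(base, base, 1.0));

    // With an assumed multiplier of 0.9 the JDBC plan becomes cheaper.
    System.out.println(cheaper(base, base, 0.9));
  }
}
```

The multiplier carries no information about real execution cost; it only 
breaks the tie in favor of one convention.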

When combining multiple conventions (e.g., Jdbc + Enumerable), executing 
everything in one or the other is not always the best option (one recent 
discussion around this can be found 
[here|https://github.com/apache/calcite/pull/2620#discussion_r762877661]). In 
other words, pushing everything to Jdbc may not be the best option, but under 
the default cost model of Calcite this is what people have been doing since 
the beginning.

I suspect that the inefficient plan reported here is a result of the code in 
[RelMdRowCount|https://github.com/apache/calcite/blob/5a9cd9f259c4dcc34a1bc1291dfcd188b3128156/core/src/main/java/org/apache/calcite/rel/metadata/RelMdRowCount.java#L158],
 which probably makes {{EnumerableLimit}} cheaper than {{JdbcSort}}.
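
As a rough illustration of the kind of arithmetic a row-count estimate for a 
limit with an offset can involve, the sketch below drops the offset from the 
input estimate and then caps the result at the fetch. It is a simplified 
stand-in written for this comment, not the linked RelMdRowCount code; the 
class name, the method name, and the input estimate of 100 rows are 
assumptions.

```java
/** Simplified stand-in; not the actual Calcite metadata handler. */
public class RowCountSketch {

  /**
   * Estimates the output rows of a limit: skip {@code offset} rows, then
   * keep at most {@code fetch} rows. A negative fetch means "no fetch
   * clause" in this sketch.
   */
  static double estimate(double inputRows, int offset, int fetch) {
    double remaining = Math.max(inputRows - offset, 0d);
    if (fetch >= 0 && fetch < remaining) {
      return fetch;
    }
    return remaining;
  }

  public static void main(String[] args) {
    double inputRows = 100d; // assumed input estimate, for illustration only

    // The fetch caps the estimate while enough rows remain after the offset.
    System.out.println(estimate(inputRows, 20, 10));

    // A large offset leaves fewer rows than the fetch asks for.
    System.out.println(estimate(inputRows, 95, 10));
  }
}
```

If estimates like this feed into operator costs, the offset value alone can 
change which alternative looks cheaper. Whether that explains the threshold 
reported in this issue would still need to be checked against the actual 
code.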

> JDBC rules create inefficient plan
> ----------------------------------
>
>                 Key: CALCITE-5284
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5284
>             Project: Calcite
>          Issue Type: Bug
>         Environment: Calcite 1.31.1. on Mac
>            Reporter: Ulrich Kramer
>            Priority: Major
>
> The following unit test for {{JdbcAdapterTest}} fails:
> {code:java}
>   @Test void testOffset() {
>     CalciteAssert.model(FoodmartSchema.FOODMART_MODEL)
>         .query("select * from \"sales_fact_1997\" limit 10 offset 20")
>         .explainContains("PLAN=JdbcToEnumerableConverter\n" +
>             "  JdbcSort(offset=[20], fetch=[10])\n" +
>             "    JdbcTableScan(table=[[foodmart, sales_fact_1997]])")
>         .runs();
>   }
> {code}
> For an offset less than 13, an efficient plan is created. With an offset 
> above 13, the plan looks like this:
> {code}
> EnumerableLimit(offset=[20], fetch=[10])
>   JdbcToEnumerableConverter
>     JdbcTableScan(table=[[foodmart, sales_fact_1997]])
> {code}
> which can lead to enormous latency times, since the entire table is loaded 
> via JDBC.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
