[ https://issues.apache.org/jira/browse/CALCITE-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637095#comment-17637095 ]
Stamatis Zampetakis commented on CALCITE-5284:
----------------------------------------------
[~kramerul], [~libenchao], the multipliers are usually there to "trick" the
Volcano planner into choosing the JDBC expression over the
Enumerable/Bindable one. This becomes more evident if you check the history of
the respective changes or the code itself.
Examples:
* For JdbcProject, check https://issues.apache.org/jira/browse/CALCITE-557
* For JdbcSort, check
https://github.com/apache/calcite/pull/1840/files#r389205194
* For JdbcTableModify, check
https://issues.apache.org/jira/browse/CALCITE-1527?focusedCommentId=15721810&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15721810
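To illustrate the idea behind such multipliers, here is a minimal, self-contained sketch. It does not use Calcite classes; the class name, the method names, and the value 0.8 are assumptions chosen for illustration only. Two physical alternatives share the same base cost, and the JDBC one is scaled down so that a cost-based choice prefers it.

```java
public class CostMultiplierSketch {
  // Hypothetical discount applied to the JDBC alternative (illustrative value).
  static final double JDBC_MULTIPLIER = 0.8;

  /** Cost of the in-process (Enumerable-style) operator: proportional to rows. */
  static double enumerableCost(double rows) {
    return rows;
  }

  /** Same base cost, scaled down so that the JDBC alternative looks cheaper. */
  static double jdbcCost(double rows) {
    return enumerableCost(rows) * JDBC_MULTIPLIER;
  }

  /** Picks the strictly cheaper alternative, as a cost-based planner would. */
  static String choose(double rows) {
    return jdbcCost(rows) < enumerableCost(rows) ? "JDBC" : "ENUMERABLE";
  }

  public static void main(String[] args) {
    // With any positive row estimate the discounted JDBC cost is lower.
    System.out.println(choose(100.0));
  }
}
```

The bias only holds while both alternatives start from the same base cost. If the two operators are given different row count estimates, the discount may no longer be enough to decide the comparison.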
When combining multiple conventions (e.g., Jdbc + Enumerable), executing
everything in one or the other is not always the best option (one recent
discussion around this can be found
[here|https://github.com/apache/calcite/pull/2620#discussion_r762877661]). In
other words, pushing everything to Jdbc may not be the best option, but under
Calcite's default cost model this is what people have been doing since the
beginning.
I suspect that the inefficient plan reported here results from the code in
[RelMdRowCount|https://github.com/apache/calcite/blob/5a9cd9f259c4dcc34a1bc1291dfcd188b3128156/core/src/main/java/org/apache/calcite/rel/metadata/RelMdRowCount.java#L158],
which probably makes {{EnumerableLimit}} cheaper than {{JdbcSort}}.
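As a rough sketch of how an offset can change a row count estimate, assume a limit operator is estimated as the input rows minus the offset, capped by the fetch. This formula and the names below are assumptions for illustration, not a copy of the linked Calcite code.

```java
public class LimitRowCountSketch {
  /**
   * Assumed estimate for a limit operator:
   * min(fetch, max(inputRows - offset, 0)).
   */
  static double limitRowCount(double inputRows, int offset, int fetch) {
    double remaining = Math.max(inputRows - offset, 0d);
    return Math.min(remaining, fetch);
  }

  public static void main(String[] args) {
    // Hypothetical input estimate of 100 rows.
    System.out.println(limitRowCount(100d, 0, 10));   // fetch caps the estimate
    System.out.println(limitRowCount(100d, 95, 10));  // offset leaves fewer rows
    System.out.println(limitRowCount(100d, 200, 10)); // offset beyond the input
  }
}
```

If only one of two competing operators has its estimate reduced this way, the cost comparison between them can flip once the offset crosses some threshold, regardless of any multiplier.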
> JDBC rules create inefficient plan
> ----------------------------------
>
> Key: CALCITE-5284
> URL: https://issues.apache.org/jira/browse/CALCITE-5284
> Project: Calcite
> Issue Type: Bug
> Environment: Calcite 1.31.1. on Mac
> Reporter: Ulrich Kramer
> Priority: Major
>
> The following unit test for {{JdbcAdapterTest}} fails:
> {code:java}
> @Test void testOffset() {
>   CalciteAssert.model(FoodmartSchema.FOODMART_MODEL)
>       .query("select * from \"sales_fact_1997\" limit 10 offset 20")
>       .explainContains("PLAN=JdbcToEnumerableConverter\n" +
>           "  JdbcSort(offset=[20], fetch=[10])\n" +
>           "    JdbcTableScan(table=[[foodmart, sales_fact_1997]])")
>       .runs();
> }
> {code}
> For an offset less than 13, an efficient plan is created. With an offset
> above 13, the plan looks like this:
> {code}
> EnumerableLimit(offset=[20], fetch=[10])
>   JdbcToEnumerableConverter
>     JdbcTableScan(table=[[foodmart, sales_fact_1997]])
> {code}
> which can lead to enormous latency times, since the entire table is loaded
> via JDBC.