[
https://issues.apache.org/jira/browse/CALCITE-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841091#comment-16841091
]
Stamatis Zampetakis commented on CALCITE-3070:
----------------------------------------------
It is probably the EnumerableCalc
{noformat}
EnumerableCalc(expr#0..84=[\{inputs}], expr#85=[1997-02-12 00:00:00],
expr#86=[>=($t78, $t85)], expr#87=[1997-02-12 00:01:00], expr#88=[<($t78,
$t87)], expr#89=[AND($t86, $t88)], product_name=[$t5], customer_id=[$t22],
$f85=[$t86], $f86=[$t88], $condition=[$t89]): rowcount = 25.0, cumulative cost
= {135.0 rows, 9611.0 cpu, 0.0 io}, id = 338
{noformat}
operator in the second plan that is filtering more than necessary. Two possible
reasons may be the following:
* the JdbcToEnumerableConverter does not convert correctly the timestamp
columns;
* the comparison operators (>=, <) does not work correctly for timestamps (less
likely).
Other than that, as far as I can see from the cost of the plan both base tables
have row estimation of 100, which rather means that statistics are not
exploited so the optimal plan cannot be found in any case.
> Inner join between MySQL tables produce wrong results if join order changed
> ---------------------------------------------------------------------------
>
> Key: CALCITE-3070
> URL: https://issues.apache.org/jira/browse/CALCITE-3070
> Project: Calcite
> Issue Type: Bug
> Components: core, jdbc-adapter
> Affects Versions: 1.19.0
> Reporter: Zhibin Zhou
> Priority: Critical
> Attachments: log.txt
>
>
> Summary of issue:
> * We are trying to inner join 2 MySQL tables.
> ** persons table: a very small table
> ** foodmart table: a large table
> * The query produce different results if we change the join order in the
> "from" clause: "from A join B" vs "from B join A". One query produces correct
> results while the other query produces empty results.
> * The physical plan generated for the produce-empty query was not optimal.
> it tried to scan be big table.
>
> Working query and physical plan:
> {code:sql}
> select "t2"."product_name", "t2"."customer_id" from
> "foodmart-mysql"."foodmart" as "t2" join "persons"."persons" as "t1"
> on"t1"."person_id_int"="t2"."customer_id" and"t2"."timestamp" >=
> '1997-02-12 00:00:00' and "t2"."timestamp" < '1997-02-12 00:01:00'
> {code}
> {noformat}
> EnumerableCalc(expr#0..4=[\{inputs}], proj#0..1=[\{exprs}]): rowcount =
> 375.0, cumulative cost = {1287.971895621705 rows, 3719.5 cpu, 0.0 io}, id =
> 671
> EnumerableJoin(condition=[=($1, $4)], joinType=[inner]): rowcount = 375.0,
> cumulative cost = {912.971895621705 rows, 1094.5 cpu, 0.0 io}, id = 667
> JdbcToEnumerableConverter: rowcount = 25.0, cumulative cost = {147.5
> rows, 283.5 cpu, 0.0 io}, id = 660
> JdbcProject(product_name=[$5], customer_id=[$22], $f85=[>=($78,
> 1997-02-12 00:00:00)], $f86=[<($78, 1997-02-12 00:01:00)]): rowcount = 25.0,
> cumulative cost = {145.0 rows, 281.0 cpu, 0.0 io}, id = 658
> JdbcFilter(condition=[AND(>=($78, 1997-02-12 00:00:00), <($78,
> 1997-02-12 00:01:00))]): rowcount = 25.0, cumulative cost = {125.0 rows,
> 201.0 cpu, 0.0 io}, id = 656
> JdbcTableScan(table=[[foodmart-mysql, foodmart]]): rowcount =
> 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 340
> EnumerableCalc(expr#0..5=[\{inputs}], person_id_int=[$t5]): rowcount =
> 100.0, cumulative cost = {210.0 rows, 811.0 cpu, 0.0 io}, id = 673
> JdbcToEnumerableConverter: rowcount = 100.0, cumulative cost = {110.0
> rows, 111.0 cpu, 0.0 io}, id = 663
> JdbcTableScan(table=[[persons, persons]]): rowcount = 100.0,
> cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 341
> {noformat}
> Produce-empty query and physical plan:
> {code:sql}
> select "t2"."product_name", "t2"."customer_id" from
> "persons"."persons" as "t1" join "foodmart-mysql"."foodmart" as "t2"
> on "t1"."person_id_int"="t2"."customer_id" and"t2"."timestamp" >=
> '1997-02-12 00:00:00' and "t2"."timestamp" < '1997-02-12 00:01:00'
> {code}
> {noformat}
> EnumerableCalc(expr#0..4=[\{inputs}], proj#0..1=[\{exprs}]): rowcount =
> 375.0, cumulative cost = {1255.471895621705 rows, 12427.0 cpu, 0.0 io}, id =
> 334
> EnumerableJoin(condition=[=($1, $4)], joinType=[inner]): rowcount = 375.0,
> cumulative cost = {880.471895621705 rows, 9802.0 cpu, 0.0 io}, id = 328
> EnumerableCalc(expr#0..84=[\{inputs}], expr#85=[1997-02-12 00:00:00],
> expr#86=[>=($t78, $t85)], expr#87=[1997-02-12 00:01:00], expr#88=[<($t78,
> $t87)], expr#89=[AND($t86, $t88)], product_name=[$t5], customer_id=[$t22],
> $f85=[$t86], $f86=[$t88], $condition=[$t89]): rowcount = 25.0, cumulative
> cost = {135.0 rows, 9611.0 cpu, 0.0 io}, id = 338
> JdbcToEnumerableConverter: rowcount = 100.0, cumulative cost = {110.0
> rows, 111.0 cpu, 0.0 io}, id = 317
> JdbcTableScan(table=[[foodmart-mysql, foodmart]]): rowcount = 100.0,
> cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 1
> JdbcToEnumerableConverter: rowcount = 100.0, cumulative cost = {190.0
> rows, 191.0 cpu, 0.0 io}, id = 326
> JdbcProject(person_id_int=[$5]): rowcount = 100.0, cumulative cost =
> {180.0 rows, 181.0 cpu, 0.0 io}, id = 324
> JdbcTableScan(table=[[persons, persons]]): rowcount = 100.0,
> cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 0
> {noformat}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)