[
https://issues.apache.org/jira/browse/CALCITE-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809638#comment-17809638
]
Julian Hyde commented on CALCITE-6216:
--------------------------------------
If a join involves two tables in different databases, and if the databases
cannot talk to each other, what possible plan could "push down" the join?
> The JDBC adapter is unable to push down queries when there is a join involved
> in the query.
> -------------------------------------------------------------------------------------------
>
> Key: CALCITE-6216
> URL: https://issues.apache.org/jira/browse/CALCITE-6216
> Project: Calcite
> Issue Type: Improvement
> Components: jdbc-adapter
> Reporter: xjlem
> Priority: Major
>
> I want to know why the description of JdbcToEnumerableConverterRule in the
> JDBC rule includes "JdbcConvention out" information while others do not,
> which leads to different data sources sharing the same JDBC rule.
> for example:
> config:
> {code:java}
> {
> "defaultSchema": "db1",
> "schemas": [
> {
> "factory": "org.apache.calcite.adapter.jdbc.JdbcSchema$Factory",
> "name": "db1",
> "operand": {
> "jdbcDriver": "com.mysql.jdbc.Driver",
> "jdbcPassword": "",
> "jdbcUrl": "",
> "jdbcUser": ""
> },
> "type": "custom"
> },
> {
> "factory": "org.apache.calcite.adapter.jdbc.JdbcSchema$Factory",
> "name": "db2",
> "operand": {
> "jdbcDriver": "com.mysql.jdbc.Driver",
> "jdbcPassword": "",
> "jdbcUrl": "",
> "jdbcUser": ""
> },
> "type": "custom"
> }
> ],
> "version": "1.0"
> } {code}
> sql:
> {code:java}
> SELECT pv / sales, pv, sales
> FROM (
> SELECT sku_id, COUNT(*) AS sales
> FROM db1.sku_sales
> WHERE SKU_ID = 1
> GROUP BY sku_id
> ) sku_sales
> FULL JOIN (
> SELECT sku_id, COUNT(pv) AS pv
> FROM db2.sku_flow
> WHERE SKU_ID = 1
> GROUP BY sku_id
> ) sku_flow
> ON sku_sales.sku_id = sku_flow.sku_id{code}
> it will convert to logic relNode:
> {code:java}
> LogicalProject(EXPR$0=[/($3, $1)], pv=[$3], sales=[$1])
> LogicalJoin(condition=[=($0, $2)], joinType=[full])
> LogicalAggregate(group=[{0}], sales=[COUNT()])
> LogicalFilter(condition=[=($0, 1)])
> LogicalProject(sku_id=[$2])
> JdbcTableScan(table=[[db1, sku_sales]])
> LogicalAggregate(group=[{0}], pv=[COUNT($1)])
> LogicalFilter(condition=[=($0, 1)])
> LogicalProject(sku_id=[$2], pv=[$3])
> JdbcTableScan(table=[[db2, sku_flow]]) {code}
> optimize:
> When registering the rule, the rule for the DB2 convention will not be
> registered because the jdbcRule has the same description as the rule for the
> DB1 convention, causing the addRule method to return false.
> org.apache.calcite.adapter.jdbc.JdbcConvention#register
> {code:java}
> @Override public void register(RelOptPlanner planner) {
> for (RelOptRule rule : JdbcRules.rules(this)) {
> planner.addRule(rule);
> }
> planner.addRule(FilterSetOpTransposeRule.INSTANCE);
> planner.addRule(ProjectRemoveRule.INSTANCE);
> } {code}
> after optimize:
> {code:java}
> EnumerableCalc(expr#0..3=[{inputs}], expr#4=[/($t3, $t1)], EXPR$0=[$t4],
> pv=[$t3], sales=[$t1])
> EnumerableJoin(condition=[=($0, $2)], joinType=[full])
> JdbcToEnumerableConverter
> JdbcAggregate(group=[{2}], sales=[COUNT()])
> JdbcFilter(condition=[=($2, 1)])
> JdbcTableScan(table=[[db1, sku_sales]])
> EnumerableAggregate(group=[{2}], pv=[COUNT($3)])
> EnumerableCalc(expr#0..4=[{inputs}], expr#5=[1], expr#6=[=($t2, $t5)],
> proj#0..4=[{exprs}], $condition=[$t6])
> JdbcToEnumerableConverter
> JdbcTableScan(table=[[db2, sku_flow]])
> {code}
> As a result, we can see that the push down for DB2 fails.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)