[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126992#comment-16126992 ]
liyang commented on KYLIN-2773: ------------------------------- pending on me... > Should not push down join condition related columns are compatible while not > consistent > --------------------------------------------------------------------------------------- > > Key: KYLIN-2773 > URL: https://issues.apache.org/jira/browse/KYLIN-2773 > Project: Kylin > Issue Type: Bug > Components: Query Engine > Reporter: Zhong Yanghong > Assignee: Zhong Yanghong > Fix For: v2.2.0 > > Attachments: APACHE-KYLIN-2773.patch > > > For sql, > {code} > select PART_DT, META_CATEG_NAME, sum(price) > from KYLIN_SALES > INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = > KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID > AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID > INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT > group by PART_DT, META_CATEG_NAME > order by PART_DT, META_CATEG_NAME > {code} > the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of > KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. > Then the plan transformed by kylin is as follows: > {code} > OLAPToEnumerableConverter > OLAPLimitRel(fetch=[50000]) > OLAPSortRel(sort0=[$0], dir0=[ASC]) > OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) > OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], > PRICE=[$t5]) > OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) > {code} > {color:#f79232} > {code} > OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], > LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], > LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], > MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], > USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], > UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], > META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) > OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], > joinType=[inner]) > OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, > 2, 3, 4, 5, 6, 7, 8, 9]]) > OLAPProjectRel(expr#0..8=[{inputs}], > expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], > USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], > LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], > CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) > {code} > {color} > {code} > OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], > fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) > OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, > 3]]) > {code} > However, what we expect is as follows: > {code} > OLAPToEnumerableConverter > OLAPLimitRel(fetch=[50000]) > OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) > OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) > OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], > META_CATEG_NAME=[$t17], PRICE=[$t4]) > {code} > {color:#f79232} > {code} > OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) > OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], > joinType=[inner]) > OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, > 3, 4, 5, 6, 7, 8, 9, 10]]) > {code} > {color} > {code} > OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], > fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) > OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) > {code} > The reason for this difference is as follows: > * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, > the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when > creating a join in {{SqlToRelConverter}}. > * In the method of {{RelOptUtil.pushDownJoinConditions()}}, since the > datatypes of the join related columns are not the same, *cast* function will > be automatically assigned to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID. Then a > {{OLAPProjectRel}} will be introduced. > In kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, > the solution for this is just remove that logic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)