[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhong Yanghong updated KYLIN-2773: ---------------------------------- Description: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[50000]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]]) {code} However, what we expect is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[50000]) OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4]) {code} {color:#f79232} {code} OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]) {code} {color} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) {code} The reason for this difference is as follows: * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when creating a join in {{SqlToRelConverter}}. * In the method of {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join related columns are not the same, *cast* function will be automatically assigned to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID. Then a {{OLAPProjectRel}} will be introduced. In kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution for this is just remove that logic. was: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[50000]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]]) {code} However, what we expect is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[50000]) OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4]) {code} {color:#f79232} {code} OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) {code} The reason for this difference is as follows: * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when creating a join in {{SqlToRelConverter}}. * In the method of {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join related columns are not the same, *cast* function will be automatically assigned to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID. Then a {{OLAPProjectRel}} will be introduced. In kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution for this is just remove that logic. > Should not push down join condition related columns are compatible while not > consistent > --------------------------------------------------------------------------------------- > > Key: KYLIN-2773 > URL: https://issues.apache.org/jira/browse/KYLIN-2773 > Project: Kylin > Issue Type: Bug > Components: Query Engine > Reporter: Zhong Yanghong > Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2773.patch > > > For sql, > {code} > select PART_DT, META_CATEG_NAME, sum(price) > from KYLIN_SALES > INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = > KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID > AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID > INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT > group by PART_DT, META_CATEG_NAME > order by PART_DT, META_CATEG_NAME > {code} > the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of > KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. > Then the plan transformed by kylin is as follows: > {code} > OLAPToEnumerableConverter > OLAPLimitRel(fetch=[50000]) > OLAPSortRel(sort0=[$0], dir0=[ASC]) > OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) > OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], > PRICE=[$t5]) > OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) > {code} > {color:#f79232} > {code} > OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], > LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], > LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], > MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], > USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], > UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], > META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) > OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], > joinType=[inner]) > OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, > 2, 3, 4, 5, 6, 7, 8, 9]]) > OLAPProjectRel(expr#0..8=[{inputs}], > expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], > USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], > LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], > CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) > {code} > {color} > {code} > OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], > fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) > OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, > 3]]) > {code} > However, what we expect is as follows: > {code} > OLAPToEnumerableConverter > OLAPLimitRel(fetch=[50000]) > OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) > OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) > OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], > META_CATEG_NAME=[$t17], PRICE=[$t4]) > {code} > {color:#f79232} > {code} > OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) > OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], > joinType=[inner]) > OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, > 3, 4, 5, 6, 7, 8, 9, 10]]) > {code} > {color} > OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], > fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) > {code} > OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) > {code} > The reason for this difference is as follows: > * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, > the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when > creating a join in {{SqlToRelConverter}}. > * In the method of {{RelOptUtil.pushDownJoinConditions()}}, since the > datatypes of the join related columns are not the same, *cast* function will > be automatically assigned to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID. Then a > {{OLAPProjectRel}} will be introduced. > In kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, > the solution for this is just remove that logic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)