[
https://issues.apache.org/jira/browse/KYLIN-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795303#comment-17795303
]
zhong.zhu commented on KYLIN-5743:
----------------------------------
h1.Root Cause
With the _*kylin.query.convert-sum-expression-enabled=true*_ conversion switch
turned on, the original SQL generates an execution plan as follows
{code:shell}
KapOLAPToEnumerableConverter
KapLimitRel(ctx=[], fetch=[500])
KapAggregateRel(group-set=[[]], groups=[null], EXPR$0=[SUM($0)], ctx=[])
KapProjectRel($f0=[$1], ctx=[])
KapJoinRel(condition=[=($0, $2)], joinType=[inner], ctx=[])
KapProjectRel(LO_COMMITDATE=[$15], CASE=[CASE(=($15,
CAST('20230501'):DATE NOT NULL), $11, null)], ctx=[])
KapTableScan(table=[[SSB, LINEORDER]], ctx=[], fields=[[0, 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
KapProjectRel(EXPR$0=[$0], ctx=[])
KapAggregateRel(group-set=[[]], groups=[null], EXPR$0=[MAX($0)],
ctx=[])
KapProjectRel(LO_COMMITDATE=[$15], ctx=[])
KapTableScan(table=[[SSB, LINEORDER]], ctx=[], fields=[[0, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
{code}
It can be seen that case when is pushed over TableScan (in combination with
another ProjectMergeRule to the above result), which makes SumExpressionRule
not work. It is possible that this is unstable, and it is also possible to get
a different execution plan
{code:shell}
KapOLAPToEnumerableConverter
KapLimitRel(ctx=[], fetch=[500])
KapAggregateRel(group-set=[[]], groups=[null], AGG$0=[SUM($0)], ctx=[])
KapProjectRel($f0=[CASE(=($0, CAST('20230501'):DATE NOT NULL), $1,
null)], ctx=[])
KapAggregateRel(group-set=[[0]], groups=[null], TOP_AGG$0=[SUM($1)],
TOP_AGG$1=[SUM($2)], ctx=[])
KapProjectRel(LO_COMMITDATE=[$0], SUM_CASE$0$0=[$1], $f2=[*(0, $2)],
ctx=[])
KapAggregateRel(group-set=[[0]], groups=[null],
SUM_CASE$0$0=[SUM($1)], SUM_CONST$1=[COUNT()], ctx=[])
KapProjectRel(LO_COMMITDATE=[$0], LO_DISCOUNT=[$1], ctx=[])
KapProjectRel(LO_COMMITDATE=[$0], LO_DISCOUNT=[$1],
LO_ORDERDATE=[$2], ctx=[])
KapJoinRel(condition=[=($0, $3)], joinType=[inner], ctx=[])
KapProjectRel(LO_COMMITDATE=[$15], LO_DISCOUNT=[$11],
LO_ORDERDATE=[$5], ctx=[])
KapTableScan(table=[[SSB, LINEORDER]], ctx=[],
fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
KapAggregateRel(group-set=[[]], groups=[null],
EXPR$0=[MAX($0)], ctx=[])
KapProjectRel(LO_COMMITDATE=[$15], ctx=[])
KapTableScan(table=[[SSB, LINEORDER]], ctx=[],
fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
{code}
h1.Fix Design
With *_kylin.query.convert-sum-expression-enabled=true_*, then skipping the
_*KapProjectJoinTransposeRule*_ then ensures that the second execution plan
above is stabilized, so that it can hit two aggregated indexes instead of one
aggregated and one detailed.
> Set kylin.query.convert-sum-expression-enabled=true, fail to completely hit
> the aggregate index when the query contains sum (case when) expressions
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-5743
> URL: https://issues.apache.org/jira/browse/KYLIN-5743
> Project: Kylin
> Issue Type: Bug
> Affects Versions: 5.0-beta
> Reporter: zhong.zhu
> Assignee: zhong.zhu
> Priority: Major
> Fix For: 5.0.0
>
>
> {code:sql}
> select
> sum(
> case
> when LO_COMMITDATE = '20230501' then LO_DISCOUNT
> end
> )
> from
> (
> select
> LO_COMMITDATE,
> LO_DISCOUNT,
> LINEORDER.LO_ORDERDATE
> from
> ssb.LINEORDER
> ) a
> where
> LO_COMMITDATE = (
> select
> max(LO_COMMITDATE)
> from
> ssb.LINEORDER
> )
> LIMIT
> 500
> {code}
> Fix the sum case when in this scenario so that it hits aggregated indexes
--
This message was sent by Atlassian Jira
(v8.20.10#820010)