[
https://issues.apache.org/jira/browse/KYLIN-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934404#comment-16934404
]
wangrupeng commented on KYLIN-3392:
-----------------------------------
This PR can cause another serious problem. Here are the details of what I
ran into.
When I run a query with an aggregate function such as count, sum, max, or
min, and the column values are all NULL or the rows selected by the where
condition contain NULL, the query thread hangs, and after about one minute it
throws an exception.
To reproduce this problem, I created a new Hive table with the same structure
as KYLIN_SALES and 4 rows. The column that contains two NULL values is
PRICE:
1 2013/10/19 FP-non 37831 0 13 30 3 10000209 10003717 ANALYST Beijing
2 2012/10/22 Others 140746 100 11 70 20 10000154 10006076 ADMIN Shanghai
3 2013/10/19 FP-non 37831 0 13 NULL 3 10000209 10003717 ANALYST Beijing
4 2012/10/22 Others 140746 100 11 NULL NULL 10000154 10006076 ADMIN Shanghai
I then built a model and a cube using this table, named KYLIN_SALES_3392, as
the fact table with no lookup table, and compared a Kylin build that has this
PR merged with one that does not.
With this PR merged, as I said, when the rows selected by the where condition
contain NULL, the query throws an exception:
select min(price) from kylin_sales_3392 where trans_id>=1 and trans_id<=4
!KYLIN-3392.png|width=654,height=296!
If the rows selected by the where condition do not contain NULL, the query
works properly:
select min(price) from kylin_sales_3392 where trans_id>=1 and trans_id<=2
!KYLIN-3392-2.png|width=680,height=256!
And below is the result from the original Kylin version:
select min(price) from kylin_sales_3392 where trans_id>=1 and trans_id<=4
!kylin-3.0.0-alpha2.png|width=668,height=215!
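For comparison, here is a minimal sketch of what an engine that skips NULL
inputs in aggregates would return for these queries. It uses SQLite as a
stand-in, not Kylin or Hive, and models only two columns of the four-row table
above. It assumes the seventh field of each row is PRICE, and the helper name
run_repro_queries is hypothetical.

```python
import sqlite3


def run_repro_queries():
    """Run the repro queries against an in-memory SQLite stand-in table."""
    conn = sqlite3.connect(":memory:")
    # Only trans_id and price are modeled; the other columns are omitted.
    conn.execute("CREATE TABLE kylin_sales_3392 (trans_id INTEGER, price REAL)")
    # Rows 3 and 4 carry NULL in price, as in the table above.
    rows = [(1, 30), (2, 70), (3, None), (4, None)]
    conn.executemany("INSERT INTO kylin_sales_3392 VALUES (?, ?)", rows)

    queries = {
        "min_1_to_4": "select min(price) from kylin_sales_3392 "
                      "where trans_id>=1 and trans_id<=4",
        "min_1_to_2": "select min(price) from kylin_sales_3392 "
                      "where trans_id>=1 and trans_id<=2",
        # Not in the original report: a range whose prices are all NULL.
        "min_3_to_4": "select min(price) from kylin_sales_3392 "
                      "where trans_id>=3 and trans_id<=4",
    }
    results = {name: conn.execute(sql).fetchone()[0]
               for name, sql in queries.items()}
    conn.close()
    return results


if __name__ == "__main__":
    for name, value in run_repro_queries().items():
        print(name, value)
```

In SQLite, NULL inputs are ignored by min, so the first two queries return
30.0 and the all-NULL range returns NULL (None in Python).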
> Support NULL value in Sum, Max, Min Aggregation
> -----------------------------------------------
>
> Key: KYLIN-3392
> URL: https://issues.apache.org/jira/browse/KYLIN-3392
> Project: Kylin
> Issue Type: Bug
> Reporter: Yifei Wu
> Assignee: Yifei Wu
> Priority: Major
> Fix For: v3.0.0-beta, v2.6.4
>
> Attachments: KYLIN-3392-2.png, KYLIN-3392.png, kylin-3.0.0-alpha2.png
>
>
> It is treated as 0 when confronted with NULL value in KYLIN's basic aggregate
> measure (like sum, max, min). However, to distinguish the NULL value with 0
> is very necessary.
> It should be like this
> *sum(null, null) = null*
> *sum(null, 1) = 1*
> *max(null, null) = null*
> *max(null, -1) = -1*
> *min(null, -1)= -1*
> in accordance with Hive and SparkSQL
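
The rules listed in the issue description can be sketched as a null-aware
merge step: NULL acts as "no value yet", and the result is NULL only when
every input is NULL. This is an illustration in Python, not Kylin's
implementation, and all names in it (null_aware, aggregate, null_sum,
null_max, null_min) are hypothetical.

```python
from functools import reduce


def null_aware(combine):
    """Wrap a two-argument combiner so that None (NULL) inputs are skipped."""
    def merge(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return combine(a, b)
    return merge


def aggregate(values, combine):
    """Fold values with a null-aware combiner; all-NULL input yields None."""
    return reduce(null_aware(combine), values, None)


def null_sum(values):
    return aggregate(values, lambda a, b: a + b)


def null_max(values):
    return aggregate(values, max)


def null_min(values):
    return aggregate(values, min)


if __name__ == "__main__":
    print(null_sum([None, None]))
    print(null_sum([None, 1]))
    print(null_max([None, -1]))
    print(null_min([None, -1]))
```

Starting the fold from None rather than 0 is what keeps an all-NULL column
distinguishable from a column that sums to zero.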