[
https://issues.apache.org/jira/browse/KYLIN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656295#comment-16656295
]
Shaofeng SHI edited comment on KYLIN-3634 at 10/19/18 5:36 AM:
---------------------------------------------------------------
Hi Bo, thanks for the patch. Do you mean that, "NULL" should not be matched as
" <> 'abc'" ?
I couldn't understand why the incorrect result is 0 row today; It should be the
sum of all columns. Please clarify.
was (Author: shaofengshi):
Hi Bo, thanks for the patch. Do you mean that, "NULL" should not be matched as
" <> 'abc'" ?
> when filter column has null value may cause incorrect query result
> ------------------------------------------------------------------
>
> Key: KYLIN-3634
> URL: https://issues.apache.org/jira/browse/KYLIN-3634
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.0.0
> Reporter: WangBo
> Priority: Major
> Attachments:
> 0001-KYLIN-3634-when-filter-column-has-null-value-may-cau.patch
>
>
> h1. Question
> when a column has null value,and using it as a filter column when querying,
> and the filter value is not exist in the table,this may cause incorrect result
> h1. An Example
> h2. Table A
> the table A has three rows,city column of one row has null value
>
> ||day||...||city||price||
> |20180101| |null|10|
> |20180101| |beijing|20|
> |20180101| |shanghai|10|
> h2. Query SQL
> select day,sum(price) from a where city <> 'abc' group by day
> h2. Correct Result
> exclude the row contains null city value
> ||day||col||
> |20180101|30|
> h2. InCorrect Result
> resullt 0 rows
> this happens in our production environment,the kylin version is 2.0.0
> h1. Analysis process
> 1,city column dosen't have a value,so the CompareTupleFilter will turn into
> ConstantTupleFilter(see GTUtil.java)
> 2,if dimensions in the sql dosen't match all the columns using in group
> by,the bytesComparator used in hbase aggregation map will only compare the
> columns using in group by
> 3,when GTAggregateScanner constructs key of aggBufMap,the key may contains
> null value,because the comparator of aggBufMap only compares group by
> columns,so the tuple share same group by columns may also share the same keys
> which contains null value;This may cause kylin server receives tuples
> contains null value;
> 4,when the code which dynamically generated by calcilte deals tuples using
> filter,it first judges whether the column is null.Because filter column in
> the tuple contains null value,so it always return false, no tuples will
> return.
> h1. Solution
> when the filter column value is a invalid means not in the table,turn the
> CompareTupleFiter into IS_NOT_NULL filter,instead of ConstantTupleFilter.TURE
>
> Now I have test the feature in our production environment ;
> test in “mvn test” had passed,but not test in sandbox
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)