[ 
https://issues.apache.org/jira/browse/KYLIN-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wang closed KYLIN-5203.
-----------------------
    Resolution: Won't Do

> From Kylin or Hive, the same query Sql, but the results are inconsistent
> ------------------------------------------------------------------------
>
>                 Key: KYLIN-5203
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5203
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v3.1.2
>            Reporter: wang
>            Priority: Blocker
>
> SQL(SUM, COUNT):
> SELECT 
>     SUM(t1.a1),
>     COUNT(1)
> FROM
>     T1 JOIN T2 ON...
>     JOIN T3 ON...
>     JOIN T4 ON...
>     ...
>     JOIN T9 ON...
> WHERE
>     T1.c1 = '10000'
>     AND T1.date between '2022-06-11' and '2022-06-21'
>     AND {color:#ff0000}T9.b_type IN ('7', '11', '12');{color}
> Result:
> || ||sum||count||
> |Hive|2134980.9451|36330|
> |Kylin|1135892.3346|19765|
> h3. If the T9 filter is removed:
> SELECT 
>     SUM(t1.a1),
>     COUNT(1)
> FROM
>     T1 JOIN T2 ON...
>     JOIN T3 ON...
>     JOIN T4 ON...
>     ...
>     JOIN T9 ON...
> WHERE
>     T1.c1 = '10000'
>     AND T1.date between '2022-06-11' and '2022-06-21';
> Result:
> || ||sum||count||
> |Hive|3184089.5551|65333|
> |Kylin|3184089.5551|65333|
> In theory, Hive and Kylin should return the same results. Without the filter 
> on table T9 the results match, but once the filter is added, part of the 
> Kylin result is lost.
> env:
>     Hive:
>     There are nine tables in total. The main table (the fact table) is a 
> partitioned table. Of the other eight, two are large tables with tens of 
> millions of rows and the rest are dimension tables; they are bucketed tables.
>     Kylin:
>     Create Intermediate Flat Hive Table
>     Redistribute Flat Hive Table
>     Extract Fact Table Distinct Columns(Map Input)
>     Segment: 
>         Source Count: ???
>     According to the log, the data counts are the same.
>  
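
One way to narrow down where the rows go missing is to run the filtered query on both engines with a `GROUP BY T9.b_type` added and compare the per-value aggregates. The report does not do this; the sketch below only shows the comparison step. The function name `diff_breakdowns` and the sample values are hypothetical and are not taken from the issue.

```python
from decimal import Decimal


def diff_breakdowns(hive, kylin):
    """Compare per-group aggregates from two engines.

    Both arguments map a group key (e.g. a b_type value) to a
    (sum, count) tuple. Returns {key: (hive_value, kylin_value)} for
    every key whose values differ; a key that is absent on one side
    is reported with None for that side.
    """
    diffs = {}
    for key in sorted(set(hive) | set(kylin)):
        h, k = hive.get(key), kylin.get(key)
        if h != k:
            diffs[key] = (h, k)
    return diffs


# Hypothetical per-b_type results, for illustration only.
hive_rows = {
    "7": (Decimal("10.5"), 3),
    "11": (Decimal("4.0"), 2),
    "12": (Decimal("1.25"), 1),
}
kylin_rows = {
    "7": (Decimal("10.5"), 3),
    "12": (Decimal("1.25"), 1),
}

# Lists only the b_type groups on which the two engines disagree.
print(diff_breakdowns(hive_rows, kylin_rows))
```

If the mismatch is confined to particular `b_type` values, that would point at how those values are stored or matched in the cube rather than at the joins themselves. This is only a way to localize the difference, not a diagnosis.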



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
