[ https://issues.apache.org/jira/browse/HIVE-24712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liuyan updated HIVE-24712:
--------------------------
Description:
When both hive.map.aggr and hive.optimize.reducededuplication are set to false, the result appears to be incorrect: the query returns only 35 rows.
This was tested on HDP 3.1.5.
----------------------------------------------------------------------------------------------
VERTICES          MODE  STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  llap  SUCCEEDED     33         33        0        0       0       0
Reducer 2 ......  llap  SUCCEEDED      4          4        0        0       0       0
Reducer 3 ......  llap  SUCCEEDED      4          4        0        0       0       0
Reducer 4 ......  llap  SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 04/04 [==========================>>] 100% ELAPSED TIME: 38.23 s
----------------------------------------------------------------------------------------------
INFO :
INFO : Task Execution Summary
INFO :
----------------------------------------------------------------------------------------------
INFO : VERTICES   DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
INFO :
----------------------------------------------------------------------------------------------
INFO : Map 1          38097.00             0            0    143,997,065          57,447
INFO : Reducer 2       9003.00             0            0         57,447          13,108
INFO : Reducer 3          0.00             0            0         13,108              35
INFO : Reducer 4          0.00             0            0             35               0
INFO :
----------------------------------------------------------------------------------------------
INFO :
INFO : LLAP IO Summary
set hive.map.aggr=true;
set hive.optimize.reducededuplication=false;
select cs_sold_date_sk,count(distinct cs_order_number) from
tpcds_orc.catalog_sales_orc group by cs_sold_date_sk order by cs_sold_date_sk
limit 200;
----------------------------------------------------------------------------------------------
VERTICES          MODE  STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  llap  SUCCEEDED     33         33        0        0       0       0
Reducer 2 ......  llap  SUCCEEDED      4          4        0        0       0       0
Reducer 3 ......  llap  SUCCEEDED      2          2        0        0       0       0
Reducer 4 ......  llap  SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 04/04 [==========================>>] 100% ELAPSED TIME: 36.24 s
----------------------------------------------------------------------------------------------
INFO :
----------------------------------------------------------------------------------------------
INFO : VERTICES   DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
INFO :
----------------------------------------------------------------------------------------------
INFO : Map 1          25595.00             0            0    143,997,065      16,703,757
INFO : Reducer 2      18556.00             0            0     16,703,757             800
INFO : Reducer 3       8018.00             0            0            800             200
INFO : Reducer 4          0.00             0            0            200               0
INFO :
----------------------------------------------------------------------------------------------
INFO :
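
The record counts in the two Task Execution Summary tables above can be cross-checked with a short script. This is only a sketch of the arithmetic, not part of the original report. It assumes that both runs read the same table (Map 1 reads 143,997,065 records in each) and that the INPUT_RECORDS of the last vertex equals the number of rows returned, which matches the 35 rows reported. The names RUN_MAP_AGGR_FALSE, RUN_MAP_AGGR_TRUE, stages_chain and final_rows are made up for illustration.

```python
# Record counts copied from the two Task Execution Summary tables.
# Each entry is (vertex, input_records, output_records).
RUN_MAP_AGGR_FALSE = [
    ("Map 1", 143_997_065, 57_447),
    ("Reducer 2", 57_447, 13_108),
    ("Reducer 3", 13_108, 35),
    ("Reducer 4", 35, 0),
]
RUN_MAP_AGGR_TRUE = [
    ("Map 1", 143_997_065, 16_703_757),
    ("Reducer 2", 16_703_757, 800),
    ("Reducer 3", 800, 200),
    ("Reducer 4", 200, 0),
]
LIMIT = 200  # from "limit 200" in the query


def stages_chain(run):
    """True if every vertex reads exactly what the previous vertex wrote."""
    return all(prev[2] == cur[1] for prev, cur in zip(run, run[1:]))


def final_rows(run):
    """Records handed to the last vertex (assumed to be the rows returned)."""
    return run[-1][1]


if __name__ == "__main__":
    for name, run in [("hive.map.aggr=false", RUN_MAP_AGGR_FALSE),
                      ("hive.map.aggr=true", RUN_MAP_AGGR_TRUE)]:
        print(name, "consistent:", stages_chain(run), "rows:", final_rows(run))
    # The map.aggr=true run fills the limit, so the table has at least LIMIT
    # distinct cs_sold_date_sk values; a correct map.aggr=false run over the
    # same data should therefore also return LIMIT rows.
    missing = final_rows(RUN_MAP_AGGR_TRUE) - final_rows(RUN_MAP_AGGR_FALSE)
    print("rows missing with hive.map.aggr=false:", missing)
```

Both runs are internally consistent stage to stage, so the shortfall is not a reporting glitch in one vertex: the run with hive.map.aggr=false already emits far fewer records from Map 1 and ends with fewer rows than the limit allows.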
was:
When Both param set to false , seems the result is not correct, only 35 rows.
This is tested on HDP 3.1.5
set hive.map.aggr=false;
set hive.optimize.reducededuplication=false;
select cs_sold_date_sk,count(distinct cs_order_number) from
tpcds_orc.catalog_sales_orc group by cs_sold_date_sk order by cs_sold_date_sk
limit 200;
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... llap SUCCEEDED 33 33 0 0 0 0
Reducer 2 ...... llap SUCCEEDED 4 4 0 0 0 0
Reducer 3 ...... llap SUCCEEDED 4 4 0 0 0 0
Reducer 4 ...... llap SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 04/04 [==========================>>] 100% ELAPSED TIME: 38.23 s
----------------------------------------------------------------------------------------------
FO :
INFO : Task Execution Summary
INFO :
----------------------------------------------------------------------------------------------
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO :
----------------------------------------------------------------------------------------------
INFO : Map 1 38097.00 0 0 143,997,065 57,447
INFO : Reducer 2 9003.00 0 0 57,447 13,108
INFO : Reducer 3 0.00 0 0 13,108 35
INFO : Reducer 4 0.00 0 0 35 0
INFO :
----------------------------------------------------------------------------------------------
INFO :
INFO : LLAP IO Summary
set hive.map.aggr=true;
set hive.optimize.reducededuplication=false;
select cs_sold_date_sk,count(distinct cs_order_number) from
tpcds_orc.catalog_sales_orc group by cs_sold_date_sk order by cs_sold_date_sk
limit 200;
----------------------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... llap SUCCEEDED 33 33 0 0 0 0
Reducer 2 ...... llap SUCCEEDED 4 4 0 0 0 0
Reducer 3 ...... llap SUCCEEDED 2 2 0 0 0 0
Reducer 4 ...... llap SUCCEEDED 1 1 0 0 0 0
----------------------------------------------------------------------------------------------
VERTICES: 04/04 [==========================>>] 100% ELAPSED TIME: 36.24 s
----------------------------------------------------------------------------------------------
INFO :
----------------------------------------------------------------------------------------------
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO :
----------------------------------------------------------------------------------------------
INFO : Map 1 25595.00 0 0 143,997,065 16,703,757
INFO : Reducer 2 18556.00 0 0 16,703,757 800
INFO : Reducer 3 8018.00 0 0 800 200
INFO : Reducer 4 0.00 0 0 200 0
INFO :
----------------------------------------------------------------------------------------------
INFO :
> hive.map.aggr=false and hive.optimize.reducededuplication=false provide
> incorrect result on order by with limit
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-24712
> URL: https://issues.apache.org/jira/browse/HIVE-24712
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Affects Versions: 3.1.0
> Reporter: liuyan
> Priority: Critical
--
This message was sent by Atlassian Jira
(v8.3.4#803005)