qingfa zhou created HIVE-22181:
----------------------------------

             Summary: Same query but different result occasionally
                 Key: HIVE-22181
                 URL: https://issues.apache.org/jira/browse/HIVE-22181
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.3.0
         Environment: hadoop:2.7.3.2.6.1.0-129

hive:2.3.0

tez:0.8.4

java:1.8.0_212 
            Reporter: qingfa zhou
         Attachments: WX20190910-123604.png, WX20190910-123621.png, 
WX20190910-123641.png

h3. 1.Phenomenon

    Running the same SQL(hive on tez) several times will result in inconsistent 
results . The following results are generated respectively

    1)

     OK

     3951864 808

    2)

     OK

     1822979 353

    But the first result is true.
h3. 2.SQL

select
 count(1),count(distinct store_code)
 from 
 (
 select
 store_code,
 all_pay_id,
 payable_price,
 row_number()over(partition by store_code order by payable_price desc) as rw,
 count(1)over(partition by store_code) as store_user
 from 
 (
 select 
 store_code,
 all_pay_id,
 sum(payable_price) as payable_price
 from data_promotion.mdw_user_promotion_shopping_behaivor_four_week_detail a
 where dt='20190904'
 group by store_code,all_pay_id
 ) a 
 order by rw 
 ) a 
 where rw/store_user<=0.8
 ;

 
h3. 3.Troubleshooting

    I lookup task result from tez web ui,give the result as follows:

           the first: !image-2019-09-10-12-16-45-385.png!

        the second:

!image-2019-09-10-12-18-08-769.png!

      so , The second RECORDS_OUT_INTERMEDIATE_Reducer_4's result is much lower 
than the first.

     This is detailed info of the second result from reducer_3 to reducer_4 

!image-2019-09-10-12-26-41-681.png!  

This is execution plan of the sql:

OK
Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
Reducer 6 <- Reducer 5 (SIMPLE_EDGE)

Stage-0
 Fetch Operator
 limit:-1
 Stage-1
 Reducer 6
 File Output Operator [FS_24]
 Group By Operator [GBY_22] (rows=1 width=128)
 Output:["_col0","_col1"],aggregations:["count(VALUE._col0)","count(DISTINCT 
KEY._col0:0._col0)"]
 <-Reducer 5 [SIMPLE_EDGE]
 SHUFFLE [RS_21]
 Group By Operator [GBY_20] (rows=5619870 width=2328)
 Output:["_col0","_col1","_col2"],aggregations:["count(1)","count(DISTINCT 
_col0)"],keys:_col0
 Select Operator [SEL_17] (rows=5619870 width=2328)
 Output:["_col0"]
 <-Reducer 4 [SIMPLE_EDGE]
 SHUFFLE [RS_16]
 Select Operator [SEL_14] (rows=5619870 width=2328)
 Output:["_col0","_col3"]
 Filter Operator [FIL_25] (rows=5619870 width=2328)
 predicate:((_col0 / count_window_1) <= 0.8)
 PTF Operator [PTF_13] (rows=16859610 width=2328)
 Function definitions:[{},\{"name:":"windowingtablefunction","order by:":"_col1 
ASC NULLS FIRST","partition by:":"_col1"}]
 Select Operator [SEL_12] (rows=16859610 width=2328)
 Output:["_col0","_col1"]
 <-Reducer 3 [SIMPLE_EDGE]
 SHUFFLE [RS_11]
 PartitionCols:_col0
 Select Operator [SEL_10] (rows=16859610 width=2328)
 Output:["_col0","row_number_window_0"]
 PTF Operator [PTF_9] (rows=16859610 width=2328)
 Function definitions:[{},\{"name:":"windowingtablefunction","order by:":"_col2 
DESC NULLS LAST","partition by:":"_col0"}]
 Select Operator [SEL_8] (rows=16859610 width=2328)
 Output:["_col0","_col2"]
 <-Reducer 2 [SIMPLE_EDGE]
 SHUFFLE [RS_7]
 PartitionCols:_col0
 Select Operator [SEL_6] (rows=16859610 width=2328)
 Output:["_col0","_col2"]
 Group By Operator [GBY_5] (rows=16859610 width=2328)
 
Output:["_col0","_col1","_col2"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0,
 KEY._col1
 <-Map 1 [SIMPLE_EDGE]
 SHUFFLE [RS_4]
 PartitionCols:_col0, _col1
 Group By Operator [GBY_3] (rows=33719220 width=2328)
 
Output:["_col0","_col1","_col2"],aggregations:["sum(payable_price)"],keys:store_code,
 all_pay_id
 Select Operator [SEL_2] (rows=33719220 width=2328)
 Output:["store_code","all_pay_id","payable_price"]
 TableScan [TS_0] (rows=33719220 width=2328)
 
data_promotion@mdw_user_promotion_shopping_behaivor_four_week_detail,a,Tbl:COMPLETE,Col:NONE,Output:["all_pay_id","store_code","payable_price"]

 

    Reducer 4 show PTF , so I don't know if this result has anything to do with 
PTF? Can you help me.

 

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

Reply via email to