kuqiqi created HIVE-24078:
-----------------------------

             Summary: result rows not equal in mr and tez
                 Key: HIVE-24078
                 URL: https://issues.apache.org/jira/browse/HIVE-24078
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2, Tez
    Affects Versions: 3.1.2
            Reporter: kuqiqi


SELECT
  rank_num,
  province_name,
  programset_id,
  programset_name,
  programset_type,
  cv,
  uv,
  pt,
  rank_num2,
  rank_num3,
  city_name,
  level,
  cp_code,
  cp_name,
  version_type,
  zz.city_code,
  zz.province_alias,
  '20200815' dt
FROM (
  SELECT
    row_number() OVER (PARTITION BY a1.province_alias, a1.city_code, a1.version_type
                       ORDER BY cast(a1.cv AS bigint) DESC) AS rank_num,
    province_name(a1.province_alias) AS province_name,
    a1.program_set_id AS programset_id,
    a2.programset_name,
    a2.type_name AS programset_type,
    a1.cv,
    a1.uv,
    cast(a1.pt/3600000 AS decimal(20,2)) pt,
    row_number() OVER (PARTITION BY a1.province_alias, a1.city_code, a1.version_type
                       ORDER BY cast(a1.uv AS bigint) DESC) AS rank_num2,
    row_number() OVER (PARTITION BY a1.province_alias, a1.city_code, a1.version_type
                       ORDER BY cast(a1.pt AS bigint) DESC) AS rank_num3,
    a1.city_code,
    a1.city_name,
    '3' AS level,
    a2.cp_code,
    a2.cp_name,
    '20200815' AS dt,
    a1.province_alias,
    a1.version_type
  FROM temp.dmp_device_vod_valid_day_v1_20200815_hn a1
  LEFT JOIN temp.dmp_device_vod_valid_day_v2_20200815_hn a2
    ON a1.program_set_id = a2.programset_id
  WHERE a2.programset_name IS NOT NULL
) zz
WHERE rank_num < 1000 OR rank_num2 < 1000 OR rank_num3 < 1000
;

 

This query returns 76742 rows on MR (MapReduce) but 76681 rows on Tez. How can this be fixed?

I think the problem may lie in row_number().
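
One possible explanation, not confirmed by anything in this report: row_number() gives an arbitrary order to rows that tie on the ORDER BY key, so two engines can rank tied rows differently. Because the outer filter ORs three rankings together, a different tie order can change how much the three "top" sets overlap, and therefore the total row count. The small Python simulation below is only a sketch of that mechanism. The data, the helper names (row_numbers, count_kept), and the limit of 2 are made up for illustration, and whether the real tables contain ties on cv, uv, or pt is an assumption.

```python
# Simulate row_number() OVER (ORDER BY <key> DESC) with an explicit tie-break,
# to show how tie order can change the row count of an OR-ed rank filter.
# All data and names here are hypothetical, not taken from the real tables.

def row_numbers(rows, key, tie_break):
    """Return {id: rank}, ordering by key descending and then by tie_break."""
    ordered = sorted(rows, key=lambda r: (-r[key], tie_break(r)))
    return {r["id"]: i + 1 for i, r in enumerate(ordered)}

def count_kept(rows, limit, tie_break):
    """Count rows where rank by cv < limit OR rank by uv < limit."""
    rank_cv = row_numbers(rows, "cv", tie_break)
    rank_uv = row_numbers(rows, "uv", tie_break)
    return sum(
        1 for r in rows
        if rank_cv[r["id"]] < limit or rank_uv[r["id"]] < limit
    )

# Rows 1 and 2 tie on cv; row 1 is the clear leader on uv.
rows = [
    {"id": 1, "cv": 10, "uv": 5},
    {"id": 2, "cv": 10, "uv": 1},
    {"id": 3, "cv": 1, "uv": 1},
]

# Two stand-ins for "engine A" and "engine B" resolving ties differently.
ascending_ids = lambda r: r["id"]
descending_ids = lambda r: -r["id"]

count_a = count_kept(rows, 2, ascending_ids)   # cv leader is row 1, same as uv leader
count_b = count_kept(rows, 2, descending_ids)  # cv leader is row 2, uv leader is row 1

print(count_a, count_b)
```

If ties are indeed the cause, adding a unique secondary column to each ORDER BY (for example a1.program_set_id, assuming it is unique within a partition) should make both engines return the same rows.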



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
