kuqiqi created HIVE-24078:
-----------------------------

             Summary: result rows not equal in mr and tez
                 Key: HIVE-24078
                 URL: https://issues.apache.org/jira/browse/HIVE-24078
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2, Tez
    Affects Versions: 3.1.2
            Reporter: kuqiqi

select rank_num,
       province_name,
       programset_id,
       programset_name,
       programset_type,
       cv,
       uv,
       pt,
       rank_num2,
       rank_num3,
       city_name,
       level,
       cp_code,
       cp_name,
       version_type,
       zz.city_code,
       zz.province_alias,
       '20200815' dt
from (
    SELECT row_number() over (partition by a1.province_alias, a1.city_code, a1.version_type
                              order by cast(a1.cv as bigint) desc) AS rank_num,
           province_name(a1.province_alias) AS province_name,
           a1.program_set_id AS programset_id,
           a2.programset_name,
           a2.type_name AS programset_type,
           a1.cv,
           a1.uv,
           cast(a1.pt/3600000 as decimal(20,2)) pt,
           row_number() over (partition by a1.province_alias, a1.city_code, a1.version_type
                              order by cast(a1.uv as bigint) desc) as rank_num2,
           row_number() over (partition by a1.province_alias, a1.city_code, a1.version_type
                              order by cast(a1.pt as bigint) desc) as rank_num3,
           a1.city_code,
           a1.city_name,
           '3' as level,
           a2.cp_code,
           a2.cp_name,
           '20200815' as dt,
           a1.province_alias,
           a1.version_type
    FROM temp.dmp_device_vod_valid_day_v1_20200815_hn a1
    LEFT JOIN temp.dmp_device_vod_valid_day_v2_20200815_hn a2
           ON a1.program_set_id = a2.programset_id
    WHERE a2.programset_name IS NOT NULL
) zz
where rank_num < 1000
   or rank_num2 < 1000
   or rank_num3 < 1000;

This query returns 76742 rows on MR but 76681 rows on Tez. How can this be fixed? I think the problem may lie in row_number.
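
One possible explanation, which this report does not confirm, is ties in the ORDER BY keys. When several rows in a partition share the same cv, uv, or pt, row_number() numbers them in whatever order the engine delivers them, and that order can differ between MR and Tez. Because the outer filter keeps a row if any of the three ranks is below the limit, a different tie order can change how many distinct rows are kept. The Python sketch below simulates this on three made-up rows with a limit of 2 instead of 1000. The helper names and the `id` column are hypothetical and only illustrate the idea.

```python
def top_ids(rows, key, limit, tie_break=False):
    """Ids of rows whose simulated row_number() (ORDER BY key DESC) is < limit.

    Without tie_break, tied rows keep their arrival order (Python's sort is
    stable), which stands in for engine-dependent ordering. With tie_break,
    ties are resolved by the hypothetical unique column "id".
    """
    if tie_break:
        ranked = sorted(rows, key=lambda r: (-r[key], r["id"]))
    else:
        ranked = sorted(rows, key=lambda r: -r[key])
    return {r["id"] for r in ranked[:limit - 1]}


def surviving_ids(rows, limit, tie_break=False):
    """Rows kept by: rank_num < limit OR rank_num2 < limit OR rank_num3 < limit."""
    keep = set()
    for key in ("cv", "uv", "pt"):
        keep |= top_ids(rows, key, limit, tie_break)
    return keep


# Made-up rows in one partition; A and B tie on cv.
a = {"id": "A", "cv": 10, "uv": 1, "pt": 1}
b = {"id": "B", "cv": 10, "uv": 9, "pt": 1}
c = {"id": "C", "cv": 1, "uv": 1, "pt": 9}

order_1 = [a, b, c]  # one possible arrival order
order_2 = [b, a, c]  # same rows, different arrival order

print(len(surviving_ids(order_1, 2)))                  # 3
print(len(surviving_ids(order_2, 2)))                  # 2
print(len(surviving_ids(order_1, 2, tie_break=True)))  # 3
print(len(surviving_ids(order_2, 2, tie_break=True)))  # 3
```

If ties are indeed the cause, appending a column that is unique within each partition to every ORDER BY (for example `order by cast(a1.cv as bigint) desc, a1.program_set_id`, assuming program_set_id is unique per partition after the join) should make both engines return the same row count.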