Chiran Ravani created HIVE-23265:
------------------------------------

             Summary: Duplicate rowsets are returned with Limit and Offset ste
                 Key: HIVE-23265
                 URL: https://issues.apache.org/jira/browse/HIVE-23265
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2, Vectorization
    Affects Versions: 3.1.2, 3.1.0
            Reporter: Chiran Ravani
         Attachments: 000000_0

We have a query that returns duplicate rows even though there are no duplicate records in the underlying table.

Sample query:
{code:java}
select * from orderdatatest_ext order by col1 limit 1000,50
{code}

The problem appears when the order by clause is used on col1 and col1 contains non-unique values. The duplicates appear to be produced during the reducer phase of the query. With hive.vectorized.execution.reduce.enabled=false the problem does not occur.

The data in the table is as follows:
{code:java}
1,1
1,2
1,3
.
.
1,1500
{code}

Results with hive.vectorized.execution.reduce.enabled=true:
{code:java}
+-------------------------+-------------------------+
| orderdatatest_ext.col1  | orderdatatest_ext.col2  |
+-------------------------+-------------------------+
| 1                       | 1001                    |
| 1                       | 1002                    |
| 1                       | 1003                    |
| 1                       | 1004                    |
| 1                       | 1005                    |
| 1                       | 1006                    |
| 1                       | 1007                    |
| 1                       | 1008                    |
| 1                       | 1009                    |
| 1                       | 1010                    |
| 1                       | 1011                    |
| 1                       | 1012                    |
| 1                       | 1013                    |
| 1                       | 1014                    |
| 1                       | 1015                    |
| 1                       | 1016                    |
| 1                       | 1017                    |
| 1                       | 1018                    |
| 1                       | 1019                    |
| 1                       | 1020                    |
| 1                       | 1021                    |
| 1                       | 1022                    |
| 1                       | 1023                    |
| 1                       | 1024                    |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
+-------------------------+-------------------------+
{code}

Results with hive.vectorized.execution.reduce.enabled=false:
{code:java}
+-------------------------+-------------------------+
| orderdatatest_ext.col1  | orderdatatest_ext.col2  |
+-------------------------+-------------------------+
| 1                       | 1001                    |
| 1                       | 1002                    |
| 1                       | 1003                    |
| 1                       | 1004                    |
| 1                       | 1005                    |
| 1                       | 1006                    |
| 1                       | 1007                    |
| 1                       | 1008                    |
| 1                       | 1009                    |
| 1                       | 1010                    |
| 1                       | 1011                    |
| 1                       | 1012                    |
| 1                       | 1013                    |
| 1                       | 1014                    |
| 1                       | 1015                    |
| 1                       | 1016                    |
| 1                       | 1017                    |
| 1                       | 1018                    |
| 1                       | 1019                    |
| 1                       | 1020                    |
| 1                       | 1021                    |
| 1                       | 1022                    |
| 1                       | 1023                    |
| 1                       | 1024                    |
| 1                       | 1025                    |
| 1                       | 1026                    |
| 1                       | 1027                    |
| 1                       | 1028                    |
| 1                       | 1029                    |
| 1                       | 1030                    |
| 1                       | 1031                    |
| 1                       | 1032                    |
| 1                       | 1033                    |
| 1                       | 1034                    |
| 1                       | 1035                    |
| 1                       | 1036                    |
| 1                       | 1037                    |
| 1                       | 1038                    |
| 1                       | 1039                    |
| 1                       | 1040                    |
| 1                       | 1041                    |
| 1                       | 1042                    |
| 1                       | 1043                    |
| 1                       | 1044                    |
| 1                       | 1045                    |
| 1                       | 1046                    |
| 1                       | 1047                    |
| 1                       | 1048                    |
| 1                       | 1049                    |
| 1                       | 1050                    |
+-------------------------+-------------------------+
{code}

Table DDL:
{code:java}
CREATE EXTERNAL TABLE orderdatatest_ext (col1 int, col2 int) stored as orc
{code}

A sample ORC file is attached.
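
For anyone reproducing this, the following is a minimal sketch of how the two settings could be compared in one session and how duplicates in the returned page might be counted. It uses only the table, columns, query, and setting named above; the subquery alias page and the column alias cnt are hypothetical, and wrapping the query in a subquery may change the execution plan, so the check is an assumption rather than a verified reproduction.
{code:sql}
-- Vectorized reduce enabled: the configuration under which duplicates were reported.
set hive.vectorized.execution.reduce.enabled=true;
select * from orderdatatest_ext order by col1 limit 1000,50;

-- Count repeated (col1, col2) pairs within the returned page.
-- "page" and "cnt" are illustrative aliases, not names from the report.
select col1, col2, count(*) as cnt
from (
  select * from orderdatatest_ext order by col1 limit 1000,50
) page
group by col1, col2
having count(*) > 1;

-- Vectorized reduce disabled: the configuration reported as not showing the problem.
set hive.vectorized.execution.reduce.enabled=false;
select * from orderdatatest_ext order by col1 limit 1000,50;
{code}
Because the source table has no duplicate records, any row returned by the counting query would indicate that the page itself contains duplicates.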