[
https://issues.apache.org/jira/browse/HIVE-27342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733864#comment-17733864
]
okumin commented on HIVE-27342:
-------------------------------
[~jimmydeng] I couldn't reproduce the issue on 4.0.0-alpha-2.
{code:java}
0: jdbc:hive2://hive-hiveserver2:10000/defaul> create table t1(f1 int);
...
No rows affected (0.057 seconds)
0: jdbc:hive2://hive-hiveserver2:10000/defaul> insert into t1 values(111),(222),(333),(444),(555),(666),(777),(888),(999);
...
9 rows affected (11.162 seconds)
...
0: jdbc:hive2://hive-hiveserver2:10000/defaul> select * from t1 order by f1 limit 0,3;
...
+--------+
| t1.f1 |
+--------+
| 111 |
| 222 |
| 333 |
+--------+
...
0: jdbc:hive2://hive-hiveserver2:10000/defaul> select * from t1 order by f1 limit 3,3;
...
+--------+
| t1.f1 |
+--------+
| 444 |
| 555 |
| 666 |
+--------+{code}
I remember we applied [several patches to VectorLimitOperator|https://github.com/apache/hive/commits/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorLimitOperator.java] to fix issues related to OFFSET.
Actually, we backported [HIVE-22120|https://issues.apache.org/jira/browse/HIVE-22120], [HIVE-22164|https://issues.apache.org/jira/browse/HIVE-22164], and [HIVE-23265|https://issues.apache.org/jira/browse/HIVE-23265].
I suspect HIVE-22164 fixes your problem, though my memory could be wrong.
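
For reference, the paging behaviour expected from {{limit <offset>,<rows>}} on a sorted result can be modelled outside Hive. The sketch below is plain Python, not Hive code, and the {{page}} helper is a hypothetical name used only for illustration. It assumes the nine values from the description and checks that page 1 and page 2 share no row.

```python
# Plain-Python model of `select * from t1 order by f1 limit <offset>,<rows>`.
# `page` is a hypothetical helper for illustration, not a Hive API.
def page(values, offset, rows):
    """Sort the values, skip `offset` of them, and return the next `rows`."""
    return sorted(values)[offset:offset + rows]


# Same data as the insert statement in the issue description.
t1 = [111, 222, 333, 444, 555, 666, 777, 888, 999]

page1 = page(t1, 0, 3)  # models: limit 0,3
page2 = page(t1, 3, 3)  # models: limit 3,3

# Consecutive pages of a sorted result must not share a row.
overlap = set(page1) & set(page2)
```

Under this model page 2 starts at 444, which matches the 4.0.0-alpha-2 output above and differs from the 3.1.1 result in the description.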
> Duplicate row returned using Order by, Limit and Offset
> -------------------------------------------------------
>
> Key: HIVE-27342
> URL: https://issues.apache.org/jira/browse/HIVE-27342
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.1
> Reporter: jimmydeng
> Priority: Major
>
> Create an example table:
> {code:java}
> create table t1(f1 int);
> insert into t1 values(111),(222),(333),(444),(555),(666),(777),(888),(999);
> {code}
>
> Query using order by, limit, offset. Page 1 is correct:
> {code:java}
> select * from t1 order by f1 limit 0,3;
> +---------+
> | t1.f1 |
> +---------+
> | 111 |
> | 222 |
> | 333 |
> +---------+{code}
>
> But there is a duplicate row `333` on page 2:
> {code:java}
> select * from t1 order by f1 limit 3,3;
> +---------+
> | t1.f1 |
> +---------+
> | 333 |
> | 444 |
> | 555 |
> +---------+
> {code}
> With set hive.vectorized.execution.reduce.enabled=false, the problem does not occur.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)