[
https://issues.apache.org/jira/browse/HIVE-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166317#comment-15166317
]
Yongzhi Chen commented on HIVE-13072:
-------------------------------------
I cannot reproduce the issue on the master branch with the following query:
insert overwrite table rowninfo select row_number() over( order by num) as
rowid, num from disrow;
disrow has 329210 rows with distinct values.
After the insert statement, rowninfo has the same number of rows, each with a
distinct value. There are no duplicates.
[~Zyrix], could you share your steps to reproduce? Thanks.
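A minimal sketch of how the result could be checked for the reported symptom, assuming rowninfo has exactly the two columns produced by the query above (rowid and num). These are illustrative queries, not necessarily the checks that were run for this comment.

```sql
-- The source and target row counts should match.
SELECT COUNT(*) FROM disrow;
SELECT COUNT(*) FROM rowninfo;

-- Any num listed here was written more than once by the insert.
-- If the reported bug is present, last_rowid - first_rowid would be 25000.
SELECT num,
       COUNT(*)   AS copies,
       MIN(rowid) AS first_rowid,
       MAX(rowid) AS last_rowid
FROM rowninfo
GROUP BY num
HAVING COUNT(*) > 1;

-- All three numbers should be equal when there are no duplicates.
SELECT COUNT(*)              AS total_rows,
       COUNT(DISTINCT rowid) AS distinct_rowids,
       COUNT(DISTINCT num)   AS distinct_nums
FROM rowninfo;
```

An empty result from the second query, together with three equal counts from the last one, would indicate that no duplicates were produced.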
> ROW_NUMBER() function creates wrong results
> -------------------------------------------
>
> Key: HIVE-13072
> URL: https://issues.apache.org/jira/browse/HIVE-13072
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Philipp Brandl
> Assignee: Yongzhi Chen
>
> When ROW_NUMBER() is used on a table with more than 25000 rows, the result
> contains duplicated rows that carry different row numbers.
> To reproduce, take a large table with more than 25000 rows of distinct
> values and run a query involving ROW_NUMBER(). The result then contains the
> same distinct values twice, with row numbers that are 25000 apart.