[ https://issues.apache.org/jira/browse/TAJO-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302933#comment-14302933 ]
Hudson commented on TAJO-1283:
------------------------------
SUCCESS: Integrated in Tajo-master-build #576 (See
[https://builds.apache.org/job/Tajo-master-build/576/])
TAJO-1283: ORDER BY with the first descending order causes wrong results.
(Keuntae Park) (sirpkt: rev 02c6c266c013d8174a287bc57a6d4131da51ba96)
* tajo-core/src/main/java/org/apache/tajo/querymaster/Repartitioner.java
* CHANGES
* tajo-core/src/test/resources/queries/TestSortQuery/testSortFirstDesc.sql
* tajo-core/src/test/resources/results/TestSortQuery/testSortFirstDesc.result
* tajo-core/src/test/java/org/apache/tajo/engine/query/TestSortQuery.java
* tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
* tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/TupleRange.java
> ORDER BY with the first descending order causes wrong results
> -------------------------------------------------------------
>
> Key: TAJO-1283
> URL: https://issues.apache.org/jira/browse/TAJO-1283
> Project: Tajo
> Issue Type: Bug
> Components: distributed query plan, planner/optimizer
> Reporter: Hyunsik Choi
> Assignee: Keuntae Park
> Priority: Critical
> Fix For: 0.10, 0.11
>
>
> Each ORDER BY key can be specified with ascending or descending order.
> Recently, I found that ORDER BY with a descending first sort key produces
> wrong results.
> If only the second key is descending, it works well. Other cases also work
> correctly.
> {code}
> select l_orderkey, l_partkey from lineitem order by l_orderkey, l_partkey desc;
> l_orderkey, l_partkey
> -------------------------------
> 1, 155190
> 1, 67310
> 1, 63700
> 1, 24027
> 1, 15635
> 1, 2132
> 2, 106170
> 3, 183095
> 3, 128449
> 3, 62143
> 3, 29380
> 3, 19036
> 3, 4297
> ...
> {code}
> But if the first sort key is descending, the result has the wrong number of
> rows and shows a wrong range part, as follows:
> {code}
> default> select l_orderkey, l_partkey from lineitem order by l_orderkey desc, l_partkey;
> l_orderkey, l_partkey
> -------------------------------
> 3000000, 61045
> 3000000, 159113
> 3000000, 167695
> 3000000, 167904
> 3000000, 196339
> ...
> {code}
> According to my investigation, it seems to be related to an offset problem
> in RowFile or an index problem. The final result includes duplicated rows,
> and the final row is wrong, as follows:
> {code:title=part-02-000000-000}
> 3000000|61045
> 3000000|159113
> 3000000|167695
> 3000000|167904
> 3000000|196339
> 2999975|28334
> 2999975|194023
> 2999974|8020
> 2999974|124152
> 2999974|129921
> 2999974|139248
> 2999974|168914
> 2999974|187923
> 2999973|30533
> 2999973|36196
> ...
> 2919713|133486
> 2919713|195963
> 2919712|86257
> 2919712|94542
> 2919712|107370
> 2919712|166342 <- duplicated rows
> 2919712|178277
> ....
> 1|63700
> 1|67310
> 1|155190
> [EOF]
> {code}
> {code:title=part-02-000001-000}
> |96127 <- looks wrong
> 6000000|32255
> 6000000|96127
> 5999975|6452
> 5999975|7272
> 5999975|37131
> ....
> ....
> 2919713|133486
> 2919713|195963
> 2919712|94542
> 2919712|107370
> 2919712|166342 <- duplicated rows
> [EOF]
> {code}
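For readers following the report above, here is a minimal sketch of how the expected ordering could be checked outside Tajo. It is not Tajo code and not the fix from the commit: the class SortFirstDescSketch, the Row type, and the isGloballySorted helper are hypothetical names made up for illustration. It assumes the part files are read in order and concatenated, and that a correct result for "order by l_orderkey desc, l_partkey" must then be non-decreasing under the matching comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortFirstDescSketch {

    // Hypothetical row holder for (l_orderkey, l_partkey); not a Tajo class.
    static final class Row {
        final long orderKey;
        final long partKey;

        Row(long orderKey, long partKey) {
            this.orderKey = orderKey;
            this.partKey = partKey;
        }

        long orderKey() { return orderKey; }
        long partKey() { return partKey; }
    }

    // Expected ordering for: ORDER BY l_orderkey DESC, l_partkey ASC.
    static final Comparator<Row> FIRST_KEY_DESC =
        Comparator.<Row>comparingLong(Row::orderKey).reversed()
                  .thenComparingLong(Row::partKey);

    // Returns true when the partitions, concatenated in list order,
    // never place a row before one that should precede it under cmp.
    static boolean isGloballySorted(List<List<Row>> partitions, Comparator<Row> cmp) {
        Row prev = null;
        for (List<Row> part : partitions) {
            for (Row row : part) {
                if (prev != null && cmp.compare(prev, row) > 0) {
                    return false;
                }
                prev = row;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Rows taken from the report; the split into two partitions is illustrative.
        List<List<Row>> good = Arrays.asList(
            Arrays.asList(new Row(3000000, 61045), new Row(3000000, 159113), new Row(2999975, 28334)),
            Arrays.asList(new Row(2999974, 8020), new Row(1, 63700)));

        // Overlapping key ranges, as in the two part files quoted in the report.
        List<List<Row>> overlapping = Arrays.asList(
            Arrays.asList(new Row(2919713, 195963), new Row(2919712, 166342)),
            Arrays.asList(new Row(2919713, 133486), new Row(2919712, 94542)));

        System.out.println(isGloballySorted(good, FIRST_KEY_DESC));
        System.out.println(isGloballySorted(overlapping, FIRST_KEY_DESC));
    }
}
```

The check only detects ordering violations across partition boundaries, such as the overlapping 2919713/2919712 rows shown in the two part files. It does not compare row counts against the input, so a separate count check would still be needed to catch duplicated or missing rows that happen to stay in order.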
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)