[
https://issues.apache.org/jira/browse/HIVE-23265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094171#comment-17094171
]
Hive QA commented on HIVE-23265:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/13001331/HIVE-23265.2.patch
{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17140 tests
executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out)
(batchId=58)
[intersect_all.q,unionDistinct_1.q,orc_ppd_schema_evol_3a.q,import_exported_table.q,tez_union_dynamic_partition.q,vector_offset_limit.q,orc_merge10.q,tez_union_dynamic_partition_2.q,temp_table_external.q,llap_udf.q,schemeAuthority.q,cte_2.q,rcfile_createas1.q,cttl.q,remote_script.q]
TestStatsReplicationScenariosACID - did not produce a TEST-*.xml file (likely
timed out) (batchId=184)
{noformat}
Test results:
https://builds.apache.org/job/PreCommit-HIVE-Build/21982/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21982/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21982/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 13001331 - PreCommit-HIVE-Build
> Duplicate rowsets are returned with Limit and Offset ste
> --------------------------------------------------------
>
> Key: HIVE-23265
> URL: https://issues.apache.org/jira/browse/HIVE-23265
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Vectorization
> Affects Versions: 3.1.0, 3.1.2
> Reporter: Chiran Ravani
> Assignee: Attila Magyar
> Priority: Critical
> Attachments: 000000_0, HIVE-23265.1.patch, HIVE-23265.2.patch
>
>
> We have a query which produces duplicate results even when there is no
> duplicate records in underlying tables.
> Sample Query
> {code:java}
> select * from orderdatatest_ext order by col1 limit 1000,50
> {code}
> The problem appears when order by clause is used with col1 having non-unique
> rows. Apparently the duplicates are being produced during reducer phase of
> the query.
> set hive.vectorized.execution.reduce.enabled=false does not cause the problem.
> Data in table is as follows.
> {code:java}
> 1,1
> 1,2
> 1,3
> .
> .
> 1,1500
> {code}
> Results with hive.vectorized.execution.reduce.enabled=true
> {code:java}
> +-------------------------+-------------------------+
> | orderdatatest_ext.col1 | orderdatatest_ext.col2 |
> +-------------------------+-------------------------+
> | 1 | 1001 |
> | 1 | 1002 |
> | 1 | 1003 |
> | 1 | 1004 |
> | 1 | 1005 |
> | 1 | 1006 |
> | 1 | 1007 |
> | 1 | 1008 |
> | 1 | 1009 |
> | 1 | 1010 |
> | 1 | 1011 |
> | 1 | 1012 |
> | 1 | 1013 |
> | 1 | 1014 |
> | 1 | 1015 |
> | 1 | 1016 |
> | 1 | 1017 |
> | 1 | 1018 |
> | 1 | 1019 |
> | 1 | 1020 |
> | 1 | 1021 |
> | 1 | 1022 |
> | 1 | 1023 |
> | 1 | 1024 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> | 1 | 1 |
> +-------------------------+-------------------------+
> {code}
> Results with hive.vectorized.execution.reduce.enabled=false
> {code:java}
> +-------------------------+-------------------------+
> | orderdatatest_ext.col1 | orderdatatest_ext.col2 |
> +-------------------------+-------------------------+
> | 1 | 1001 |
> | 1 | 1002 |
> | 1 | 1003 |
> | 1 | 1004 |
> | 1 | 1005 |
> | 1 | 1006 |
> | 1 | 1007 |
> | 1 | 1008 |
> | 1 | 1009 |
> | 1 | 1010 |
> | 1 | 1011 |
> | 1 | 1012 |
> | 1 | 1013 |
> | 1 | 1014 |
> | 1 | 1015 |
> | 1 | 1016 |
> | 1 | 1017 |
> | 1 | 1018 |
> | 1 | 1019 |
> | 1 | 1020 |
> | 1 | 1021 |
> | 1 | 1022 |
> | 1 | 1023 |
> | 1 | 1024 |
> | 1 | 1025 |
> | 1 | 1026 |
> | 1 | 1027 |
> | 1 | 1028 |
> | 1 | 1029 |
> | 1 | 1030 |
> | 1 | 1031 |
> | 1 | 1032 |
> | 1 | 1033 |
> | 1 | 1034 |
> | 1 | 1035 |
> | 1 | 1036 |
> | 1 | 1037 |
> | 1 | 1038 |
> | 1 | 1039 |
> | 1 | 1040 |
> | 1 | 1041 |
> | 1 | 1042 |
> | 1 | 1043 |
> | 1 | 1044 |
> | 1 | 1045 |
> | 1 | 1046 |
> | 1 | 1047 |
> | 1 | 1048 |
> | 1 | 1049 |
> | 1 | 1050 |
> +-------------------------+-------------------------+
> {code}
> Table DDL
> {code:java}
> CREATE EXTERNAL TABLE orderdatatest_ext (col1 int, col2 int) stored as orc
> {code}
> Attached sample ORC file.
> Problem appears to be with VectorLimitOperator.
> {code}
> 2020-04-20 15:35:49,693 [INFO] [TezChild] |vector.VectorSelectOperator|:
> Initializing operator SEL[6]
> 2020-04-20 15:35:49,747 [INFO] [TezChild] |vector.VectorSelectOperator|:
> RECORDS_OUT_INTERMEDIATE_Map_1:0, RECORDS_OUT_OPERATOR_SEL_6:1500,
> 2020-04-20 15:35:50,142 [INFO] [TezChild] |vector.VectorSelectOperator|:
> Initializing operator SEL[8]
> 2020-04-20 15:35:50,303 [INFO] [TezChild] |vector.VectorSelectOperator|:
> RECORDS_OUT_OPERATOR_SEL_8:1050, RECORDS_OUT_INTERMEDIATE_Reducer_2:0,
> 2020-04-20 15:35:50,142 [INFO] [TezChild] |vector.VectorLimitOperator|:
> Initializing operator LIM[9]
> 2020-04-20 15:35:50,303 [INFO] [TezChild] |vector.VectorLimitOperator|:
> RECORDS_OUT_INTERMEDIATE_Reducer_2:0, RECORDS_OUT_OPERATOR_LIM_9:1050,
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)