[ https://issues.apache.org/jira/browse/HIVE-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769724#comment-15769724 ]
Hive QA commented on HIVE-15493:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12844325/HIVE-15493.patch
{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10897 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] (batchId=72)
org.apache.hive.hcatalog.api.TestHCatClientNotification.createTable (batchId=220)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery (batchId=216)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2693/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2693/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2693/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12844325 - PreCommit-HIVE-Build
> Wrong result for LEFT outer join in Tez using MapJoinOperator
> -------------------------------------------------------------
>
> Key: HIVE-15493
> URL: https://issues.apache.org/jira/browse/HIVE-15493
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Attachments: HIVE-15493.patch
>
>
> To reproduce, we can run in Tez:
> {code:sql}
> set hive.auto.convert.join=true;
> DROP TABLE IF EXISTS test_1;
> CREATE TABLE test_1
> (
>     member BIGINT
>   , age VARCHAR(100)
> )
> STORED AS TEXTFILE
> ;
> DROP TABLE IF EXISTS test_2;
> CREATE TABLE test_2
> (
>     member BIGINT
> )
> STORED AS TEXTFILE
> ;
> INSERT INTO test_1 VALUES (1, '20'), (2, '30'), (3, '40');
> INSERT INTO test_2 VALUES (1), (2), (3);
> SELECT
>     t2.member
>   , t1.age_1
>   , t1.age_2
> FROM
>     test_2 t2
>     LEFT JOIN (
>         SELECT
>             member
>           , age as age_1
>           , age as age_2
>         FROM
>             test_1
>     ) t1
>     ON t2.member = t1.member
> ;
> {code}
> Result is:
> {noformat}
> 1 20 NULL
> 3 40 NULL
> 2 30 NULL
> {noformat}
> Correct result is:
> {noformat}
> 1 20 20
> 3 40 40
> 2 30 30
> {noformat}
> The bug was introduced by HIVE-10582. Although the fix in HIVE-10582 does not
> include tests, the change itself looks correct. The problem seems to be in the
> MapJoinOperator itself. It only happens for LEFT outer joins (not for RIGHT
> outer or FULL outer joins), and only when the right part of the output
> contains duplicate values, as age_1 and age_2 do in the query above. I am
> still trying to understand part of the MapJoinOperator code path, but the bug
> could be in the initialization of the operator.
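> One way to check whether the wrong result is confined to the map-join path
> (a sketch, not verified here) is to rerun the reproduction query with
> automatic map-join conversion disabled. This assumes that with
> hive.auto.convert.join=false the join is planned as a regular shuffle join
> and never reaches the MapJoinOperator. If that assumption holds, age_1 and
> age_2 should carry the same value in every row.
> {code:sql}
> -- Assumption: disabling auto conversion keeps the planner from choosing a map join.
> set hive.auto.convert.join=false;
> -- Same query as in the reproduction; test_1 and test_2 are the tables created above.
> SELECT
>     t2.member
>   , t1.age_1
>   , t1.age_2
> FROM
>     test_2 t2
>     LEFT JOIN (
>         SELECT
>             member
>           , age as age_1
>           , age as age_2
>         FROM
>             test_1
>     ) t1
>     ON t2.member = t1.member
> ;
> {code}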
> Until we have time to study the problem in detail and fix the
> MapJoinOperator, I will submit a fix that removes the code in
> SemanticAnalyzer that reuses duplicated value expressions from the RS
> (ReduceSink) to create multiple columns in the join output (this is
> equivalent to reverting HIVE-10582).
> Once this is pushed, I will create a follow-up issue to restore this code
> and tackle the problem in the MapJoinOperator.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)