[ https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114533#comment-16114533 ]
Hive QA commented on HIVE-17148:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880173/HIVE-17148.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11145 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[innerjoin1] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[semijoin4] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[semijoin5] (batchId=15)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_2] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query24] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query8] (batchId=236)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_in] (batchId=128)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6259/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6259/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6259/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880173 - PreCommit-HIVE-Build

> Incorrect result for Hive join query with COALESCE in WHERE condition
> ---------------------------------------------------------------------
>
>                 Key: HIVE-17148
>                 URL: https://issues.apache.org/jira/browse/HIVE-17148
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 2.1.1
>            Reporter: Vlad Gudikov
>            Assignee: Vlad Gudikov
>         Attachments: HIVE-17148.1.patch, HIVE-17148.patch
>
>
> The issue exists in Hive-2.1. In Hive-1.2 the query works fine with cbo enabled:
> STEPS TO REPRODUCE:
> {code}
> Step 1: Create a table ct1
> create table ct1 (a1 string,b1 string);
> Step 2: Create a table ct2
> create table ct2 (a2 string);
> Step 3 : Insert following data into table ct1
> insert into table ct1 (a1) values ('1');
> Step 4 : Insert following data into table ct2
> insert into table ct2 (a2) values ('1');
> Step 5 : Execute the following query
> select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2;
> {code}
> ACTUAL RESULT:
> {code}
> The query returns nothing;
> {code}
> EXPECTED RESULT:
> {code}
> 1	NULL	1
> {code}
> The issue seems to be because of the incorrect query plan.
> In the plan we can see:
> predicate:(a1 is not null and b1 is not null)
> which does not look correct. As a result, it filters out every row in which
> any column mentioned in the COALESCE has a null value.
> Please find the query plan below:
> {code}
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Map 1
>       File Output Operator [FS_10]
>         Map Join Operator [MAPJOIN_15] (rows=1 width=4)
>           Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"]
>         <-Map 2 [BROADCAST_EDGE]
>           BROADCAST [RS_7]
>             PartitionCols:_col0
>             Select Operator [SEL_5] (rows=1 width=1)
>               Output:["_col0"]
>               Filter Operator [FIL_14] (rows=1 width=1)
>                 predicate:a2 is not null
>                 TableScan [TS_3] (rows=1 width=1)
>                   default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"]
>         <-Select Operator [SEL_2] (rows=1 width=4)
>             Output:["_col0","_col1"]
>             Filter Operator [FIL_13] (rows=1 width=4)
>               predicate:(a1 is not null and b1 is not null)
>               TableScan [TS_0] (rows=1 width=4)
>                 default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"]
> {code}
> This happens only if the join is an inner join; otherwise HiveJoinAddNotNullRule,
> which creates this problem, is skipped.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
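
For illustration only (not part of the attached patch): a minimal sketch of why the predicate quoted in the description drops the expected row. It assumes that SQLite's COALESCE and NULL-comparison semantics match Hive's for this query, and it uses SQLite purely as a stand-in, so it says nothing about Hive's planner itself. The function name run_queries and the third "safe" query, which filters on the whole COALESCE expression, are hypothetical and only show one filter that would not lose rows.

```python
import sqlite3


def run_queries():
    """Replay the reproduction steps in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Steps 1-4 from the description (SQLite 'text' stands in for Hive 'string').
    cur.execute("create table ct1 (a1 text, b1 text)")
    cur.execute("create table ct2 (a2 text)")
    cur.execute("insert into ct1 (a1) values ('1')")  # b1 is left NULL
    cur.execute("insert into ct2 (a2) values ('1')")

    # Step 5: the query as written; this is the expected result.
    expected = cur.execute(
        "select * from ct1 c1, ct2 c2 where coalesce(a1, b1) = a2"
    ).fetchall()

    # The filters shown in the plan: every COALESCE argument must be non-null.
    over_filtered = cur.execute(
        "select * from "
        "(select * from ct1 where a1 is not null and b1 is not null) c1, "
        "(select * from ct2 where a2 is not null) c2 "
        "where coalesce(a1, b1) = a2"
    ).fetchall()

    # Hypothetical alternative: require only the join key expression to be non-null.
    safe = cur.execute(
        "select * from "
        "(select * from ct1 where coalesce(a1, b1) is not null) c1, "
        "(select * from ct2 where a2 is not null) c2 "
        "where coalesce(a1, b1) = a2"
    ).fetchall()

    conn.close()
    return expected, over_filtered, safe


if __name__ == "__main__":
    for label, rows in zip(("expected", "over_filtered", "safe"), run_queries()):
        print(label, rows)
```

The row ('1', NULL) has a non-null COALESCE value, so an inner join on that expression can match it; requiring each argument to be non-null is a stricter condition than the join itself implies.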