[ https://issues.apache.org/jira/browse/HIVE-21915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871686#comment-16871686 ]
Hive QA commented on HIVE-21915:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972724/HIVE-21915.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 16339 tests executed
*Failed tests:*
{noformat}
org.apache.hive.service.TestDFSErrorHandling.testAccessDenied (batchId=272)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/17710/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17710/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17710/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972724 - PreCommit-HIVE-Build

> Hive with TEZ UNION ALL and UDTF results in data loss
> -----------------------------------------------------
>
>                 Key: HIVE-21915
>                 URL: https://issues.apache.org/jira/browse/HIVE-21915
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.1
>            Reporter: Wei Zhang
>            Assignee: Wei Zhang
>            Priority: Major
>         Attachments: HIVE-21915.01.patch
>
>
> The HQL syntax is like this:
> CREATE TEMPORARY TABLE tez_union_all_loss_data AS
> SELECT xxx, yyy, zzz, 1 as tag
> FROM ods_1
> UNION ALL
> SELECT xxx, yyy, zzz, tag
> FROM
> (
>     SELECT xxx
>           ,get_json_object(get_json_object(tb,'$.a'),'$.b') AS yyy
>           ,zzz
>           ,2 as tag
>     FROM ods_2
>     LATERAL VIEW EXPLODE(some_udf(uuu)) team_number AS tb
> ) tbl
> ;
>
> With the above HQL, we expect rows with both tag = 2 and tag = 1 to appear.
> In our case, however, all the rows with tag = 1 are lost.
> Digging deeper, we find that the two generated maps have identical task tmp
> paths. This happens because, when a UDTF is present, the FileSinkOperator is
> processed twice when generating the tmp path in
> GenTezUtils.removeUnionOperators().
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)