[jira] [Commented] (HIVE-20570) Union ALL with hive.optimize.union.remove=true has incorrect plan

Hive QA (JIRA) Sun, 16 Sep 2018 21:24:01 -0700


    [ https://issues.apache.org/jira/browse/HIVE-20570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617063#comment-16617063 ]


Hive QA commented on HIVE-20570:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12939909/HIVE-20570.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 36 failed/errored test(s), 14967 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_union_remove_1] (batchId=92)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_union_remove_2] (batchId=30)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testACIDwithSchemaEvolutionAndCompaction (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidWithSchemaEvolution (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAlterTable (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketCodec (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketizedInputFormat (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCleanerForTxnToWriteId (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCompactWithDelete (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDeleteIn (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2 (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testETLSplitStrategyForACID (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testEmptyInTblproperties (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFailHeartbeater (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFileSystemUnCaching (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInitiatorWithMultipleFailedCompactions (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite1 (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite2 (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwriteWithSelfJoin (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2 (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3 (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMergeWithPredicate (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMmTableCompaction (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsertStatement (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidInsert (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion02 (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOpenTxnsCounter (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcNoPPD (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcPPD (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOriginalFileReaderWhenNonAcidConvertedToAcid (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testUpdateMixedCase (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.updateDeletePartitioned (batchId=311)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.writeBetweenWorkerAndCleaner (batchId=311)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/13848/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/13848/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-13848/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 36 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12939909 - PreCommit-HIVE-Build

> Union ALL with hive.optimize.union.remove=true has incorrect plan
> -----------------------------------------------------------------
>
>                 Key: HIVE-20570
>                 URL: https://issues.apache.org/jira/browse/HIVE-20570
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Janaki Lahorani
>            Assignee: Janaki Lahorani
>            Priority: Major
>         Attachments: HIVE-20570.1.patch
>
>
> When hive.optimize.union.remove=true and a select query with group by is run,
> the final fetch waits for only one of the union branches instead of both.
> Test Case:
> {code}
> create table if not exists test_table(column1 string, column2 int);
> insert into test_table values('a',1),('b',2);
> set hive.optimize.union.remove=true;
> set mapred.input.dir.recursive=true;
> explain
> select column1 from test_table group by column1
> union all
> select column1 from test_table group by column1;
> {code}
> In the plan below, Stage-1 and Stage-2 correspond to the two branches of the
> union all. The final fetch operator (Stage-0) depends on only one of them
> (Stage-1), but it should depend on both.
> Plan:
> {code}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-2 is a root stage
>   *Stage-0 depends on stages: Stage-1*
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: test_table
>             Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: column1 (type: string)
>               outputColumnNames: column1
>               Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>               Group By Operator
>                 keys: column1 (type: string)
>                 mode: hash
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: _col0 (type: string)
>                   sort order: +
>                   Map-reduce partition columns: _col0 (type: string)
>                   Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>       Execution mode: vectorized
>       Reduce Operator Tree:
>         Group By Operator
>           keys: KEY._col0 (type: string)
>           mode: mergepartial
>           outputColumnNames: _col0
>           Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>             table:
>                 input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-2
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: test_table
>             Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: column1 (type: string)
>               outputColumnNames: column1
>               Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>               Group By Operator
>                 keys: column1 (type: string)
>                 mode: hash
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: _col0 (type: string)
>                   sort order: +
>                   Map-reduce partition columns: _col0 (type: string)
>                   Statistics: Num rows: 2 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>       Execution mode: vectorized
>       Reduce Operator Tree:
>         Group By Operator
>           keys: KEY._col0 (type: string)
>           mode: mergepartial
>           outputColumnNames: _col0
>           Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>             table:
>                 input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
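
The dependency problem described in the quoted report can be checked mechanically from the EXPLAIN text. The following is a minimal sketch, not part of Hive or of the attached patch: the function names `parse_stage_dependencies` and `unreached_roots` are hypothetical, and the parser assumes only the two simple line forms that appear in this report ("Stage-N is a root stage" and "Stage-N depends on stages: ..."). It lists root stages that the final fetch stage never waits on.

```python
import re


def parse_stage_dependencies(plan_text):
    """Parse the STAGE DEPENDENCIES section of an EXPLAIN plan.

    Returns (roots, deps): the set of root stage names and a dict mapping
    each stage to the set of stages it depends on. Only the simple line
    forms shown in this report are handled (an assumption of this sketch).
    """
    roots, deps = set(), {}
    in_section = False
    for raw in plan_text.splitlines():
        # Drop indentation and the '*' emphasis markers used in the report.
        line = raw.strip().strip("*").strip()
        if line.startswith("STAGE DEPENDENCIES"):
            in_section = True
            continue
        if line.startswith("STAGE PLANS"):
            break
        if not in_section or not line:
            continue
        m = re.match(r"(Stage-\d+) is a root stage", line)
        if m:
            roots.add(m.group(1))
            continue
        m = re.match(r"(Stage-\d+) depends on stages: (.+)", line)
        if m:
            deps[m.group(1)] = {s.strip() for s in m.group(2).split(",")}
    return roots, deps


def unreached_roots(plan_text, final_stage="Stage-0"):
    """Return the sorted root stages that final_stage does not depend on,
    directly or transitively."""
    roots, deps = parse_stage_dependencies(plan_text)
    seen, stack = set(), [final_stage]
    while stack:
        stage = stack.pop()
        if stage in seen:
            continue
        seen.add(stage)
        stack.extend(deps.get(stage, ()))
    return sorted(roots - seen)


# Dependency section copied from the plan in the report.
PLAN = """STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 is a root stage
  *Stage-0 depends on stages: Stage-1*
STAGE PLANS:
"""

print(unreached_roots(PLAN))  # ['Stage-2']
```

For the plan in the report this flags Stage-2, the union branch that the fetch operator does not wait for. A plan in which Stage-0 depends on both Stage-1 and Stage-2 would yield an empty list.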



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
