[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302294#comment-15302294 ]
frank luo commented on HIVE-13737:
----------------------------------
hive> explain INSERT INTO TABLE test SELECT * from src UNION ALL SELECT * from src;
OK
Plan not optimized by CBO.
Vertex dependency in root stage
Map 1 <- Union 2 (CONTAINS)
Map 3 <- Union 2 (CONTAINS)
Stage-4
Stats-Aggr Operator
Stage-0
Move Operator
table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
Stage-2
Dependency Collection{}
Stage-1
Union 2
|<-Map 1 [CONTAINS]
| File Output Operator [FS_6]
| compressed:false
| Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
| table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
| Select Operator [SEL_1]
| outputColumnNames:["_col0"]
| Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
| TableScan [TS_0]
| alias:src
| Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
|<-Map 3 [CONTAINS]
File Output Operator [FS_6]
compressed:false
Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
Select Operator [SEL_3]
outputColumnNames:["_col0"]
Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
TableScan [TS_2]
alias:src
Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
Stage-3
Stats-Aggr Operator
Please refer to the previous Stage-0
Time taken: 0.088 seconds, Fetched: 41 row(s)
hive> explain SELECT count(*) FROM test;
OK
Plan not optimized by CBO.
Stage-0
Fetch Operator
limit:1
Time taken: 0.037 seconds, Fetched: 6 row(s)
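
The second plan contains only a Fetch Operator and no Map or Reducer vertex, which suggests that count(*) is being answered from stored table statistics rather than by scanning the data. A minimal sketch for checking this, assuming the standard hive.compute.query.using.stats setting is what enables that shortcut on this cluster (an assumption, not something confirmed in this thread):

```sql
-- Assumption: the count is served from metastore statistics.
-- Disable the stats shortcut for this session so count(*) has to scan the table.
SET hive.compute.query.using.stats=false;
SELECT count(*) FROM test;

-- Alternatively, recompute the table statistics, re-enable the shortcut,
-- and compare the result with the scan-based count above.
ANALYZE TABLE test COMPUTE STATISTICS;
SET hive.compute.query.using.stats=true;
SELECT count(*) FROM test;
```

If the scan-based count matches the number of rows returned by SELECT * FROM test, the stored statistics, not the data, would be the likely source of the discrepancy.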
> incorrect count when multiple inserts with union all
> ----------------------------------------------------
>
> Key: HIVE-13737
> URL: https://issues.apache.org/jira/browse/HIVE-13737
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.1
> Environment: hdp 2.3.4.7 on Red Hat 6
> Reporter: Frank Luo
> Priority: Critical
>
> Here is a test case to illustrate the issue. MR seems to work fine, but Tez
> shows the problem.
> CREATE TABLE test(col1 STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
> SELECT * from src
> UNION ALL
> SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
> SELECT * from src
> UNION ALL
> SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;