[ https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong reassigned HIVE-10062:
--------------------------------------

    Assignee: Pengcheng Xiong

> HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
> -------------------------------------------------------------------------
>
>                 Key: HIVE-10062
>                 URL: https://issues.apache.org/jira/browse/HIVE-10062
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>            Priority: Critical
>
> In the q.test environment with the src table, execute the following queries:
> {code}
> CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;
> CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;
> FROM (SELECT 'tst1' AS key, CAST(COUNT(1) AS STRING) AS value FROM src s1
>       UNION ALL
>       SELECT s2.key AS key, s2.value AS value FROM src s2) unionsrc
> INSERT OVERWRITE TABLE DEST1
>   SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
>   GROUP BY unionsrc.key
> INSERT OVERWRITE TABLE DEST2
>   SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
>   GROUP BY unionsrc.key, unionsrc.value;
> SELECT * FROM DEST1;
> SELECT * FROM DEST2;
> {code}
> DEST1 and DEST2 should both have 310 rows. However, DEST2 has only one row:
> "tst1    500     1"
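>
> A minimal sketch for checking the row counts after the run. It is not part of the original report and assumes only the DEST1 and DEST2 tables created above:
> {code}
> -- Count the rows written by each branch of the multi-insert.
> SELECT COUNT(*) FROM DEST1;
> SELECT COUNT(*) FROM DEST2;
> {code}
> Per the description, both counts should be 310. With the bug, the DEST2 count would be 1.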



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
