Pengcheng Xiong created HIVE-10062:
--------------------------------------

             Summary: HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
                 Key: HIVE-10062
                 URL: https://issues.apache.org/jira/browse/HIVE-10062
             Project: Hive
          Issue Type: Bug
            Reporter: Pengcheng Xiong
            Priority: Critical


In the q.test environment with the src table, execute the following statements:
{code}
CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;

CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;

FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1
      UNION ALL
      select s2.key as key, s2.value as value from src s2) unionsrc
INSERT OVERWRITE TABLE DEST1
  SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
  GROUP BY unionsrc.key
INSERT OVERWRITE TABLE DEST2
  SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
  GROUP BY unionsrc.key, unionsrc.value;

select * from DEST1;
select * from DEST2;
{code}

DEST1 and DEST2 should both have 310 rows. However, DEST2 has only 1 row:
"tst1    500     1"
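
A quick way to confirm the discrepancy, assuming the statements above have just been run in the same session, is to compare row counts for the two tables. This is only a sketch of a check, not part of the original reproduction; the expected values in the comments are the ones stated in this report.
{code}
-- Expected per this report: 310
SELECT COUNT(*) FROM DEST1;

-- Expected per this report: 310 (the bug leaves only 1 row here)
SELECT COUNT(*) FROM DEST2;
{code}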




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)