[ https://issues.apache.org/jira/browse/HIVE-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HIVE-16017:
------------------------------------
    Summary: MM tables - many queries duplicate the data after master merge  (was: MM tables - buckets + union is broken)

> MM tables - many queries duplicate the data after master merge
> --------------------------------------------------------------
>
>                 Key: HIVE-16017
>                 URL: https://issues.apache.org/jira/browse/HIVE-16017
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>
> On the MM branch, the query below duplicates its output, for both MM and
> non-MM tables. Because the query is a self-union, the data should be output
> 2 times; instead it is output 4 times.
> The inputs being added appear to be correct (in the non-MM case in
> particular, they are the same as before). Presumably one of the output
> changes made on the branch is broken for this case. Not sure which yet.
> {noformat}
> CREATE TABLE tbl1_mm(key int, value string) CLUSTERED BY (key) SORTED BY
> (key) INTO 2 BUCKETS;
> insert overwrite table tbl1_mm select * from src where key < 10;
> select key, value from tbl1_mm a where key < 6
> union all
> select key, value from tbl1_mm a where key < 6;
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
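
A possible sanity check for the duplication described above, as a sketch only: it compares the row count of the self-union with the row count of a single branch. The aliases base, u, base_rows and union_rows are made up for illustration and are not part of the original report. Note the assumption that wrapping the union in an aggregate leaves the problem visible; since the suspected cause is on the output side, the changed plan may not reproduce it, in which case the original query's result has to be inspected directly.
{noformat}
-- Hypothetical check, not from the original report.
-- union_rows should be 2 * base_rows for a self-union;
-- the report describes 4 * base_rows for the plain query.
select base.base_rows, u.union_rows
from
  (select count(*) as base_rows
   from tbl1_mm
   where key < 6) base
cross join
  (select count(*) as union_rows
   from (
     select key, value from tbl1_mm where key < 6
     union all
     select key, value from tbl1_mm where key < 6
   ) t) u;
{noformat}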