-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38268/
-----------------------------------------------------------

Review request for hive and Gopal V.


Bugs: HIVE-10980
    https://issues.apache.org/jira/browse/HIVE-10980


Repository: hive-git


Description
-------

https://issues.apache.org/jira/browse/HIVE-10980

Conditions that lead to the issue:
1. Execution engine set to MapReduce
2. Partition columns have different types
3. Both static and dynamic partitions are used in the query
4. Dynamically generated partitions require merge

Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".

Steps to reproduce:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;
set hive.optimize.sort.dynamic.partition=false;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.execution.engine=mr;

create external table sdp (
  dataint bigint,
  hour int,
  req string,
  cid string,
  caid string
)
row format delimited
fields terminated by ',';

load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
...
load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;

create table tdp (cid string, caid string)
partitioned by (dataint bigint, hour int, req string);

insert overwrite table tdp partition (dataint=20150316, hour=16, req)
select cid, caid, req from sdp where dataint=20150316 and hour=16;

select * from tdp order by caid;
show partitions tdp;

Example of the input file:

20150316,16,reqA,clusterIdA,cacheId1
20150316,16,reqB,clusterIdB,cacheId2
20150316,16,reqA,clusterIdC,cacheId3
20150316,16,reqD,clusterIdD,cacheId4
20150316,16,reqA,clusterIdA,cacheId5

Actual result:

clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__

dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__


Diffs
-----

  data/files/dynpartdata1.txt PRE-CREATION
  data/files/dynpartdata2.txt PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 4a325fb
  ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsUsePartitionColumnsNegative.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsUsePartitionColumnsPositive.java PRE-CREATION
  ql/src/test/queries/clientpositive/dynpart_merge.q PRE-CREATION
  ql/src/test/results/clientpositive/dynpart_merge.q.out PRE-CREATION
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.java1.7.out d223234
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.java1.8.out f884ace
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 541944d

Diff: https://reviews.apache.org/r/38268/diff/


Testing
-------

1. Added new unit tests
2. Added qtest
3. Updated old qtests


Thanks,

Illya Yalovyy
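

Checking the reproduction output
-------

When running the steps to reproduce, the symptom can be detected mechanically from the "show partitions tdp" output. The following is only a minimal sketch in Python and is not part of the patch. The function names are hypothetical, and it assumes each output line has the key=value/key=value form shown in the actual result above.

```python
# Name Hive uses for the fallback partition, as it appears in the actual result.
DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__"


def parse_partition_spec(spec):
    """Split 'k1=v1/k2=v2' into an ordered list of (key, value) pairs."""
    pairs = []
    for part in spec.strip().split("/"):
        key, _, value = part.partition("=")
        pairs.append((key, value))
    return pairs


def default_partitions(show_partitions_output):
    """Return the partition specs whose value is the default partition name."""
    flagged = []
    for line in show_partitions_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines in the captured output
        values = [value for _, value in parse_partition_spec(line)]
        if DEFAULT_PARTITION in values:
            flagged.append(line)
    return flagged


if __name__ == "__main__":
    observed = "dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__\n"
    for spec in default_partitions(observed):
        print("data landed in default partition:", spec)
```

An empty list from default_partitions means no partition in the captured output carries the default partition name. A non-empty list reproduces the symptom described above.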