Jinyang Li created HIVE-22251:
---------------------------------

             Summary: Existing files on new partition path not cleared during insert overwrite
                 Key: HIVE-22251
                 URL: https://issues.apache.org/jira/browse/HIVE-22251
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.3.4, 0.13.0
            Reporter: Jinyang Li


*Description*

When insert overwrite targets a new partition and files already exist on the 
partition path, Hive does not clear them, which leaves extra files in the final 
partition location.

Reading the partition may then return extra, incorrect results.

 

*Reproduce*
 # Create a file on the partition path before the partition exists:
hdfs dfs -mkdir hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
hdfs dfs -put 000000_0 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
 # Run insert overwrite into the new partition:
insert overwrite table jl_test1 partition (ds='1') select `(ds)?+.+` from src_table limit 100;
 # List the partition location. It contains two files; the pre-existing 000000_1 was not removed:
hdfs dfs -ls hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
Found 2 items
-rw-r--r-- 3 jinyang_li supergroup 2770 2019-09-27 06:53 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_0
-rw-r--r-- 3 jinyang_li supergroup 8483 2019-09-27 06:50 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
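
The precondition for this bug (files already present on the path of a partition that has not been created yet) can be detected before running the insert. The following is only a minimal sketch, assuming the hdfs CLI is on the PATH. The function name partition_has_stale_files is hypothetical and is not part of Hive or Hadoop; it only reports the condition and does not delete anything.

```shell
# Hypothetical helper (not part of Hive/Hadoop): succeeds (exit 0) when the
# given partition directory exists and already contains at least one file.
partition_has_stale_files() {
  phsf_path="$1"

  # `hdfs dfs -test -d` exits 0 only if the path exists and is a directory.
  if ! hdfs dfs -test -d "$phsf_path"; then
    return 1
  fi

  # `hdfs dfs -count` prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
  phsf_files=$(hdfs dfs -count "$phsf_path" | awk '{print $2}')

  [ "${phsf_files:-0}" -gt 0 ]
}

# Example call, using the path from the steps above:
# partition_has_stale_files hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/ \
#   && echo "partition path already contains files"
```

Running such a check before the insert overwrite in step 2 would flag the leftover 000000_1 file from step 1.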



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
