Jinyang Li created HIVE-22251:
---------------------------------
Summary: Existing files on new partition path not cleared during
insert overwrite
Key: HIVE-22251
URL: https://issues.apache.org/jira/browse/HIVE-22251
Project: Hive
Issue Type: Bug
Affects Versions: 2.3.4, 0.13.0
Reporter: Jinyang Li
*Description*
When running insert overwrite into a new partition, if files already exist on
the partition path, Hive does not clear them, which leaves extra files in the
final partition location.
Reading the partition may then return extra, incorrect results.
*Reproduce*
# Make a file exist on the partition path
hdfs dfs -mkdir hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
hdfs dfs -put 000000_0 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
# insert overwrite table jl_test1 partition (ds='1') select `(ds)?+.+` from src_table limit 100;
# Found two files in the partition location
hdfs dfs -ls hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/
Found 2 items
-rw-r--r-- 3 jinyang_li supergroup 2770 2019-09-27 06:53 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_0
-rw-r--r-- 3 jinyang_li supergroup 8483 2019-09-27 06:50 hdfs://airfs-silver/user/hive/warehouse/jinyang_test.db/jl_test1/ds=1/000000_1
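*Possible pre-check (sketch)*
Until the behavior is fixed, one way to detect the condition is to check whether the target partition path already holds files before running the insert overwrite. The following is only a minimal sketch, not something Hive provides: the function name partition_path_has_files and the HDFS_CMD override variable are hypothetical, and it assumes the standard hdfs dfs -test and hdfs dfs -count commands are available on the client.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive or Hadoop): succeeds when the given
# partition path exists and already contains at least one file.
# HDFS_CMD is an assumed override hook; it defaults to the real `hdfs` CLI.
partition_path_has_files() {
  pphf_path="$1"
  pphf_hdfs="${HDFS_CMD:-hdfs}"

  # `hdfs dfs -test -d` exits 0 only if the path exists and is a directory.
  "$pphf_hdfs" dfs -test -d "$pphf_path" || return 1

  # `hdfs dfs -count` prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
  pphf_files=$("$pphf_hdfs" dfs -count "$pphf_path" | awk '{print $2}')
  [ "${pphf_files:-0}" -gt 0 ]
}
```

Called with the ds=1 path from the first step before the insert, the helper would be expected to return success, which signals that the path should be inspected or cleaned manually first.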