[ https://issues.apache.org/jira/browse/HIVE-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Barna Zsombor Klara updated HIVE-17001:
---------------------------------------
Status: Open (was: Patch Available)
Cancelling the patch: after some discussion it was decided that this should
not be treated as a bug. Data in the directory could have been copied there
on purpose by the user and should not be deleted without a warning.
> Insert overwrite table doesn't clean partition directory on HDFS if partition
> is missing from HMS
> -------------------------------------------------------------------------------------------------
>
> Key: HIVE-17001
> URL: https://issues.apache.org/jira/browse/HIVE-17001
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore
> Reporter: Barna Zsombor Klara
> Assignee: Barna Zsombor Klara
> Attachments: HIVE-17001.01.patch
>
>
> INSERT OVERWRITE TABLE should clear existing data before creating the new
> data files.
> For a partitioned table, the folders of existing partitions on HDFS are
> cleaned. However, if the partition folder exists only on HDFS and the
> partition definition is missing from HMS, the folder is not cleared.
> Reproduction steps:
> 1. CREATE TABLE test( col1 string) PARTITIONED BY (ds string);
> 2. INSERT INTO test PARTITION(ds='p1') values ('a');
> 3. Copy the data file to a different folder under a different name.
> 4. ALTER TABLE test DROP PARTITION (ds='p1');
> 5. Recreate the partition directory, then copy the data file back into it
> and rename it back.
> 6. INSERT OVERWRITE TABLE test PARTITION(ds='p1') values ('b');
> 7. SELECT * from test;
> The query returns 2 records instead of 1.
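
The reproduction steps above can be sketched with a small toy model. This is
not Hive code: ToyTable, its methods and the file names are hypothetical, and
the model only assumes what the description states, namely that the overwrite
cleans a partition folder only when the partition is registered in HMS.

```python
# Toy model of the reported behaviour (NOT Hive's implementation).
# "metastore" stands in for HMS, "filesystem" for the table folder on HDFS.

class ToyTable:
    def __init__(self):
        self.metastore = set()   # partitions registered in the metastore
        self.filesystem = {}     # partition dir -> {file name: list of rows}

    def insert_into(self, part, rows, file_name):
        # Registers the partition and writes one data file into its folder.
        self.metastore.add(part)
        self.filesystem.setdefault(part, {})[file_name] = list(rows)

    def drop_partition(self, part):
        # Assumption: managed table, so metadata and folder are both removed.
        self.metastore.discard(part)
        self.filesystem.pop(part, None)

    def insert_overwrite(self, part, rows, file_name):
        # Reported behaviour: the folder is cleaned only for known partitions.
        if part in self.metastore:
            self.filesystem[part] = {}
        self.insert_into(part, rows, file_name)

    def select_all(self):
        # Reads every file in the folders of registered partitions.
        rows = []
        for part in self.metastore:
            for file_rows in self.filesystem.get(part, {}).values():
                rows.extend(file_rows)
        return sorted(rows)


def reproduce():
    t = ToyTable()                                   # step 1
    t.insert_into("ds=p1", ["a"], "000000_0")        # step 2
    backup = dict(t.filesystem["ds=p1"])             # step 3: copy elsewhere
    t.drop_partition("ds=p1")                        # step 4
    # Step 5: recreate the folder by hand; the metastore is not updated.
    t.filesystem["ds=p1"] = {"restored_" + name: rows
                             for name, rows in backup.items()}
    t.insert_overwrite("ds=p1", ["b"], "000000_0")   # step 6
    return t.select_all()                            # step 7


if __name__ == "__main__":
    print(reproduce())  # ['a', 'b']: two records instead of one
```

In this model, an overwrite of a partition that is still registered leaves
only the new row, which matches the behaviour for existing partitions.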