[
https://issues.apache.org/jira/browse/IMPALA-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on IMPALA-14189 started by Daniel Becker.
----------------------------------------------
> Clean up subdirectories in truncate/insert overwrite if recursive listing is
> enabled
> -----------------------------------------------------------------------------------
>
> Key: IMPALA-14189
> URL: https://issues.apache.org/jira/browse/IMPALA-14189
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Csaba Ringhofer
> Assignee: Daniel Becker
> Priority: Critical
>
> Currently Impala doesn't delete files in subdirectories while Hive does,
> even though both Hive and Impala do recursive listing by default in external
> tables (can be disabled with
> impala.disable.recursive.listing).
> insert overwrite: deletes subdirectories for partitioned tables, but does not
> delete them for non-partitioned tables
> truncate: never deletes subdirectories
> Example:
> {code}
> show files in texternal; -- returns a single file in a subdirectory (nested_dir)
> -> hdfs://localhost:20500/test-warehouse/texternal/nested_dir/a.txt
> truncate texternal;
> show files in texternal; -- returns the same result
> -> hdfs://localhost:20500/test-warehouse/texternal/nested_dir/a.txt
> insert overwrite texternal select * from texternal;
> show files in texternal; -- the file in the subdir is still kept after insert overwrite
> -> hdfs://localhost:20500/test-warehouse/texternal/f549975b8cf16b86-19a0de0d00000000_1586861351_data.0.txt
> -> hdfs://localhost:20500/test-warehouse/texternal/nested_dir/a.txt
> {code}
> Hive deletes subdirectories during both truncate and insert overwrite
> (it probably skips hidden folders, but I didn't check).
> I think that the correct solution would be to always delete the files that
> are considered part of the table.
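
The proposed rule ("delete exactly the files that are considered part of the table") can be sketched as below. This is only an illustration on a local filesystem using Python's pathlib, not Impala's catalog code. The function names and the hidden-name rule (names starting with "." or "_" are skipped) are assumptions made for this sketch, and the `recursive` flag stands in for the recursive listing setting.

```python
import os
from pathlib import Path
from typing import Iterator, List

# Assumption for this sketch: entries whose names start with these
# prefixes are treated as hidden and are not part of the table.
HIDDEN_PREFIXES = (".", "_")


def is_hidden(name: str) -> bool:
    return name.startswith(HIDDEN_PREFIXES)


def list_table_files(root: Path, recursive: bool = True) -> Iterator[Path]:
    """Yield the files a listing would treat as part of the table."""
    for entry in sorted(root.iterdir()):
        if is_hidden(entry.name):
            continue
        if entry.is_file():
            yield entry
        elif entry.is_dir() and recursive:
            yield from list_table_files(entry, recursive=True)


def cleanup_table_dir(root: Path, recursive: bool = True) -> List[Path]:
    """Delete every listed file, then prune subdirectories left empty."""
    deleted = list(list_table_files(root, recursive))
    for path in deleted:
        path.unlink()
    if recursive:
        # Walk bottom-up so nested empty directories go before their parents.
        for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
            d = Path(dirpath)
            if d == root:
                continue
            # Leave hidden directories and everything below them alone.
            if any(is_hidden(part) for part in d.relative_to(root).parts):
                continue
            if not any(d.iterdir()):
                d.rmdir()
    return deleted
```

Because the delete step reuses the same listing function that decides what belongs to the table, the two cannot disagree: with `recursive=True` the file in nested_dir is removed, and with `recursive=False` it is neither listed nor deleted.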