[
https://issues.apache.org/jira/browse/FALCON-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893415#comment-13893415
]
Satish Mittal commented on FALCON-60:
-------------------------------------
While testing feed retention with an HCatalog-based feed, I observed that only
the leaf partition dir is deleted; its empty parent dirs are left behind. E.g.,
if the table is partitioned on <date, country>, an empty date dir is left behind.
In FeedEvictor.dropPartition(), we have:
{code}
boolean deleted = true;
if (isTableExternal) { // nuke the dirs if an external table
    final String location = partitionToDrop.getLocation();
    final Path path = new Path(location);
    deleted = path.getFileSystem(new Configuration()).delete(path, true);
}
{code}
For an HCat external table, the following cases arise from the partition
registration step:
1) If no external location is specified, we can safely assume that the HDFS
dirs for the partition were created by HCat in its 'native' format,
key1=value1/key2=value2/..., and delete bottom-up by that many levels.
2) Else if it is a static partition and an external location is specified,
there is no guarantee that the user-specified HDFS location will have
that many levels or follow any particular format.
3) Else if it is a dynamic partition and a custom dynamic path pattern is
specified, we can go through the pattern and figure out the appropriate
number of levels upwards to delete the partition dirs.
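
As a rough illustration of cases 1 and 3, the bottom-up cleanup could look
something like the sketch below. This is not the attached patch. It uses
java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem
calls so that it runs on its own, and the class name, method names, and the
example pattern are all hypothetical. It assumes the leaf partition dir has
already been deleted by dropPartition() and only walks up through empty
ancestors, stopping at the first non-empty one.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Falcon. Local-filesystem analogue of the HDFS cleanup.
final class EmptyParentCleanup {

    // Counts directory levels in a partition path pattern,
    // e.g. "date=${date}/country=${country}" has 2 levels.
    static int partitionDepth(String pattern) {
        int depth = 0;
        for (String segment : pattern.split("/")) {
            if (!segment.isEmpty()) {
                depth++;
            }
        }
        return depth;
    }

    // True if the directory has no entries.
    static boolean isEmptyDir(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    // Walks upwards from the (already deleted) leaf partition dir and removes
    // at most `levels` ancestors, stopping at the first one that is missing
    // or not empty. Returns how many directories were removed.
    static int deleteEmptyParents(Path partitionDir, int levels) throws IOException {
        int removed = 0;
        Path current = partitionDir.getParent();
        for (int i = 0; i < levels && current != null; i++) {
            if (!Files.isDirectory(current) || !isEmptyDir(current)) {
                break;
            }
            Files.delete(current);
            removed++;
            current = current.getParent();
        }
        return removed;
    }
}
```

For a table partitioned on <date, country>, the depth is 2, so after the leaf
(country) dir is dropped the caller would pass depth - 1 = 1 to check only the
date dir and never touch the table root. Case 2 is deliberately not handled,
since nothing can be assumed about a user-specified location.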
> Feed retention doesn't delete empty parent dirs
> -----------------------------------------------
>
> Key: FALCON-60
> URL: https://issues.apache.org/jira/browse/FALCON-60
> Project: Falcon
> Issue Type: Bug
> Affects Versions: 0.4
> Reporter: Shwetha G S
> Assignee: Shaik Idris Ali
> Fix For: 0.4
>
> Attachments: FALCON-60-v2.patch, FALCON-60-v3.patch, FALCON-60.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)