[ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017992#comment-14017992 ]
Colin Patrick McCabe commented on HDFS-6382:
--------------------------------------------
The xattrs branch was merged to trunk two weeks ago. Since trunk is where
development happens anyway, you should be able to start now if you like.
Maybe post a design doc first if you want feedback. It seems like the big
question to be answered is: where is this going to live? We have had proposals
for doing this as an MR job, a separate daemon, or part of the balancer. They
all have pros and cons... it would be good to write down the benefits and
disadvantages of each option before making a choice.
I think any of these 3 options is possible and I wouldn't vote against any of
them. It's up to you. If it's a separate daemon, at minimum, we can put it in
contrib/. But you may find that some options have a higher maintenance burden
on you. I also think that users don't like running more daemons if they can
help it. But perhaps there is something I haven't thought of that makes a
separate daemon a good choice.
> HDFS File/Directory TTL
> -----------------------
>
> Key: HDFS-6382
> URL: https://issues.apache.org/jira/browse/HDFS-6382
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client, namenode
> Affects Versions: 2.4.0
> Reporter: Zesheng Wu
> Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to
> back up files on HDFS for some time and then have them deleted
> automatically. For example, we keep only 1 day's logs on local disk due to
> limited disk space, but we need to keep about 1 month's logs in order to
> debug program bugs, so we keep all the logs on HDFS and delete logs that are
> older than 1 month. This is a typical use case for an HDFS TTL, so we
> propose that HDFS support TTLs.
> The details of this proposal are as follows:
> 1. HDFS can support a TTL on a specified file or directory.
> 2. If a TTL is set on a file, the file will be deleted automatically after
> the TTL expires.
> 3. If a TTL is set on a directory, its child files and directories will be
> deleted automatically after the TTL expires.
> 4. A child file's or directory's TTL configuration should override its parent
> directory's.
> 5. A global configuration option is needed to control whether the deleted
> files/directories go to the trash.
> 6. A global configuration option is needed to control whether a directory
> with a TTL should be deleted once the TTL mechanism has emptied it.
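
To make items 2-4 of the proposal concrete, here is a rough sketch of how an effective TTL could be resolved, with the nearest TTL on a path or its ancestors taking precedence. Everything in it is hypothetical: the `Entry` record, the function names, and the in-memory dict standing in for the namespace are illustrative only and are not HDFS APIs. It also assumes that age is measured from the file's modification time, which the proposal does not specify.

```python
import posixpath
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for a file or directory in the namespace."""
    mtime: float                  # last-modified time, seconds since epoch (assumed basis for age)
    ttl: Optional[float] = None   # TTL in seconds set directly on this path, if any
    is_dir: bool = False


def effective_ttl(path: str, entries: Dict[str, Entry]) -> Optional[float]:
    """Return the TTL that applies to `path`.

    Walks from the path up to the root and returns the first TTL found,
    so a child's own TTL overrides its parent directory's (item 4).
    """
    current = path
    while True:
        entry = entries.get(current)
        if entry is not None and entry.ttl is not None:
            return entry.ttl
        if current == "/":
            return None  # no TTL set anywhere on the path
        current = posixpath.dirname(current)


def expired_paths(entries: Dict[str, Entry], now: float) -> List[str]:
    """List files whose age has reached their effective TTL (items 2 and 3).

    Directories are skipped here; whether an emptied directory is removed
    is left to the separate policy described in item 6.
    """
    expired = []
    for path, entry in sorted(entries.items()):
        if entry.is_dir:
            continue
        ttl = effective_ttl(path, entries)
        if ttl is not None and now - entry.mtime >= ttl:
            expired.append(path)
    return expired
```

With a 30-day TTL on /logs and a 90-day TTL on /logs/keep, a 31-day-old file directly under /logs would be selected for deletion, while a file of the same age under /logs/keep would not. Whether selected files are moved to the trash (item 5) is a separate policy decision that this sketch leaves out.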