[ 
https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015535#comment-14015535
 ] 

Hangjun Ye commented on HDFS-6382:
----------------------------------

Thanks, Colin, for your summary. I'd like to try to address some of your 
concerns and questions:

bq. security / correctness concerns: it's easy to make a mistake that could 
bring down the NameNode or entire FS
I agree that's the cost; the developer must be careful and guarantee the code 
quality.

bq. non-generality to systems using s3 or another FS in addition to HDFS
Yes, it's only applicable to HDFS. I guess snapshots are only applicable to 
HDFS too (I could be wrong here, as I haven't read the snapshot code), so it 
shouldn't bring much confusion.

bq. issues with federation (which NN does the cleanup? How do you decide?)
Each NN only takes care of the cleanup of files/directories in its own 
namespace. If we consider TTL as an attribute attached to files/directories, 
there is not much difference between federated and non-federated 
configurations.
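
To illustrate the "TTL as an attribute" idea, here is a minimal hypothetical sketch (plain Python, not actual HDFS code; all names are made up). Each NameNode would hold one such table for its own namespace only, which is why federation changes little: every NN cleans up exactly what it owns. Resolution walks from the path up to the root, so a child's explicit TTL naturally overrides its parent's.

```python
import posixpath

# Hypothetical per-namespace TTL table: path -> TTL in seconds.
ttl_by_path = {
    "/logs": 30 * 24 * 3600,        # keep logs for 30 days by default
    "/logs/audit": 365 * 24 * 3600  # child setting overrides the parent
}

def effective_ttl(path):
    """Nearest ancestor (or the path itself) with an explicit TTL wins."""
    while True:
        if path in ttl_by_path:
            return ttl_by_path[path]
        if path == "/":
            return None  # no TTL anywhere on the chain
        path = posixpath.dirname(path)

print(effective_ttl("/logs/audit/2014-05-01"))  # 31536000 (from /logs/audit)
print(effective_ttl("/logs/app/2014-05-01"))    # 2592000 (inherited from /logs)
print(effective_ttl("/data/x"))                 # None
```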

bq. complexities surrounding our client-side Trash implementation and our 
server-side snapshots
There is not much difference whether this is implemented inside or outside the 
NN.

bq. configuration burden on sysadmins
We need to think about the total cost of ownership. Implementing it inside the 
NN increases HDFS's own configuration burden, for sure, but implementing it in 
a separate system just moves the burden from HDFS to that new system, which 
would have a higher total cost in general.

bq. inability to change the cleanup code without restarting the NameNode
Yes, that's a cost, but it should be minor. Users might change TTL policies 
frequently according to their requirements, but the cleanup code itself 
shouldn't change frequently (unless the implementation is poor).

bq. HA concerns (need to avoid split-brain or lost updates)
That's a good question. We haven't thought this through yet; it seems the 
cleanup code should only run on the active NN, since the standby doesn't have 
the latest updates and can't initiate edits.
It shouldn't introduce split-brain, as it doesn't change the NN's core flow, 
but it should be implemented carefully anyway.
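
A minimal sketch of that gating, assuming a periodic task that checks HA state before each pass (the class and method names are hypothetical, not actual NameNode code):

```python
import logging

log = logging.getLogger("ttl-cleanup")

class TtlCleanupTask:
    """Hypothetical periodic task: the scan runs only while this NN is
    active, so there is a single writer and no split-brain risk between
    the two NameNodes."""

    def __init__(self, ha_state, namesystem):
        self.ha_state = ha_state      # callable returning "active"/"standby"
        self.namesystem = namesystem  # exposes expired_paths(now)/delete(path)

    def run_once(self, now):
        if self.ha_state() != "active":
            return 0                  # standby has no latest edits: never delete
        deleted = 0
        for path in self.namesystem.expired_paths(now):
            try:
                self.namesystem.delete(path)
                deleted += 1
            except OSError as e:
                # cleanup errors go to the NN log for the sysadmin
                log.warning("TTL delete failed for %s: %s", path, e)
        return deleted

# A fake namesystem, just to exercise the gating logic.
class FakeNamesystem:
    def __init__(self, expired):
        self.expired, self.deleted = expired, []
    def expired_paths(self, now):
        return list(self.expired)
    def delete(self, path):
        self.deleted.append(path)

active = TtlCleanupTask(lambda: "active", FakeNamesystem(["/tmp/a"]))
standby = TtlCleanupTask(lambda: "standby", FakeNamesystem(["/tmp/a"]))
print(active.run_once(0), standby.run_once(0))  # 1 0
```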

bq. error handling (where do users find out about errors?)
I haven't thought of any runtime errors (at the cleanup stage) that end users 
would need to be notified of. It should be the sysadmin who cares about errors 
at this stage, and he/she could find them in the logs. For errors that occur 
when users set a TTL through the command line or APIs, the users should be 
notified directly.
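
A tiny sketch of the set-time half of that split, under the assumption of a duration syntax like "7d" (the helper name and syntax are made up, not part of any proposal): invalid input fails immediately and is reported to the caller, rather than surfacing later at cleanup time.

```python
def parse_ttl(spec):
    """Hypothetical validation of a user-supplied TTL such as '7d' or '12h'.
    Invalid input raises immediately, so the CLI/API caller sees the error
    directly instead of discovering it at cleanup time."""
    units = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    if not spec or spec[-1] not in units or not spec[:-1].isdigit():
        raise ValueError("invalid TTL spec: %r" % (spec,))
    return int(spec[:-1]) * units[spec[-1]]

print(parse_ttl("7d"))  # 604800 seconds
```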

bq. semantics: disappearing or time-limited files is an unfamiliar API, not 
like the traditional FS APIs we usually implement
Firstly, there is not much difference whether it's implemented inside or 
outside the NN. Moreover, as long as users have a requirement for TTL-based 
cleanup, it shouldn't be difficult for them to accept such an API.

bq. Making this pluggable doesn't fix any of those problems, and it adds some 
more:
The motivation isn't fixing possible problems with implementing the TTL policy 
on the server side; it's trying to separate the mechanism from specific jobs. 
It provides an elegant approach to implementing such an extension to the NN 
and makes the common part of such extensions reusable.

bq. The only points I've seen raised in favor of doing this in the NameNode 
are:...
IMHO, the major points in favor of doing this in the NN are:
# it's a more natural way for end users: they can work with HDFS directly in 
most cases and don't have to resort to another system for the TTL requirement.
# lower maintenance cost (and possibly lower implementation cost too, though 
that depends on the current state of the NN).

bq. To the second point, HBase doesn't use coprocessors for cleanup jobs... it 
uses them for things like secondary indices, a much better-defined problem.
The HBase coprocessor is just an analogy... possibly not a good one, but I 
can't think of a better one right now. HBase could use coprocessors for 
cleanup jobs. HBase's default cleanup policies are "number of versions" and 
"TTL", which are configured per column family. If you have a special 
requirement to clean up cells based on their content, for example using the 
value of a specific column as the number of versions to keep, you could do it 
with a coprocessor. You could do the same thing in an MR job, for sure. I'm 
not saying that using coprocessors is good practice in general, but for some 
use cases it might be.
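
As an illustration of that content-driven cleanup policy, here is a plain-Python simulation of the decision a coprocessor could make during compaction (this is not actual coprocessor code, and the column names are made up):

```python
# Simulate an HBase-style row: per-column lists of (timestamp, value)
# versions, newest first.  One column, "meta:keep", stores how many
# versions of the data column should be retained for this row.
row = {
    "meta:keep": [(3, b"2")],
    "data:payload": [(5, b"v5"), (4, b"v4"), (3, b"v3"), (2, b"v2")],
}

def apply_content_ttl(row, data_col="data:payload", keep_col="meta:keep"):
    """Keep only the N newest versions of data_col, where N is read from
    the row's own keep_col value rather than from a fixed per-column-family
    setting."""
    keep = int(row[keep_col][0][1])   # latest value of meta:keep
    row[data_col] = row[data_col][:keep]
    return row

apply_content_ttl(row)
print(row["data:payload"])  # [(5, b'v5'), (4, b'v4')]
```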

A little background on what we are doing: both Zesheng and I are from Xiaomi, 
a fast-growing mobile internet company in China. We are on a team that 
supports the company's data infrastructure using the open-source Hadoop 
ecosystem, and our role might be similar to that of some teams at Facebook. We 
make improvements to the open-source software per the requirements of our 
products and would like to contribute those improvements back to the 
community. We have contributed quite a few patches to the HBase community, and 
two members of our team, [~xieliang007] and [~fenghh], recently became HBase 
committers. We improve HDFS at the same time and are also happy to collaborate 
with the community.

For this specific feature proposal, a NN-side TTL implementation and a general 
NN extension mechanism, its feasibility isn't very clear to us, as it's just 
an idea so far. We'd like to spend time investigating its feasibility further. 
It's still preferable if feasible. If we encounter insurmountable technical 
challenges, we will give up, for sure. So how about keeping this JIRA issue 
open for now (we might open another JIRA issue to track the general NN 
extension mechanism), and we will get back after we've done the investigation? 
Whatever approach we choose eventually, we always appreciate your help in 
working out the solution.

> HDFS File/Directory TTL
> -----------------------
>
>                 Key: HDFS-6382
>                 URL: https://issues.apache.org/jira/browse/HDFS-6382
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, namenode
>    Affects Versions: 2.4.0
>            Reporter: Zesheng Wu
>            Assignee: Zesheng Wu
>
> In production environments, we often have a scenario like this: we want to 
> back up files on HDFS for some time and then delete them automatically. For 
> example, we keep only one day's logs on local disk due to limited disk 
> space, but we need about one month's logs in order to debug program bugs, so 
> we keep all the logs on HDFS and delete logs that are older than one month. 
> This is a typical HDFS TTL scenario, so we propose that HDFS support TTL.
> Following are some details of this proposal:
> 1. HDFS can support TTL on a specified file or directory
> 2. If a TTL is set on a file, the file will be deleted automatically after 
> the TTL expires
> 3. If a TTL is set on a directory, the child files and directories will be 
> deleted automatically after the TTL expires
> 4. A child file/directory's TTL configuration should override its parent 
> directory's
> 5. A global configuration is needed to specify whether the deleted 
> files/directories should go to the trash or not
> 6. A global configuration is needed to specify whether a directory with a 
> TTL should be deleted when it is emptied by the TTL mechanism.
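
Points 2, 5, and 6 of the quoted proposal can be sketched as a single cleanup pass over one directory (a hypothetical simulation in plain Python; the function and flag names are made up, not part of the proposal):

```python
def cleanup_dir(entries, ttl, now, use_trash, delete_empty_dir, trash):
    """entries: {name: mtime} for one directory.  Returns the surviving
    entries and whether the (possibly emptied) directory itself should
    be deleted.  use_trash and delete_empty_dir model the two proposed
    global configuration options."""
    survivors = {}
    for name, mtime in entries.items():
        if now - mtime > ttl:           # point 2: expired, delete it
            if use_trash:
                trash.append(name)      # point 5: optionally via trash
        else:
            survivors[name] = mtime
    # point 6: optionally delete the directory once TTL emptied it
    drop_dir = delete_empty_dir and not survivors
    return survivors, drop_dir

trash = []
left, drop = cleanup_dir({"old.log": 0, "new.log": 90}, ttl=30, now=100,
                         use_trash=True, delete_empty_dir=True, trash=trash)
print(left, drop, trash)  # {'new.log': 90} False ['old.log']
```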



--
This message was sent by Atlassian JIRA
(v6.2#6252)
