[ https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854863#comment-15854863 ]

Aaron Fabbri commented on HADOOP-14041:
---------------------------------------

A couple of thoughts on this patch:

1. I think prune() should be optional for implementations.  prune() is an 
offline algorithm for evicting old metadata from the metadata store.  Some 
implementations (e.g. LocalMetadataStore) will probably want to do this as an 
online algorithm.  When I get around to doing HADOOP-13649, I would probably 
remove the prune() function there and do eviction incrementally as clients 
access the store.  (A sketch of an optional prune() follows after this list.)

2. I think the work here should be broken up into batches, and the prune 
function should take a "batchSleepMsec" parameter: the number of milliseconds 
the implementation sleeps between pruning batches.  This is a simple way to 
give the process a tunable "niceness".  It lets users minimize the impact on 
production jobs by making it much less likely that provisioned capacity will 
be exceeded.  (The sketch after Phase 1 below shows where this parameter 
plugs in.)

3. The directory pruning has a couple of issues.  I'm wondering if we should 
omit directory pruning from v1 of this.  Currently it builds a set of all 
directories in the whole metadata store, in memory, then checks whether each 
one is empty, and prunes it if so.  This could be optimized some, but the 
problems of holding everything in memory, and of potentially breaking the 
"all paths to root are stored" invariant of the DDB data, remain.

Let me share a variation on this algorithm I'm thinking of:

*Phase 1*: prune files.
{noformat}
number_pruned = BATCH_SIZE   # primed so we enter the loop
while number_pruned > 0:
    paths = table_scan(mod_time < cutoff && is_dir == false, limit=BATCH_SIZE)
    do_batched_delete(paths)
    number_pruned = paths.size()
    sleep(batchSleepMsec)
{noformat}
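
To make the batching and the "niceness" knob from point 2 concrete, here is a 
rough Java sketch of Phase 1 against the DynamoDB SDK.  It assumes mod_time 
and is_dir attributes in the table, and deleteBatch() stands in for a 
BatchWriteItem-based delete (DynamoDB caps those at 25 items per request).  
None of this is the actual patch:

{noformat}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

public class PrunePhase1Sketch {

  static void pruneFiles(AmazonDynamoDB ddb, String table, long cutoff,
      int batchSize, long batchSleepMsec) throws InterruptedException {
    Map<String, AttributeValue> values = new HashMap<>();
    values.put(":cutoff", new AttributeValue().withN(Long.toString(cutoff)));
    values.put(":f", new AttributeValue().withBOOL(false));

    Map<String, AttributeValue> lastKey = null;
    do {
      ScanRequest scan = new ScanRequest()
          .withTableName(table)
          // The filter runs server-side but *after* the read, so the
          // scan still consumes read capacity, hence the sleep below.
          .withFilterExpression("mod_time < :cutoff AND is_dir = :f")
          .withExpressionAttributeValues(values)
          .withLimit(batchSize)
          .withExclusiveStartKey(lastKey);
      ScanResult result = ddb.scan(scan);
      deleteBatch(ddb, table, result.getItems());
      lastKey = result.getLastEvaluatedKey();
      Thread.sleep(batchSleepMsec);  // the tunable "niceness"
    } while (lastKey != null);
  }

  static void deleteBatch(AmazonDynamoDB ddb, String table,
      List<Map<String, AttributeValue>> items) {
    // Chunk into BatchWriteItemRequests of <= 25 DeleteRequests each;
    // elided to keep the sketch short.
  }
}
{noformat}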

*Phase 2*: prune directories.
Change the meaning of mod_time for directories in DDB: it becomes the create time.
{noformat}
number_pruned = BATCH_SIZE
while number_pruned > 0:
    # resume the scan from the previous position: non-empty old dirs
    # are skipped rather than deleted, so restarting from the top
    # would keep returning them
    paths = table_scan(mod_time < cutoff && is_dir == true, limit=BATCH_SIZE)
    emptyOldDirs = paths.filter(path -> isEmptyDir(path))
    do_batched_delete(emptyOldDirs)
    number_pruned = paths.size()
    sleep(batchSleepMsec)
{noformat}
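
isEmptyDir() itself need not list the directory.  Assuming the table is keyed 
by parent path (a "parent" hash key, as in the current DDB schema), it can be 
a limit-1 query; a sketch, with illustrative names:

{noformat}
import java.util.HashMap;
import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.QueryRequest;

public class IsEmptyDirSketch {

  /** True iff no entry names this path as its parent (sketch only). */
  static boolean isEmptyDir(AmazonDynamoDB ddb, String table, String path) {
    Map<String, AttributeValue> values = new HashMap<>();
    values.put(":p", new AttributeValue().withS(path));
    QueryRequest query = new QueryRequest()
        .withTableName(table)
        .withKeyConditionExpression("parent = :p")
        .withExpressionAttributeValues(values)
        .withLimit(1)               // one child proves non-emptiness
        .withConsistentRead(true);  // don't miss a just-created child
    return ddb.query(query).getCount() == 0;
  }
}
{noformat}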

Phase 2 is still subject to races where a file is placed into a directory right 
after we evaluate isEmptyDir(path).  Solving this with DDB doesn't seem 
trivial.  For now we could expose an option on prune() that lets the caller 
choose between pruning just files, or files and directories, with the caveat 
that directory pruning should not be run while other clients are actively 
modifying the filesystem.
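
Concretely, the option could be as small as this (a hedged sketch; names are 
illustrative, not a proposed final API):

{noformat}
import java.io.IOException;

public interface PrunableMetadataStore {

  /** What the caller wants pruned (illustrative names, not final API). */
  enum PruneMode {
    FILES_ONLY,       // safe to run alongside active clients
    FILES_AND_DIRS    // only when no other clients are modifying the FS
  }

  void prune(long cutoff, PruneMode mode) throws IOException;
}
{noformat}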


> CLI command to prune old metadata
> ---------------------------------
>
>                 Key: HADOOP-14041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14041
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.


