[
https://issues.apache.org/jira/browse/HADOOP-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372638#comment-16372638
]
Steve Loughran commented on HADOOP-15193:
-----------------------------------------
I'm thinking of doing this with some explicit {{BulkOperationInfo extends
Closeable}} class and
{code}
BulkOperationInfo initiateDirectoryDelete(Path path);
// every path must be under the path specified in the bulk operation
void deleteBatch(BulkOperationInfo operation, List<Path> paths);
void completeBulkOperation(BulkOperationInfo operation, boolean wasSuccessful);
{code}
This lines us up for adding other bulk operations later, such as an explicit
rename.
Why this way? It allows us to tell the store that the batches are all part of
the same rmdir call, so there is little/no need to create parent dir markers
or do other per-batch bookkeeping, because the whole operation is expected to
succeed. The complete call can create any markers still needed and choose what
to use as a success/failure marker.
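Roughly, a recursive delete would drive it like this (caller-side sketch only;
the {{partition()}} and {{listAllFilesUnder()}} helpers and the batch size of
25 are placeholders, not real code):
{code}
// hypothetical caller: a recursive delete driving the proposed contract
BulkOperationInfo op = store.initiateDirectoryDelete(dir);
boolean ok = false;
try {
  // 25 matches the DynamoDB BatchWriteItem limit; illustrative only
  for (List<Path> batch : partition(listAllFilesUnder(dir), 25)) {
    store.deleteBatch(op, batch);    // every path in the batch is under dir
  }
  ok = true;
} finally {
  store.completeBulkOperation(op, ok);
}
{code}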
The base impl will do nothing but verify that, in a batch delete, all paths
are valid, then issue the deletes one by one; nothing is done in complete().
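Something like this, assuming the existing single-entry {{delete(Path)}} call;
the {{getRootPath()}} accessor and ancestor check here are illustrative, not
part of the patch:
{code}
// illustrative base implementation: validate every path, then fall back
// to the existing one-by-one delete
public void deleteBatch(BulkOperationInfo op, List<Path> paths)
    throws IOException {
  for (Path p : paths) {
    if (!isAncestorOf(op.getRootPath(), p)) {      // hypothetical helper
      throw new IllegalArgumentException(p + " is not under "
          + op.getRootPath());
    }
  }
  for (Path p : paths) {
    delete(p);                    // existing single-entry MetadataStore call
  }
}

public void completeBulkOperation(BulkOperationInfo op, boolean wasSuccessful)
    throws IOException {
  // no-op here; stores which defer parent-dir marker creation do it now
}
{code}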
> add bulk delete call to metastore API & DDB impl
> ------------------------------------------------
>
> Key: HADOOP-15193
> URL: https://issues.apache.org/jira/browse/HADOOP-15193
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
> Priority: Major
>
> recursive dir delete (and any future bulk delete API like HADOOP-15191)
> benefits from using the DDB bulk table delete call, which takes a list of
> deletes and executes them in a single request. Hopefully this will offer
> better performance.
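For reference, mapping one batch onto a single {{batchWriteItem}} call with
the AWS SDK looks roughly like this; the {{parent}}/{{child}} key layout
matches the S3Guard table, but the method, names and wiring are a sketch, not
the actual DynamoDBMetadataStore code:
{code}
// sketch: delete one batch of paths with a single DynamoDB BatchWriteItem
// call, using the AWS SDK (com.amazonaws.services.dynamodbv2 classes)
void deleteBatchInDDB(AmazonDynamoDB dynamoDB, String tableName,
    List<Path> batch) {
  List<WriteRequest> requests = new ArrayList<>();
  for (Path p : batch) {
    Map<String, AttributeValue> key = new HashMap<>();
    key.put("parent", new AttributeValue(p.getParent().toUri().getPath()));
    key.put("child", new AttributeValue(p.getName()));
    requests.add(new WriteRequest(new DeleteRequest(key)));
  }
  BatchWriteItemResult result = dynamoDB.batchWriteItem(
      new BatchWriteItemRequest().withRequestItems(
          Collections.singletonMap(tableName, requests)));
  // a real implementation must retry result.getUnprocessedItems()
}
{code}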