[
https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349840#comment-16349840
]
Aaron Fabbri edited comment on HADOOP-15191 at 2/2/18 6:17 AM:
---------------------------------------------------------------
I like the idea of an optional batch delete interface, thanks for posting a
patch. Haven't reviewed all the guts yet (busy week) but wanted to discuss the
interface a bit. I'd argue for making it as implementation-independent as
possible, and getting specific only when there are significant advantages: I
think this will be pretty useful for other FS clients too (ADL and WASB come to
mind).
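To keep the discussion concrete, here's roughly the store-neutral shape I'm picturing, based on the {{bulkDelete(List<Path>)}} + page size proposal in the JIRA description (a sketch only; all names hypothetical, not the patch's actual code):
{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.Path;

/**
 * Hypothetical store-neutral bulk delete contract: nothing S3-specific,
 * so ADL, WASB, etc. could implement it too.
 */
public interface BulkDelete {

  /** Largest batch a single bulkDelete() call will accept. */
  int getBulkDeletePageSize();

  /**
   * Delete a batch of paths, ideally in one store round trip.
   * @param pathsToDelete paths to delete; size &lt;= page size
   * @return number of paths included in the delete request
   * @throws IOException IO failure
   */
  int bulkDelete(List<Path> pathsToDelete) throws IOException;
}
{code}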
{noformat}
+ /**
+ * Initiate a bulk delete operation.
+ *
+ * Preconditions:
+ * <pre>
+ * files == list of Path
+ * forall f in files: not exists(FS, f) or isFile(FS, f)
+ * </pre>
+ *
{noformat}
Can you talk a bit about this requirement that only files are listed? Why not
just have a bulk "paths to delete" interface?
{noformat}
+ * Postconditions for a successfully completed operation
+ * <pre>
+ * FS' = FS where forall f in files:
+ * not exists(FS', f) and isDirectory(FS', parent(f))
+ * </pre>
{noformat}
So, you guarantee to leave all parent directories intact, right?
Won't users of this API need to do a bulk file deletion, then traverse and
delete the directories 1-by-1? (Sketch of that pattern below.)
Also, have you considered a simple recursive delete API instead of an
enumeration? (List may be better, just curious on your thinking).
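For illustration, the caller-side pattern that the files-only precondition seems to force (a sketch; assumes the hypothetical {{BulkDelete}} interface above):
{code:java}
import java.io.IOException;
import java.util.Comparator;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch: bulk-delete the files, then remove the emptied directories
 * one at a time, deepest first, since the API only takes files.
 */
public final class TreeCleanup {
  private TreeCleanup() {}

  public static void deleteTree(FileSystem fs, BulkDelete bulk,
      List<Path> files, List<Path> dirs) throws IOException {
    // one batched round trip for the files
    // (paging by getBulkDeletePageSize() omitted for brevity)
    bulk.bulkDelete(files);
    // then one RPC per directory, deepest first so children go before parents
    dirs.sort(Comparator.comparingInt(Path::depth).reversed());
    for (Path dir : dirs) {
      fs.delete(dir, false);   // non-recursive delete of the emptied dir
    }
  }
}
{code}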
{noformat}
+ *
+ * <ol>
+ * <li>All paths in the list which resolve to an entry
+ * MUST refer to files.</li>
+ * <li>If a directory is included in the list, the outcome
+ * is undefined.</li>
+ * <li>The operation is unlikely to be atomic.</li>
+ * <li>If an error occurs, the state of the filesystem is undefined.
+ * Some, all or none of the other files may have been deleted.</li>
{noformat}
Why not return an enumeration of failures, or even successes?
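E.g. something along these lines instead of a bare count (hypothetical, just to show the shape):
{code:java}
import java.util.List;

import org.apache.hadoop.fs.Path;

/**
 * Hypothetical result type: per-path outcomes rather than a count,
 * so callers can report or retry partial failures.
 */
public class BulkDeleteResult {
  private final List<Path> deleted;   // confirmed deleted
  private final List<Path> failed;    // attempted but not deleted

  public BulkDeleteResult(List<Path> deleted, List<Path> failed) {
    this.deleted = deleted;
    this.failed = failed;
  }

  public List<Path> getDeleted() { return deleted; }
  public List<Path> getFailed() { return failed; }
  public boolean isFullySuccessful() { return failed.isEmpty(); }
}
{code}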
{noformat}
+ * <li>It is not expected that the changes in the operation
+ * will be isolated from other, concurrent changes to the FS.</li>
+ * <li>Duplicates may result in multiple attempts to delete the file,
+ * or they may be filtered.</li>
+ * <li>It is not required to be O(1), only that it should scale better
+ * than a sequence of individual file delete operations.</li>
{noformat}
Yeah, can't think of any O(1) implementations ;) Given the next sentence, I
think you could remove this point.
{noformat}
+ * <li>There's no guarantee that for small sets of files bulk deletion
+ * is faster than single deletes. It may even be slower.</li>
{noformat}
There is also the FileSystem approach of leaving a default implementation that
is slow and allowing other FS's to override it with something better. Thoughts
on pros/cons? (Obvious con: FS is bloated already).
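i.e. something like this on {{FileSystem}} itself (sketch of the alternative, not proposed code):
{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch of the "default on FileSystem" alternative: a naive base
 * implementation that loops over delete(), which object stores
 * would override with a genuinely batched request.
 */
public abstract class FileSystemWithBulkDelete extends FileSystem {

  /** Naive default: one delete() round trip per path. */
  public int bulkDelete(List<Path> pathsToDelete) throws IOException {
    int count = 0;
    for (Path p : pathsToDelete) {
      delete(p, false);   // non-recursive: contract says files only
      count++;
    }
    return count;
  }
}
{code}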
{noformat}
+ * <li>It is not an error if a listed file does not exist.</li>
{noformat}
Interesting. I assume you are trying to save round trips here? I don't know
about this one (at first glance it seems like explicit failure would be better,
e.g. we know to retry on S3 inconsistency because the caller is expressing that
the file should exist).
{noformat}
+ * <li>If a path which does not exist is deleted, the filesystem
+ * <i>may</i> still create a fake directory marker where its
+ * parent directory was.</li>
{noformat}
"fake directory marker" seems awfully S3A-specific. We could word this in
terms of just Directories and leave out the s3-specific "marker" word I suppose?
{noformat}
+ * </ol>
+ * Directory markers are relevant for object stores which create them.
+ * For performance, the list of files to delete may not be probed before
+ * a bulk delete request is issued, yet afterwards the store is
+ * expected to contain the parent directories, if present.
+ * Accordingly, an implementation may create an empty marker dir for all
+ * paths passed in, even if they don't refer to files.
+ * @param filesToDelete (possibly empty) list of files to delete
+ * @return the number of files included in the filesystem delete request
+ * after duplicates were discarded.
+ * @throws IOException IO failure.
+ * @throws IllegalArgumentException precondition failure
+ */{noformat}
I will review the actual S3 bulk delete semantics and try to look more at this
tomorrow or Monday. Looking forward to your thinking on the API. Cool stuff.
> Add Private/Unstable BulkDelete operations to supporting object stores for
> DistCP
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-15191
> URL: https://issues.apache.org/jira/browse/HADOOP-15191
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, tools/distcp
> Affects Versions: 2.9.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch,
> HADOOP-15191-003.patch, HADOOP-15191-004.patch
>
>
> Large scale DistCP with the -delete option doesn't finish in a viable time
> because of the final CopyCommitter doing a 1 by 1 delete of all missing
> files. This isn't randomized (the list is sorted), and it's throttled by AWS.
> If bulk deletion of files was exposed as an API, DistCP would issue roughly
> 1/1000 of the REST calls (S3 bulk delete accepts up to 1000 keys per
> request), so would not get throttled.
> Proposed: add an initially private/unstable interface for stores,
> {{BulkDelete}} which declares a page size and offers a
> {{bulkDelete(List<Path>)}} operation for the bulk deletion.