[ https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554589#comment-16554589 ]

Steve Loughran commented on HADOOP-15628:
-----------------------------------------

That's interesting. 

# How did you manage to replicate it? By not granting enough permissions to an 
object?
# Which version have you actually seen this on? <= 2.8.x? HADOOP-11572 tried 
to handle this better; that went into 2.9+.
# In HADOOP-15176 and the 3.1 release we've done a lot of work in this area.

1. We actually rely on a MultiObjectDeleteException being raised on a delete 
failure, which the API says happens "if one or more of the objects couldn't be 
deleted."
2. We don't have a complete policy on what to do here; currently it's 
catch-log-rethrow.
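
For context, a minimal sketch of what catch-log-rethrow looks like against the 
SDK v1 API (this is not the actual S3AFileSystem code, and the class/method 
names here are made up for illustration); MultiObjectDeleteException exposes 
the keys which failed via getErrors():

{code:java}
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BulkDeleteSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(BulkDeleteSketch.class);

  /**
   * Bulk delete with catch-log-rethrow. The SDK raises
   * MultiObjectDeleteException when one or more keys could not be
   * deleted; the keys which were deleted are still gone.
   */
  static void deleteKeys(AmazonS3 s3, String bucket,
      List<DeleteObjectsRequest.KeyVersion> keysToDelete) {
    DeleteObjectsRequest deleteRequest =
        new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
    try {
      s3.deleteObjects(deleteRequest);
    } catch (MultiObjectDeleteException e) {
      // Log every key which couldn't be deleted, then propagate.
      for (MultiObjectDeleteException.DeleteError error : e.getErrors()) {
        LOG.warn("Could not delete {}: {} ({})",
            error.getKey(), error.getCode(), error.getMessage());
      }
      throw e;
    }
  }
}
{code}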

I'm actually going to do some work on this in the next 10 days, because we need 
to handle this for rename() too (it's a copy & delete, after all): 
HADOOP-13936 and HADOOP-15193 are the issues there. It'd be great if you could 
help by testing the 3.2 RC0 against your buckets (expect this in September).

At the same time, we aren't going to be able to recover from the failure. All 
we're going to do is make S3Guard consistent with the remote state (i.e. mark 
deleted files as deleted) and throw again. I don't believe I need to worry 
about the response from DeleteObjects, assuming the SDK's assertion that 
"partial delete failures raise an exception" holds. If you have evidence that 
this is not always the case, and that sometimes partial deletes surface in an 
incomplete result *and no exception is raised in the client*, well, that would 
be cause for concern.
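
If that assertion ever turns out not to hold, a defensive check along these 
lines would catch a partial delete which surfaces only in the result. This is 
just a sketch, not a commitment for the patch, and it assumes the request is 
not in quiet mode (in quiet mode the result omits successful deletions):

{code:java}
import java.io.IOException;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsResult;

public class DeleteVerificationSketch {

  /**
   * Belt-and-braces check: even when no MultiObjectDeleteException is
   * thrown, compare the number of keys requested with the number the
   * service reports as deleted. Only valid with quiet mode off, since
   * in quiet mode the result omits successful deletions.
   */
  static void deleteAndVerify(AmazonS3 s3, String bucket,
      List<DeleteObjectsRequest.KeyVersion> keysToDelete)
      throws IOException {
    DeleteObjectsRequest deleteRequest =
        new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
    DeleteObjectsResult result = s3.deleteObjects(deleteRequest);
    int deleted = result.getDeletedObjects().size();
    if (deleted != keysToDelete.size()) {
      // A partial delete which the SDK did not raise as an exception.
      throw new IOException("Bulk delete under s3a://" + bucket
          + " requested " + keysToDelete.size()
          + " deletions but only " + deleted + " succeeded");
    }
  }
}
{code}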

> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-15628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15628
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
>         Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
>            Reporter: Steve Jacobs
>            Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 API 
> do not check whether all objects have been successfully deleted. In the event 
> of a failure, the API will still return a 200 OK (which isn't checked 
> currently):
> [Current Delete 
> Code|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574]
>  
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
>   DeleteObjectsRequest deleteRequest =
>       new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
>   s3.deleteObjects(deleteRequest);
>   statistics.incrementWriteOps(1);
>   keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the 
> S3Client: 
> [Amazon Code 
> Example|https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.htm]
> {code:java}
> // Verify that the objects were deleted successfully.
> DeleteObjectsResult delObjRes =
>     s3Client.deleteObjects(multiObjectDeleteRequest);
> int successfulDeletes = delObjRes.getDeletedObjects().size();
> System.out.println(successfulDeletes + " objects successfully deleted.");
> {code}
> Bucket policies can be misconfigured, and deletes can fail without any 
> warning to S3A clients.


