[ https://issues.apache.org/jira/browse/HADOOP-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-11572:
------------------------------------
    Attachment: HADOOP-11572-001.patch

Patch 001

this is just a bit of code I started to write in HADOOP-13028; I've pulled it out 
as it was getting too complex to work through within that patch.

This code (roughly as sketched below)
# catches the multi-delete exception
# extracts the list of keys which failed
# retries them as part of a one-by-one deletion.
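
For reference, here's roughly the shape of that approach as a minimal sketch against the AWS SDK v1 classes s3a builds on; the class and method names are mine, not the patch's:

{code:java}
// Sketch only: catch the bulk-delete failure, pull out the keys which
// failed, and retry them one by one. Names here are illustrative.
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;

class MultiDeleteFallback {

  private final AmazonS3 s3;
  private final String bucket;

  MultiDeleteFallback(AmazonS3 s3, String bucket) {
    this.s3 = s3;
    this.bucket = bucket;
  }

  /** Bulk delete; on partial failure, retry the failed keys individually. */
  void deleteKeys(DeleteObjectsRequest request) {
    try {
      s3.deleteObjects(request);
    } catch (MultiObjectDeleteException e) {
      // the exception lists exactly which keys failed and why
      List<MultiObjectDeleteException.DeleteError> errors = e.getErrors();
      for (MultiObjectDeleteException.DeleteError error : errors) {
        // assumes the failure was transient; any failure here simply
        // propagates, which is the gap discussed below
        s3.deleteObject(bucket, error.getKey());
      }
    }
  }
}
{code}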

I've realised this isn't quite the right approach, as it assumes the failures are 
transient and so individual retries may work. Furthermore, there's no handling of 
failures in the one-by-one delete path.

h3. A core way a key delete can fail is that the file has already been deleted.

Retrying doesn't help, and in any case we don't need to retry: the system is 
already in the desired state.

What is needed, then, is

# the failure cause of a single delete, or of each element in a multi-delete, needs to be examined.
# if it is a not-found failure: upgrade it to success.
# if it is some other failure, we need to consider what to do. Something like a 
permissions failure probably merits escalating. What other failures can arise? (See the sketch after this list.)
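
A minimal sketch of that triage, assuming the per-key error codes come back on the SDK's MultiObjectDeleteException; the NoSuchKey/AccessDenied codes are assumptions here, and which codes S3 really sends back is exactly what the tests need to confirm:

{code:java}
// Sketch only: treat "key already gone" as success, escalate anything else.
// The NoSuchKey / AccessDenied codes are assumptions to be confirmed by tests.
import java.io.IOException;

import com.amazonaws.services.s3.model.MultiObjectDeleteException;

final class DeleteErrorTriage {

  /**
   * Examine each failed key from a bulk delete: a not-found failure means
   * the system is already in the desired state, so downgrade it to success;
   * anything else (e.g. a permissions failure) is escalated to the caller.
   */
  static void triage(MultiObjectDeleteException e) throws IOException {
    for (MultiObjectDeleteException.DeleteError error : e.getErrors()) {
      if ("NoSuchKey".equals(error.getCode())) {
        // already deleted: log-and-ignore rather than propagate
        continue;
      }
      throw new IOException("Delete of key " + error.getKey()
          + " failed: " + error.getCode() + ": " + error.getMessage());
    }
  }
}
{code}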

It's critical we add tests for this, as that's the only way to understand what 
AWS S3 will really send back.

My draft test plan here is

# make that removeKeys() call protected/package private
# subclass S3AFileSystem with one which overrides removeKeys() and, prior to calling the superclass, deletes one or more of the keys from S3, so as to trigger failures (a sketch of such a subclass follows below)
# make sure the removeKeys() call still succeeds
# test with the FS.rename() and FS.delete() operations, covering both multi-key and single-key removal paths.

Alongside this: try to delete from a read-only bucket, such as the Amazon Landsat data.
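
For illustration, a rough sketch of the fault-injecting subclass from step 2. It assumes step 1 has already been applied (removeKeys() made protected); the signature and checked exception shown are assumptions, not the real method:

{code:java}
// Sketch only: a fault-injecting S3AFileSystem subclass for the test plan
// above. The removeKeys() signature/visibility are assumed for illustration.
import java.io.IOException;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;

import org.apache.hadoop.fs.s3a.S3AFileSystem;

public class FailureInjectingS3AFileSystem extends S3AFileSystem {

  /** Separate client + bucket used to delete keys behind the superclass's
      back; the test wires these up after initialize(). */
  private AmazonS3 rawS3;
  private String bucket;

  /** Number of keys to delete early, to trigger not-found failures. */
  private int keysToDeleteEarly = 1;

  void setRawClient(AmazonS3 rawS3, String bucket) {
    this.rawS3 = rawS3;
    this.bucket = bucket;
  }

  @Override
  protected void removeKeys(List<DeleteObjectsRequest.KeyVersion> keys)
      throws IOException {
    // delete some of the keys directly, so the bulk delete issued by the
    // superclass includes entries which no longer exist
    int count = Math.min(keysToDeleteEarly, keys.size());
    for (int i = 0; i < count; i++) {
      rawS3.deleteObject(bucket, keys.get(i).getKey());
    }
    // the test then asserts that the normal path still succeeds
    super.removeKeys(keys);
  }
}
{code}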

Like I said: more complex

> s3a delete() operation fails during a concurrent delete of child entries
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-11572
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11572
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Steve Loughran
>         Attachments: HADOOP-11572-001.patch
>
>
> Reviewing the code, s3a has the problem raised in HADOOP-6688: deletion of a 
> child entry during a recursive directory delete is propagated as an 
> exception, rather than ignored as a detail which idempotent operations should 
> just ignore.
> The exception should be caught and, if it is a file-not-found problem, logged 
> rather than propagated.


