[ 
https://issues.apache.org/jira/browse/HADOOP-14036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14036:
-----------------------------------
    Attachment: HADOOP-14036-HADOOP-13345.000.patch

I'm not proposing this for inclusion just yet (although it may turn out to be 
precisely the correct solution); it's a proof-of-concept that demonstrates the 
problem. I see paths getting added to the collections of objects to move in the 
loop I'm modifying, and then also further down, below the comment "We moved all 
the children, now move the top-level dir." That is presumably how the same key 
ends up in the list twice.

I should dig into the listObjects call, as I'm curious why we don't hit this 
problem in many more of the tests and workloads that involve renames. I'm also 
not entirely sure we actually have to move the top-level dir last (although my 
current fix ensures that it is added last). If the move isn't atomic, the 
invariant that parent paths always exist will be violated at some point for 
either the new path or the old path, and this particular operation only adds 
the path to the collection that is later broken into batches. It seems cleaner 
IMO to do it last, as we do now, but I want to think it through a bit more. 
Speak up if you have any insight or opinions there.
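
To make the idea concrete, here is a minimal standalone sketch. It is not the 
attached patch and not the actual DynamoDBMetadataStore code. It assumes the 
duplicate arises because the top-level directory shows up in the listing and is 
then added again afterwards. The class and method names (MoveBatchSketch, 
pathsToMove, toBatches) are hypothetical, and plain strings stand in for paths. 
The batch size of 25 reflects the DynamoDB BatchWriteItem limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MoveBatchSketch {
    /** DynamoDB's BatchWriteItem accepts at most 25 requests per call. */
    static final int BATCH_LIMIT = 25;

    /**
     * Hypothetical helper: drop duplicate paths while keeping listing order,
     * and make sure the top-level directory appears exactly once, last.
     */
    static List<String> pathsToMove(Collection<String> listed, String topLevelDir) {
        Set<String> unique = new LinkedHashSet<>(listed);
        unique.remove(topLevelDir); // it may already have come back from the listing
        unique.add(topLevelDir);    // re-add so it is the final entry
        return new ArrayList<>(unique);
    }

    /** Hypothetical helper: split the de-duplicated paths into batches. */
    static List<List<String>> toBatches(List<String> paths, int limit) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < paths.size(); i += limit) {
            int end = Math.min(i + limit, paths.size());
            batches.add(new ArrayList<>(paths.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // The top-level dir appears twice, as it would if added in the loop
        // and then again after the children.
        List<String> listed = Arrays.asList(
            "/test/dir", "/test/dir/a", "/test/dir/b", "/test/dir");
        List<String> paths = pathsToMove(listed, "/test/dir");
        System.out.println(paths);
        System.out.println(toBatches(paths, BATCH_LIMIT).size() + " batch(es)");
    }
}
```

An insertion-ordered set keeps the children in listing order, so only the 
position of the top-level dir changes. Whether that dir really needs to go 
last is the open question above; the sketch just mirrors the current behaviour.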

> S3Guard: intermittent duplicate item keys failure
> -------------------------------------------------
>
>                 Key: HADOOP-14036
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14036
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-14036-HADOOP-13345.000.patch
>
>
> I see this occasionally when running integration tests with -Ds3guard 
> -Ddynamo:
> {noformat}
> testRenameToDirWithSamePrefixAllowed(org.apache.hadoop.fs.s3a.ITestS3AFileSystemContract)
>   Time elapsed: 2.756 sec  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSServiceIOException: move: 
> com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Provided 
> list of item keys contains duplicates (Service: AmazonDynamoDBv2; Status 
> Code: 400; Error Code: ValidationException; Request ID: 
> QSBVQV69279UGOB4AJ4NO9Q86VVV4KQNSO5AEMVJF66Q9ASUAAJG): Provided list of item 
> keys contains duplicates (Service: AmazonDynamoDBv2; Status Code: 400; Error 
> Code: ValidationException; Request ID: 
> QSBVQV69279UGOB4AJ4NO9Q86VVV4KQNSO5AEMVJF66Q9ASUAAJG)
>         at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:178)
>         at 
> org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.move(DynamoDBMetadataStore.java:408)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerRename(S3AFileSystem.java:869)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:662)
>         at 
> org.apache.hadoop.fs.FileSystemContractBaseTest.rename(FileSystemContractBaseTest.java:525)
>         at 
> org.apache.hadoop.fs.FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed(FileSystemContractBaseTest.java:669)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcces
> {noformat}


