[
https://issues.apache.org/jira/browse/HADOOP-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153334#comment-14153334
]
Steve Loughran commented on HADOOP-10714:
-----------------------------------------
Charles, are you happy with this? If so, I'm +1 for it.
There's one thing I'd like to do, based on my experience setting up the tests,
which we could merge in with this patch: make the authentication property names
in s3a match those of s3n. They do the same thing but have different names.
Making them identical except for s/s3a/s3n/ will aid migration and
documentation.
That change is easy enough to do here, and something we need to do before
releasing this in 2.6.
There's another extension, "convert more HTTP error codes into standard
exceptions", which can be done later.
> AmazonS3Client.deleteObjects() need to be limited to 1000 entries per call
> --------------------------------------------------------------------------
>
> Key: HADOOP-10714
> URL: https://issues.apache.org/jira/browse/HADOOP-10714
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.5.0
> Reporter: David S. Wang
> Assignee: Juan Yu
> Priority: Critical
> Labels: s3
> Attachments: HADOOP-10714-007.patch, HADOOP-10714-1.patch,
> HADOOP-10714.001.patch, HADOOP-10714.002.patch, HADOOP-10714.003.patch,
> HADOOP-10714.004.patch, HADOOP-10714.005.patch, HADOOP-10714.006.patch
>
>
> In the patch for HADOOP-10400, calls to AmazonS3Client.deleteObjects() need
> to be limited to 1000 entries per call. Otherwise we get a MalformedXML
> error similar to:
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS
> Service: Amazon S3, AWS Request ID: 6626AD56A3C76F5B, AWS Error Code:
> MalformedXML, AWS Error Message: The XML you provided was not well-formed or
> did not validate against our published schema, S3 Extended Request ID:
> DOt6C+Y84mGSoDuaQTCo33893VaoKGEVC3y1k2zFIQRm+AJkFH2mTyrDgnykSL+v
> at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
> at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3480)
> at com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:1739)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:388)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:829)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:874)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:878)
> Note that this is mentioned in the AWS documentation:
> http://docs.aws.amazon.com/AmazonS3/latest/API/multiobjectdeleteapi.html
> "The Multi-Object Delete request contains a list of up to 1000 keys that you
> want to delete. In the XML, you provide the object key names, and optionally,
> version IDs if you want to delete a specific version of the object from a
> versioning-enabled bucket. For each key, Amazon S3..."
> Thanks to Matteo Bertozzi and Rahul Bhartia from AWS for identifying the
> problem.
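The fix the patches converge on is to split delete requests so that no single
deleteObjects() call carries more than 1000 keys. A minimal sketch of that
chunking, with the AWS SDK wiring omitted (the `partition` helper and class
name here are illustrative, not Hadoop's actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class S3DeleteBatcher {
    // S3 Multi-Object Delete accepts at most 1000 keys per request.
    static final int MAX_ENTRIES = 1000;

    // Split a key list into sublists of at most MAX_ENTRIES each;
    // each sublist would become one DeleteObjectsRequest.
    static List<List<String>> partition(List<String> keys) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += MAX_ENTRIES) {
            batches.add(keys.subList(i, Math.min(i + MAX_ENTRIES, keys.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("key-" + i);
        }
        // 2500 keys split into batches of 1000, 1000, 500
        List<List<String>> batches = partition(keys);
        System.out.println(batches.size() + " batches, last has "
                + batches.get(batches.size() - 1).size() + " keys");
    }
}
```

With this in place, a rename or bulk delete of more than 1000 objects issues
several requests in sequence instead of one oversized request that S3 rejects
with MalformedXML.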
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)