pitrou commented on a change in pull request #11594:
URL: https://github.com/apache/arrow/pull/11594#discussion_r741752805



##########
File path: cpp/src/arrow/filesystem/s3fs.cc
##########
@@ -563,6 +566,97 @@ class S3Client : public Aws::S3::S3Client {
     req.SetBucket(ToAwsString(bucket));
     return GetBucketRegion(req);
   }
+
+  S3Model::CompleteMultipartUploadOutcome 
CompleteMultipartUploadWithErrorFixup(
+      S3Model::CompleteMultipartUploadRequest&& request) const {
+    // CompletedMultipartUpload can return a 200 OK response with an error
+    // encoded in the response body, in which case we should either retry
+    // or propagate the error to the user (see
+    // 
https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html).
+    //
+    // Unfortunately the AWS SDK doesn't detect such situations but lets them
+    // return successfully (see https://github.com/aws/aws-sdk-cpp/issues/658).
+    //
+    // We work around the issue by registering a DataReceivedEventHandler
+    // which parses the XML response for embedded errors.
+
+    util::optional<AWSError<Aws::Client::CoreErrors>> aws_error;
+
+    auto handler = [&](const Aws::Http::HttpRequest* http_req,
+                       Aws::Http::HttpResponse* http_resp,
+                       long long) {  // NOLINT runtime/int
+      auto& stream = http_resp->GetResponseBody();
+      const auto pos = stream.tellg();
+      const auto doc = 
Aws::Utils::Xml::XmlDocument::CreateFromXmlStream(stream);
+      // Rewind stream for later
+      stream.clear();
+      stream.seekg(pos);
+
+      if (doc.WasParseSuccessful()) {
+        auto root = doc.GetRootElement();
+        if (!root.IsNull()) {
+          ARROW_LOG(INFO) << "... CompleteMultipartUpload XML response root 
node: "
+                          << root.GetName();
+          if (root.GetName() != "CompleteMultipartUploadResult") {
+            // It's not the expected XML response root node, so it should be 
an error.
+            ARROW_LOG(INFO) << "CompletedMultipartUpload got error: "
+                            << doc.ConvertToString();
+            http_resp->SetResponseCode(
+                Aws::Http::HttpResponseCode::INTERNAL_SERVER_ERROR);
+            aws_error = GetErrorMarshaller()->Marshall(*http_resp);
+            // Rewind stream for later
+            stream.clear();
+            stream.seekg(pos);

Review comment:
       `GetErrorMarshaller()->Marshall` will involve XML parsing over the 
response stream again.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to