Harsh J created HADOOP-12970:
--------------------------------

             Summary: Intermittent signature match failures in S3AFileSystem
due to connection closure
                 Key: HADOOP-12970
                 URL: https://issues.apache.org/jira/browse/HADOOP-12970
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 2.7.0
            Reporter: Harsh J
            Assignee: Harsh J


S3AFileSystem's {{copyFile}} implementation calls {{ObjectMetadata#clone()}}, 
and the resulting copy request may fail when the server closes the connection 
used to fetch the metadata (i.e. the response carries a {{Connection: close}} 
header). The header is not stripped when the {{ObjectMetadata}} is created, and 
because we clone that object for the subsequent {{CopyObjectRequest}}, the copy 
request ends up sending {{Connection: close}} as one of its own headers.

This causes signer-related exceptions: the client now includes the 
{{Connection}} header in {{SignedHeaders}}, but the S3 server does not see the 
same value for it ({{Connection}} headers are likely stripped before the S3 
server matches signature hashes), causing a failure like the one below:

{code}
2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
 Signature=MNOPQRSTUVWXYZ[\r][\n]"
…
com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
calculated does not match the signature you provided. Check your key and 
signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
{code}
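
To illustrate why the extra header matters, here is a minimal sketch of how a SigV4-style {{SignedHeaders}} value is derived from the request's header names (lower-cased, sorted, joined with semicolons). It is not the SDK's signer; the class name, method name, and header values are made up for illustration, and it assumes the signer signs every header present on the request.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

class SignedHeadersSketch {
    // Hypothetical helper: lower-case the header names, sort them,
    // and join them with ';' as the SignedHeaders field does.
    static String signedHeaders(Map<String, String> requestHeaders) {
        TreeSet<String> names = new TreeSet<>();
        for (String name : requestHeaders.keySet()) {
            names.add(name.toLowerCase(Locale.ROOT));
        }
        return String.join(";", names);
    }

    public static void main(String[] args) {
        // Placeholder header values, not taken from a real request.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "example-bucket.s3.example.invalid");
        headers.put("Content-Type", "application/octet-stream");
        headers.put("x-amz-date", "20160329T195930Z");
        System.out.println(signedHeaders(headers));
        // prints: content-type;host;x-amz-date

        // A header carried over from the cloned metadata becomes part of
        // the signed set as well.
        headers.put("Connection", "close");
        System.out.println(signedHeaders(headers));
        // prints: connection;content-type;host;x-amz-date
    }
}
```

If anything between the client and the signature check drops or rewrites {{Connection}}, the server computes its hash over a different header set and the signatures no longer match.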

The failure is intermittent because the S3 server does not always add a 
{{Connection: close}} directive to its response, but whenever we receive one 
and clone it, the copy request fails with the above exception. Copy requests 
are common in the context of FileOutputCommitter, where many MR attempt files 
on an {{s3a://}} destination filesystem are moved to their parent directories 
post-commit.

I've also submitted a fix upstream to the AWS Java SDK that strips the 
{{Connection}} header when dealing with {{ObjectMetadata}}; it is pending 
acceptance and release at https://github.com/aws/aws-sdk-java/pull/669. Until 
that release is available and usable by us, we'll need to work around the clone 
approach by manually excluding the {{Connection}} header (not straightforward, 
because the {{metadata}} object is private with no mutable access). We can 
remove this workaround once a release with the upstream fix is available.
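
As a rough sketch of the "manually exclude the header" idea, the snippet below copies a header map while skipping {{Connection}} (matched case-insensitively). It works on a plain {{java.util.Map}} only; the class and method names are hypothetical, and how the headers would be read from and written back to an {{ObjectMetadata}} is deliberately left out, since that is the part that is not straightforward.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MetadataCopySketch {
    // Hypothetical helper: return a copy of the headers without any
    // "Connection" entry, leaving the source map untouched.
    static Map<String, Object> copyWithoutConnection(Map<String, Object> source) {
        Map<String, Object> copy = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            if (!"Connection".equalsIgnoreCase(entry.getKey())) {
                copy.put(entry.getKey(), entry.getValue());
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for headers captured on the
        // metadata response.
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("Content-Type", "application/octet-stream");
        raw.put("Content-Length", 1024L);
        raw.put("Connection", "close");

        Map<String, Object> cleaned = copyWithoutConnection(raw);
        System.out.println(cleaned.keySet());
        // prints: [Content-Type, Content-Length]
    }
}
```

Building a fresh set of headers instead of cloning also makes it easy to drop this filtering again once an SDK release with the upstream fix is in use.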



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
