[ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217513#comment-15217513
 ] 

Harsh J commented on HADOOP-12970:
----------------------------------

Just for posterity: I did get my upstream patch into aws-sdk-java: 
https://github.com/aws/aws-sdk-java/blob/1.10.65/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/internal/AbstractS3ResponseHandler.java#L58,
 its available in AWS Java SDK versions 1.10.65 onwards, so using that version 
in future (within hadoop-aws) will resolve the issue too.

> Intermittent signature match failures in S3AFileSystem due connection closure
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-12970
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12970
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>         Attachments: HADOOP-12970.patch, HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to