[ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215335#comment-15215335
 ] 

Harsh J commented on HADOOP-12970:
----------------------------------

bq. Patch generated 1 ASF License warnings.

This is unrelated to the patch. Per the report below, it seems to have been caused elsewhere:

{code}
Lines that start with ????? in the ASF License report indicate files that do not have an Apache license header:
 !????? /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/dfs.hosts.json
{code}

bq. The patch doesn't appear to include any new or modified tests. Please 
justify why no new tests are needed for this patch. Also please list what 
manual steps were performed to verify this patch.

Writing a test case for this would be non-trivial, as it would involve 
controlling the S3 service to send back a {{Connection: close}} header. I doubt 
there's a way to control or simulate that, as S3 does not always respond with 
{{Connection: close}} to object metadata requests.

> Intermittent signature match failures in S3AFileSystem due to connection closure
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-12970
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12970
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>         Attachments: HADOOP-12970.patch
>
>
> S3AFileSystem's use of the {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail when the connection used to obtain the 
> metadata is closed by the server (i.e. the response carries a 
> {{Connection: close}} header). Because this header is not stripped when the 
> {{ObjectMetadata}} is created, and because we clone that object for use in 
> the next {{CopyObjectRequest}}, the copy request itself ends up carrying a 
> {{Connection: close}} header.
> This causes signer-related exceptions because the client now includes the 
> {{Connection}} header in the {{SignedHeaders}}, but the S3 server does not 
> receive the same value for it ({{Connection}} headers are likely stripped 
> before the S3 server tries to match signature hashes), causing a failure 
> like the one below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 server does not always add a 
> {{Connection: close}} directive to its response, but whenever we receive it 
> AND we clone it, the above exception occurs for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when many 
> of the MR attempt files on an {{s3a://}} destination filesystem are moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream to the AWS Java SDK to strip the 
> {{Connection}} header when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669. 
> Until that release is available and can be used by us, we'll need to work 
> around the clone approach by manually excluding the {{Connection}} header 
> (not straightforward, because the {{metadata}} object is private with no 
> mutable access). We can remove this change in the future once a release 
> with the upstream fix is available.
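
The idea behind the workaround described above can be sketched roughly as follows. This is only an illustration, not the attached patch: it uses a plain {{Map}} as a hypothetical stand-in for the raw header map that {{ObjectMetadata}} keeps private, and the class and method names ({{MetadataCopySketch}}, {{copyWithoutConnection}}) are made up for this example. Against the real SDK, one would presumably have to build a fresh {{ObjectMetadata}} and copy the wanted fields across through its public getters and setters instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataCopySketch {

    /**
     * Copies every header except Connection into a new map.
     * The map is a hypothetical stand-in for ObjectMetadata's private state.
     */
    static Map<String, Object> copyWithoutConnection(Map<String, Object> source) {
        Map<String, Object> copy = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : source.entrySet()) {
            // HTTP header names are case-insensitive, so compare accordingly.
            if (!"Connection".equalsIgnoreCase(e.getKey())) {
                copy.put(e.getKey(), e.getValue());
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        // Simulated metadata response in which the server asked to close the connection.
        Map<String, Object> fetched = new LinkedHashMap<>();
        fetched.put("Content-Length", 1024L);
        fetched.put("Content-Type", "application/octet-stream");
        fetched.put("Connection", "close");

        // The copy is what would be attached to the follow-up copy request.
        Map<String, Object> forCopyRequest = copyWithoutConnection(fetched);
        System.out.println(forCopyRequest.keySet()); // [Content-Length, Content-Type]
    }
}
```

Building a new object rather than mutating the fetched one leaves the original response metadata untouched, which also makes the exclusion logic easy to check without contacting S3.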



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
