Anis Elleuch created HADOOP-15267:
-------------------------------------

             Summary: S3A fails to store my data when the multipart size is set to 5 MB and SSE-C encryption is enabled
                 Key: HADOOP-15267
                 URL: https://issues.apache.org/jira/browse/HADOOP-15267
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 3.1.0
         Environment: Hadoop 3.1 Snapshot
            Reporter: Anis Elleuch
         Attachments: hadoop-fix.patch

When I run Spark on Hadoop 3.1.0 with SSE-C encryption enabled and 
fs.s3a.multipart.size set to 5 MB, storing data in AWS S3 no longer works. For 
example, running the following code:
{code}
>>> df1 = spark.read.json('/home/user/people.json')
>>> df1.write.mode("overwrite").json("s3a://testbucket/people.json")
{code}
shows the following exception:
{code:java}
com.amazonaws.services.s3.model.AmazonS3Exception: The multipart upload 
initiate requested encryption. Subsequent part requests must include the 
appropriate encryption parameters.
{code}
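For reference, the setup reduces to three S3A properties. Below is a minimal 
standalone sketch of the reproduction, equivalent to what Spark ends up doing 
with the settings above (the class name, bucket and key value are placeholders, 
and credentials are assumed to come from the usual S3A provider chain):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SseCMultipartRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // SSE-C with a customer-provided key (base64-encoded 256-bit AES key).
    conf.set("fs.s3a.server-side-encryption-algorithm", "SSE-C");
    conf.set("fs.s3a.server-side-encryption.key", "<base64-encoded-key>");
    // 5 MB is the smallest part size S3 accepts, so any larger write
    // becomes a multipart upload.
    conf.set("fs.s3a.multipart.size", "5242880");

    Path dest = new Path("s3a://testbucket/repro-object");
    try (FileSystem fs = dest.getFileSystem(conf);
         FSDataOutputStream out = fs.create(dest, true)) {
      byte[] chunk = new byte[1024 * 1024];
      // Write ~12 MB so the upload spans several parts; the failure
      // surfaces while the parts are uploaded, at the latest on close().
      for (int i = 0; i < 12; i++) {
        out.write(chunk);
      }
    }
  }
}
{code}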
After some investigation, I discovered that hadoop-aws does not send the SSE-C 
headers in the Upload Part requests, as required by the AWS specification: 
[https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html]
{quote}
If you requested server-side encryption using a customer-provided encryption 
key in your initiate multipart upload request, you must provide identical 
encryption information in each part upload using the following headers.
{quote}
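In AWS SDK terms, this means every UploadPartRequest must carry the same 
SSECustomerKey that was set on the initiate request. Below is a rough sketch 
against the AWS Java SDK v1; the helper class and method names are mine and 
not taken from the attached patch, which fixes this inside hadoop-aws itself:
{code:java}
import java.io.InputStream;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.SSECustomerKey;
import com.amazonaws.services.s3.model.UploadPartRequest;
import com.amazonaws.services.s3.model.UploadPartResult;

final class SseCPartUpload {
  /**
   * Uploads one part of an SSE-C multipart upload, re-sending the same
   * customer key that was used on the initiate request.
   */
  static UploadPartResult uploadEncryptedPart(AmazonS3 s3, String bucket,
      String key, String uploadId, int partNumber, long partSize,
      InputStream partData, String base64Key) {
    SSECustomerKey sseKey = new SSECustomerKey(base64Key);
    UploadPartRequest request = new UploadPartRequest()
        .withBucketName(bucket)
        .withKey(key)
        .withUploadId(uploadId)
        .withPartNumber(partNumber)
        .withPartSize(partSize)
        .withInputStream(partData)
        // Without this, S3 rejects the part with the error quoted above.
        .withSSECustomerKey(sseKey);
    return s3.uploadPart(request);
  }
}
{code}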
 
You can find a patch attached to this issue that better clarifies the problem 
and a possible fix.




