[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-13887:
------------------------------------
Release Note:
We cannot guarantee that client-side encryption works in applications which use
the Hadoop FileSystem APIs.
Client-side encryption breaks a fundamental expectation of much existing code:
that the amount of data you can read from a file equals the size of the file
when listed. Because of padding and round-up, a read may return less data than
the listed length suggests: seek(length(file)-1) can fail,
read(bytes,length(file)) may fail, and so on.
Be assured, code will break. You will find the breakage as you use this
feature. Some of it may be fixable; some of it may not be, at least not easily.
If you do find problems, you are going to have to take it up with the
particular application/library which has the issue, and see whether or not they
can/will fix it. We cannot fix it in the Hadoop S3A filesystem, as this code is
only reporting back the file lengths supplied by the S3 endpoint: if there is a
mismatch, the client code does not know of it. Indeed, it's likely that S3
itself doesn't know of it, because the decryption is taking place on the client.
For this reason, you cannot simply turn on client-side encryption and expect
everything to "just" work. You should restrict it to specific uses which you
can test, such as writing data out for use by an external application, or
carefully importing data uploaded by other processes.
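
The length mismatch described above can be reproduced without S3 at all. The
following is a minimal sketch, not the S3A or AWS SDK implementation: it
assumes AES/CBC with PKCS5 padding purely for illustration (the cipher actually
used by the S3 client may differ), and the class and method names are
hypothetical. It shows that the stored, encrypted length exceeds the plaintext
length, and that a readFully() sized from the listed length then hits EOF.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class; nothing here is part of Hadoop or the AWS SDK.
class PaddedLengthDemo {

    /** Size of the encrypted object for a plaintext of the given size. */
    static int storedLength(int plaintextLength) throws Exception {
        // All-zero key and IV: acceptable only because this measures lengths.
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES");
        IvParameterSpec iv = new IvParameterSpec(new byte[16]);
        // Assumption: a padded block cipher mode, chosen for illustration.
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, iv);
        return cipher.doFinal(new byte[plaintextLength]).length;
    }

    /**
     * Simulates a caller that sizes its buffer from the listed (encrypted)
     * length and then calls readFully() on the decrypted stream.
     * Returns true if the read fails with EOFException.
     */
    static boolean readFullyFails(int plaintextLength) throws Exception {
        byte[] buffer = new byte[storedLength(plaintextLength)];
        // The decrypted stream only holds plaintextLength bytes.
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(new byte[plaintextLength]))) {
            in.readFully(buffer);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        for (int n : new int[] {0, 10, 16, 100}) {
            System.out.println("plaintext=" + n
                + " stored=" + storedLength(n)
                + " readFullyFails=" + readFullyFails(n));
        }
    }
}
```

Under these assumptions a 10-byte plaintext is stored as 16 bytes, and a
16-byte plaintext as 32, so any code that trusts the listed length will try to
read past the end of the decrypted data.
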
> Support for client-side encryption in S3A file system
> -----------------------------------------------------
>
> Key: HADOOP-13887
> URL: https://issues.apache.org/jira/browse/HADOOP-13887
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Jeeyoung Kim
> Assignee: Igor Mazur
> Priority: Minor
> Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch,
> HADOOP-13887-branch-2-003.patch, HADOOP-13897-branch-2-004.patch,
> HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch,
> HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch,
> HADOOP-13897-branch-2-010.patch, HADOOP-13897-branch-2-012.patch,
> HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch,
> HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch
>
>
> Expose the client-side encryption option documented in Amazon S3
> documentation -
> http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> Currently this is not exposed in Hadoop but it is exposed as an option in AWS
> Java SDK, which Hadoop currently includes. It should be trivial to propagate
> this as a parameter passed to the S3client used in S3AFileSystem.java