[
https://issues.apache.org/jira/browse/MAPREDUCE-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun Suresh updated MAPREDUCE-5890:
-----------------------------------
Attachment: MAPREDUCE-5890.11.patch
Updating the patch.
Made some formatting and variable-name changes.
Performed some backward compatibility testing:
Tested on a cluster with half the nodes running with the patch and half
running without it. Set the encryption flag to OFF and ran some jobs that
utilized the whole cluster. Ensured that there weren't any failures.
[~chris.douglas],
bq. Has this been tested in a cluster? Would the perf hit be simple to measure?
Given that the extra bits (IV and offset) are sent only if encryption is
turned ON, I ran some basic terasort tests and could not find any
perceptible difference in performance. But I guess there are various variables
that can be tuned during testing. For example, I can play around with
{{mapreduce.task.io.sort.mb}} and {{mapreduce.map.sort.spill.percent}} to vary
the number of spills/on-disk merges. So, to answer your question, I don't think
it is easy to measure the performance hit.
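To illustrate why those two settings change the amount of spilling, here is a
rough back-of-the-envelope sketch in Java. It assumes a simplified model in
which a map task spills each time the sort buffer fills to
{{mapreduce.task.io.sort.mb}} * {{mapreduce.map.sort.spill.percent}}; it
ignores record metadata, combiners and compression, so real spill counts will
differ. The names {{SpillEstimate}} and {{estimateSpills}} and the 1 GiB map
output size are hypothetical and are not part of the patch.

```java
class SpillEstimate {

    /**
     * Rough estimate of the number of spills for one map task.
     * Assumption: a spill is triggered each time the buffered output
     * reaches sortMb * spillPercent; metadata overhead is ignored.
     */
    static long estimateSpills(long mapOutputBytes, int sortMb, double spillPercent) {
        long bufferBytes = (long) sortMb * 1024L * 1024L;
        long thresholdBytes = (long) (bufferBytes * spillPercent);
        if (thresholdBytes <= 0) {
            throw new IllegalArgumentException("spill threshold must be positive");
        }
        // Ceiling division: a partially filled buffer is still flushed once.
        return (mapOutputBytes + thresholdBytes - 1) / thresholdBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L; // hypothetical per-task map output

        // Example values for the two settings named above.
        System.out.println("sort.mb=100, spill.percent=0.80 -> "
                + estimateSpills(oneGiB, 100, 0.80) + " spills");
        System.out.println("sort.mb=512, spill.percent=0.80 -> "
                + estimateSpills(oneGiB, 512, 0.80) + " spills");
    }
}
```

Under this model, raising the buffer from 100 MB to 512 MB drops the estimate
for 1 GiB of map output from 13 spills to 3, which is one way to vary how much
intermediate data goes through the on-disk path during a test run.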
> Support for encrypting Intermediate data and spills in local filesystem
> -----------------------------------------------------------------------
>
> Key: MAPREDUCE-5890
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5890
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: security
> Affects Versions: 2.4.0
> Reporter: Alejandro Abdelnur
> Assignee: Arun Suresh
> Labels: encryption
> Attachments: MAPREDUCE-5890.10.patch, MAPREDUCE-5890.11.patch,
> MAPREDUCE-5890.3.patch, MAPREDUCE-5890.4.patch, MAPREDUCE-5890.5.patch,
> MAPREDUCE-5890.6.patch, MAPREDUCE-5890.7.patch, MAPREDUCE-5890.8.patch,
> MAPREDUCE-5890.9.patch,
> org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-output.txt,
> syslog.tar.gz
>
>
> For some sensitive data, encryption while in flight (network) is not
> sufficient; the data must also be encrypted while at rest.
> HADOOP-10150 & HDFS-6134 bring encryption at rest for data in filesystem
> using Hadoop FileSystem API. MapReduce intermediate data and spills should
> also be encrypted while at rest.