[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated MAPREDUCE-5890:
-----------------------------------

    Attachment: MAPREDUCE-5890.11.patch

Updating the patch.
Made some formatting and variable-name changes.

Performed some backward-compatibility testing:
Tested on a cluster where half the nodes were running with the patch and half 
were not. Set the Encryption flag to OFF and ran some jobs that utilized the 
whole cluster. Ensured that there were no failures.

[~chris.douglas],
bq. Has this been tested in a cluster? Would the perf hit be simple to measure?
Given that the extra bits (IV and offset) are sent only if Encryption is 
turned ON, I ran some basic terasort tests and could not find any 
perceptible difference in performance. But there are various variables 
that can be tuned during testing. For example, I can adjust 
{{mapreduce.task.io.sort.mb}} and {{mapreduce.map.sort.spill.percent}} to vary 
the number of spills/on-disk merges. So, to answer your question, I don't think 
it is easy to measure the performance hit.
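
To illustrate why those two settings change how much spill I/O (and therefore how much encryption work) a test exercises, here is a rough sketch that estimates the number of spills for one map task. It is not part of the patch. The class and method names are hypothetical, and the model is a simplification that assumes a spill starts once buffered output reaches {{mapreduce.task.io.sort.mb}} * {{mapreduce.map.sort.spill.percent}}, ignoring record metadata and combiner effects.

```java
// Hypothetical helper, not Hadoop code: a back-of-the-envelope spill estimate.
final class SpillEstimate {

    // Approximate number of buffered bytes at which a spill is triggered:
    // sort buffer size (in MB) times the spill percentage.
    static long spillThresholdBytes(int sortMb, double spillPercent) {
        return (long) (sortMb * 1024L * 1024L * spillPercent);
    }

    // Rough count of spill files for a map task that emits mapOutputBytes.
    // Assumes at least one spill (the final flush) and ignores per-record
    // metadata, so real jobs may spill somewhat more often than this.
    static long estimateSpills(long mapOutputBytes, int sortMb, double spillPercent) {
        long threshold = spillThresholdBytes(sortMb, spillPercent);
        if (threshold <= 0) {
            throw new IllegalArgumentException("sortMb and spillPercent must be positive");
        }
        long spills = (mapOutputBytes + threshold - 1) / threshold; // ceiling division
        return Math.max(1L, spills);
    }

    public static void main(String[] args) {
        long mapOutputBytes = 1024L * 1024L * 1024L; // assumed 1 GiB of map output

        // Compare a few (sort.mb, spill.percent) combinations.
        int[] sortMbValues = {100, 200, 400};
        double[] spillPercents = {0.5, 0.8};
        for (int sortMb : sortMbValues) {
            for (double pct : spillPercents) {
                System.out.printf("sort.mb=%d spill.percent=%.2f -> ~%d spills%n",
                        sortMb, pct, estimateSpills(mapOutputBytes, sortMb, pct));
            }
        }
    }
}
```

Under this model, a smaller buffer or a lower spill percentage gives more spills and more on-disk merges, so any per-spill encryption overhead should show up more clearly in such runs.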

> Support for encrypting Intermediate data and spills in local filesystem
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5890
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5890
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 2.4.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Arun Suresh
>              Labels: encryption
>         Attachments: MAPREDUCE-5890.10.patch, MAPREDUCE-5890.11.patch, 
> MAPREDUCE-5890.3.patch, MAPREDUCE-5890.4.patch, MAPREDUCE-5890.5.patch, 
> MAPREDUCE-5890.6.patch, MAPREDUCE-5890.7.patch, MAPREDUCE-5890.8.patch, 
> MAPREDUCE-5890.9.patch, 
> org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-output.txt, 
> syslog.tar.gz
>
>
> For some sensitive data, encryption while in flight (network) is not 
> sufficient; the data must also be encrypted while at rest. 
> HADOOP-10150 & HDFS-6134 bring encryption at rest for data in the filesystem 
> using the Hadoop FileSystem API. MapReduce intermediate data and spills should 
> also be encrypted while at rest.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
