[
https://issues.apache.org/jira/browse/MAPREDUCE-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun Suresh updated MAPREDUCE-5890:
-----------------------------------
Attachment: MAPREDUCE-5890.10.patch
Updating patch to address all the feedback. Thanks !!
[~tucu00],
bq. copyMapOutput() is unconditionally correct the offset, this seems wrong
With regard to {{Fetcher::copyMapOutput()}} unconditionally setting the offset:
it shouldn't be a problem now, since in the latest patch there is no offset
sent anymore.
bq. No need to define out2, just reuse out
We do still require out2 (I have since renamed the variable), since I need to
keep a reference to the original 'out'. I can then wrap the original 'out' for
each partition of the spill file, and thereby prefix the IV and offset at the
beginning of EACH partition of the spill file. This ensures that I won't have
to send either the stream offset or the IV via the
{{ShuffleHandler}}/{{ShuffleHeader}}.
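As a rough illustration of the per-partition wrapping described above (this is
NOT the code from the patch), the idea can be sketched with plain JDK classes.
Everything named here is hypothetical: the class {{PartitionHeaderSketch}}, the
methods {{wrapPartition}} and {{readPartition}}, the choice of AES/CTR, and the
all-zero demo key are assumptions made only for the example. The offset is
stored in the header as an opaque long; the sketch does not interpret it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only; not the classes or methods used in the patch.
class PartitionHeaderSketch {
    static final int IV_LENGTH = 16;                          // AES block size
    static final int HEADER_LENGTH = IV_LENGTH + Long.BYTES;  // 16 + 8 = 24

    // Assumption: AES in CTR mode, so ciphertext length equals plaintext length.
    static Cipher newCipher(int mode, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Writes the cleartext header (IV + offset) to the ORIGINAL stream, then
    // returns an encrypting wrapper around it for one partition's payload.
    static OutputStream wrapPartition(DataOutputStream rawOut, byte[] key, long offset)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        rawOut.write(iv);
        rawOut.writeLong(offset);
        // Closing the wrapper must not close the original stream, because the
        // next partition is written to the same spill file.
        OutputStream keepOpen = new FilterOutputStream(rawOut) {
            @Override public void write(byte[] b, int off, int len) throws IOException {
                out.write(b, off, len);
            }
            @Override public void close() throws IOException {
                flush();
            }
        };
        return new CipherOutputStream(keepOpen, newCipher(Cipher.ENCRYPT_MODE, key, iv));
    }

    // Reads one partition back: header first, then the encrypted payload.
    static byte[] readPartition(DataInputStream in, byte[] key, int payloadLength)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        in.readFully(iv);
        in.readLong(); // offset from the header; not interpreted in this sketch
        byte[] encrypted = new byte[payloadLength];
        in.readFully(encrypted);
        return newCipher(Cipher.DECRYPT_MODE, key, iv).doFinal(encrypted);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // all-zero demo key, for illustration only
        ByteArrayOutputStream spill = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(spill);
        String[] partitions = {"partition-0 records", "partition-1 records"};
        for (String p : partitions) {
            OutputStream wrapped = wrapPartition(out, key, spill.size());
            wrapped.write(p.getBytes(StandardCharsets.UTF_8));
            wrapped.close(); // finishes this partition, keeps 'out' open
        }
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(spill.toByteArray()));
        for (String p : partitions) {
            int len = p.getBytes(StandardCharsets.UTF_8).length;
            byte[] plain = readPartition(in, key, len);
            System.out.println(new String(plain, StandardCharsets.UTF_8));
        }
    }
}
```

Because each partition carries its own header, a reader that is handed only
one partition's bytes has what it needs to set up decryption, which is why
nothing extra has to travel in the shuffle header in this sketch.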
[~chris.douglas]
To address your concerns about backward compatibility: since we don't touch the
{{ShuffleHandler}} in the latest patch, it should be fine. (I think in my
previous patch I still had the offset sent in the shuffle header; that has
been moved out now.)
bq. but if the IFile format omits the 16 byte IV for each spill, then the only
overhead it's adding is for the checks in the config (most of which can be
pulled into the buffer init and cached)
Just to clarify, we are now sending 24 bytes (16 for the IV and 8 for the long
offset).
> Support for encrypting Intermediate data and spills in local filesystem
> -----------------------------------------------------------------------
>
> Key: MAPREDUCE-5890
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5890
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: security
> Affects Versions: 2.4.0
> Reporter: Alejandro Abdelnur
> Assignee: Arun Suresh
> Labels: encryption
> Attachments: MAPREDUCE-5890.10.patch, MAPREDUCE-5890.3.patch,
> MAPREDUCE-5890.4.patch, MAPREDUCE-5890.5.patch, MAPREDUCE-5890.6.patch,
> MAPREDUCE-5890.7.patch, MAPREDUCE-5890.8.patch, MAPREDUCE-5890.9.patch,
> org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-output.txt,
> syslog.tar.gz
>
>
> For some sensitive data, encryption while in flight (network) is not
> sufficient; the data must also be encrypted while at rest.
> HADOOP-10150 & HDFS-6134 bring encryption at rest for data in filesystem
> using Hadoop FileSystem API. MapReduce intermediate data and spills should
> also be encrypted while at rest.
--
This message was sent by Atlassian JIRA
(v6.2#6252)