[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040237#comment-14040237
 ] 

Chris Douglas commented on MAPREDUCE-5890:
------------------------------------------

bq. Given that current abstraction does not provide a clean cut to hide this 
within the IFile without a significant refactoring throughout the code, I think 
is the least evil.

It's expedient, but this code is already difficult to follow. Arun, would you 
mind making an attempt at refactoring? The current code doesn't have an 
existing abstraction for this, but writing a separate file for every spill just 
to store a few bytes of IV doesn't seem like a reasonable tradeoff in either 
performance or complexity. Adding a metadata block to the {{IFile}} segment or 
adding the IV to the spill index (to be added in the header, as in the current 
patch) would both work.
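
To make the first option concrete, here is a minimal sketch of keeping the IV in a small header at the front of the encrypted segment rather than in a separate per-spill file. It uses only the JDK ({{javax.crypto}}), not the actual {{IFile}} or spill index code. The class name {{IvHeaderSketch}}, its method names, the 16-byte header layout, and the choice of AES/CTR are all assumptions for illustration and are not taken from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration only: not the IFile format and not the patch's code.
final class IvHeaderSketch {
    static final int IV_LENGTH = 16; // AES block size in bytes

    // Writes [IV (plaintext, 16 bytes)][AES/CTR ciphertext of payload] to out.
    // Closing the cipher stream also closes out.
    static void writeSegment(OutputStream out, SecretKey key, byte[] payload)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        out.write(iv); // the IV travels with the segment, so no side file is needed

        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        try (CipherOutputStream enc = new CipherOutputStream(out, cipher)) {
            enc.write(payload);
        }
    }

    // Reads the IV header back, then decrypts the remainder of the stream.
    static byte[] readSegment(InputStream in, SecretKey key)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new DataInputStream(in).readFully(iv); // unbuffered, consumes exactly 16 bytes

        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        try (CipherInputStream dec = new CipherInputStream(in, cipher)) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = dec.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
            return plain.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        // Fixed all-zero key for the demo only; a real job would supply its own key.
        SecretKey key = new SecretKeySpec(new byte[16], "AES");
        byte[] records = "spill records".getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream segment = new ByteArrayOutputStream();
        writeSegment(segment, key, records);

        byte[] back = readSegment(new ByteArrayInputStream(segment.toByteArray()), key);
        System.out.println("round trip ok: " + Arrays.equals(records, back));
    }
}
```

With a layout like this the reader needs nothing beyond the segment itself and the key, so the cleanup path would have no extra IV file to delete. Putting the IV in the spill index instead would follow the same idea, with the 16 bytes stored alongside each index entry.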

A couple nits:
* In {{OnDiskMapOutput}}, the {{disk}} field can stay final, since the only 
assignment is in the constructor
* Minor indentation/braces issue in {{MapTask}}:
{noformat}
+          if (CryptoUtils.isShuffleEncrypted(job))
+          CryptoUtils.deleteIVFile(rfs, filename[i]);
{noformat}

Minor nit: please leave old patches attached to avoid orphaning the discussion 
around them.

> Support for encrypting Intermediate data and spills in local filesystem
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5890
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5890
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 2.4.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Arun Suresh
>              Labels: encryption
>         Attachments: MAPREDUCE-5890.3.patch, 
> org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-output.txt, 
> syslog.tar.gz
>
>
> For some sensitive data, encryption while in flight (network) is not 
> sufficient, it is required that while at rest it should be encrypted. 
> HADOOP-10150 & HDFS-6134 bring encryption at rest for data in filesystem 
> using Hadoop FileSystem API. MapReduce intermediate data and spills should 
> also be encrypted while at rest.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
