[
https://issues.apache.org/jira/browse/MAPREDUCE-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759634#action_12759634
]
Devaraj Das commented on MAPREDUCE-1026:
----------------------------------------
bq. 1. Use a job specific random key, which is included in the URL of the fetch.
Yes.
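
A rough sketch of what point 1 could look like, not the actual patch. The class
name, the key length, the `key` query parameter, and the exact URL layout are
all assumptions made for illustration; only the idea of a per-job random key
carried in the fetch URL comes from the proposal above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper: names and URL layout are illustrative assumptions.
final class ShuffleUrlSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generate a fresh random key for one job, encoded so it is safe in a URL.
    static String newJobKey() {
        byte[] raw = new byte[32]; // 256 bits; the length is an assumption
        RANDOM.nextBytes(raw);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    // Build the fetch URL a reduce would use, with the job key attached.
    static String fetchUrl(String trackerHostPort, String jobId, String mapId,
                           int reduce, String jobKey) {
        return "http://" + trackerHostPort + "/mapOutput?job=" + jobId
                + "&map=" + mapId + "&reduce=" + reduce + "&key=" + jobKey;
    }

    // Server-side check: compare the presented key with the job's key
    // without leaking timing information about where they differ.
    static boolean keyMatches(String expected, String presented) {
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                presented.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String key = newJobKey();
        // Job and attempt ids below are made-up example values.
        System.out.println(fetchUrl("tt1.example.com:50060",
                "job_200909281200_0001",
                "attempt_200909281200_0001_m_000000_0", 0, key));
    }
}
```

The TaskTracker would reject any fetch where `keyMatches` returns false.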
bq. 2. Allow jobs to request encryption of the map output using a second job
specific random key. I assume the configuration boolean would be something like
mapred.job.shuffle.encrypt.
Yes.
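
To make point 2 concrete, here is a minimal sketch under stated assumptions.
It reads the proposed `mapred.job.shuffle.encrypt` flag from a plain
`java.util.Properties` object standing in for the job configuration, and it
uses AES in CTR mode with a random IV prepended to the output. The cipher
choice, key size, and class name are my assumptions; nothing in this issue has
settled them.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Properties;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: cipher, key size, and layout are assumptions.
final class ShuffleEncryptSketch {
    static final String ENCRYPT_FLAG = "mapred.job.shuffle.encrypt";
    private static final int IV_LEN = 16;
    private static final SecureRandom RANDOM = new SecureRandom();

    // Encryption is opt-in: it is off unless the job sets the flag to true.
    static boolean encryptionRequested(Properties jobConf) {
        return Boolean.parseBoolean(jobConf.getProperty(ENCRYPT_FLAG, "false"));
    }

    // The second job-specific random key, separate from the URL key.
    static byte[] newEncryptionKey() {
        byte[] key = new byte[16]; // AES-128
        RANDOM.nextBytes(key);
        return key;
    }

    // Returns IV followed by ciphertext.
    static byte[] encrypt(byte[] key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            RANDOM.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
            byte[] body = c.doFinal(plain);
            byte[] out = new byte[IV_LEN + body.length];
            System.arraycopy(iv, 0, out, 0, IV_LEN);
            System.arraycopy(body, 0, out, IV_LEN, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(): reads the IV from the first 16 bytes.
    static byte[] decrypt(byte[] key, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(data, 0, IV_LEN));
            return c.doFinal(data, IV_LEN, data.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A map task would call `encrypt` only when `encryptionRequested` is true, and
the reduce would call `decrypt` with the same job key after the fetch.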
bq. If the outputs are encrypted, I assume that we checksum the unencrypted
data and include the checksum in the encryption.
I am not sure this is required. The encrypted bytes are checksummed
automatically as we write them to disk. Do we need to build the extra logic to
checksum the unencrypted bytes? That could be a big deal when we have multiple
map output spills that we merge at the end and spill to disk again. I propose
we just live with the (automatic) checksum of the encrypted bytes.
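
To illustrate what I mean, a small sketch follows. It assumes CRC32 purely for
illustration, and the class and method names are hypothetical. The point is
that the checksum is computed over whatever bytes land on disk, so it never
needs to see the unencrypted data.

```java
import java.util.zip.CRC32;

// Hypothetical helper: CRC32 and these names are assumptions for illustration.
final class SpillChecksumSketch {

    // Checksum exactly the bytes written to disk (ciphertext if encryption
    // is enabled, plain map output otherwise).
    static long checksum(byte[] bytesOnDisk) {
        CRC32 crc = new CRC32();
        crc.update(bytesOnDisk, 0, bytesOnDisk.length);
        return crc.getValue();
    }

    // Recompute on read and compare with the stored value.
    static boolean verify(byte[] bytesRead, long expected) {
        return checksum(bytesRead) == expected;
    }
}
```

With this arrangement the merge of multiple spills only has to checksum the
bytes it writes, the same as it would without encryption.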
> Shuffle should be secure
> ------------------------
>
> Key: MAPREDUCE-1026
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1026
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: security
> Reporter: Owen O'Malley
> Assignee: Devaraj Das
>
> Since the user's data is available via http from the TaskTrackers, we should
> require a job-specific secret to access it.