[ https://issues.apache.org/jira/browse/HADOOP-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649588#action_12649588 ]
Jothi Padmanabhan commented on HADOOP-4699:
-------------------------------------------
This makes sense. However, I think we should measure the performance overhead
experimentally to validate the theory. If checksum validation adds no
significant overhead, we might be better off doing it in the Servlet as well,
to avoid transmitting corrupted output.
No?
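
A rough way to take that measurement, as a sketch only: time a checksum pass
over a buffer the size of a typical map output segment. Everything here is an
assumption for illustration. The class name ChecksumOverhead, the 64 MB buffer,
and the use of java.util.zip.CRC32 as a stand-in for the IFile checksum are
hypothetical choices, not taken from the TaskTracker code.

```java
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical micro-benchmark; not part of Hadoop.
final class ChecksumOverhead {

    // Computes a CRC32 over the whole buffer (assumed stand-in for the IFile checksum).
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Returns the average wall-clock nanoseconds per checksum pass over the buffer.
    static long nanosPerPass(byte[] data, int passes) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < passes; i++) {
            sink ^= checksum(data);
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the JIT cannot discard the loop as dead code.
        if (sink == 42) {
            System.out.print("");
        }
        return elapsed / passes;
    }

    public static void main(String[] args) {
        byte[] segment = new byte[64 * 1024 * 1024]; // assumed segment size
        new Random(0).nextBytes(segment);

        nanosPerPass(segment, 5); // warm-up passes, result ignored
        long ns = nanosPerPass(segment, 20);

        double megabytes = segment.length / (1024.0 * 1024.0);
        double mbPerSec = megabytes / (ns / 1e9);
        System.out.printf("CRC32: %d ns per pass, about %.0f MB/s%n", ns, mbPerSec);
    }
}
```

Comparing that rate against the disk and network throughput of the Servlet
would give a first indication of whether validating there is affordable. A
real measurement would still need to run inside the TaskTracker under load.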
> Change TaskTracker.MapOutputServlet to send only the IFile segment, validate
> checksum in Reduce
> -----------------------------------------------------------------------------------------------
>
> Key: HADOOP-4699
> URL: https://issues.apache.org/jira/browse/HADOOP-4699
> Project: Hadoop Core
> Issue Type: Improvement
> Reporter: Chris Douglas
> Assignee: Chris Douglas
> Fix For: 0.20.0
>
>
> Instead of validating the checksum of the IFile segment in MapOutputServlet,
> validation may be left to the reduce. While failures may not be detected
> until late in the reduce, the throughput and CPU improvements should make up
> for it in the average case.
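
The flow described above can be illustrated with a small sketch in which the
serving side forwards the segment bytes untouched and only the receiver
verifies them. The framing (payload followed by an 8-byte CRC32 trailer) and
the names SegmentCheck, seal, serve and open are hypothetical; this is not the
actual IFile layout or the MapOutputServlet API.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration; the trailer format is an assumption, not IFile's.
final class SegmentCheck {

    // Map side: append a CRC32 of the payload as an 8-byte trailer.
    static byte[] seal(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 8)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // Serving side: copy the bytes through without computing any checksum.
    static byte[] serve(byte[] segment) {
        return segment.clone();
    }

    // Reduce side: recompute the CRC32 and compare it with the trailer.
    static byte[] open(byte[] segment) throws IOException {
        if (segment.length < 8) {
            throw new IOException("segment too short to hold a checksum");
        }
        int payloadLength = segment.length - 8;
        CRC32 crc = new CRC32();
        crc.update(segment, 0, payloadLength);
        long expected = ByteBuffer.wrap(segment, payloadLength, 8).getLong();
        if (crc.getValue() != expected) {
            throw new IOException("checksum mismatch in fetched segment");
        }
        return Arrays.copyOf(segment, payloadLength);
    }
}
```

In this arrangement a corrupted segment costs one wasted transfer before the
receiver rejects it, which is the trade-off the description accepts in
exchange for less CPU work on the serving side.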