[ https://issues.apache.org/jira/browse/HADOOP-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609932#action_12609932 ]

Runping Qi commented on HADOOP-3514:
------------------------------------


bq. In this approach, reducers would identify checksum problems only after 
reading and processing all the map output data. If the data corruption had 
happened at the beginning of the output, this would imply doing reduce 
computations that would be discarded anyway.

That is not true. I assume we are talking about checksums for validating map 
output segments in the reduce shuffling phase, not end-to-end validation of 
all the temporary data on the local disks (such as the output of the merge 
sort in reducers). For the latter, we don't have to reinvent the wheel: we had 
the checksum file, and we abandoned it in 0.18 for intermediate data.

For the former, you need to validate a map output segment at the time it is 
fetched, and throw it away if you detect a problem. Why would there be any 
wasted processing in the reducer?
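The fetch-time validation described above could be sketched roughly as follows (a minimal illustration using `java.util.zip.CRC32`; the class and method names here are hypothetical and do not reflect Hadoop's actual IFile/shuffle code):

```java
import java.util.zip.CRC32;

public class SegmentValidator {

    // Recompute the CRC over a fetched map-output segment and compare it
    // against the checksum the mapper recorded. A mismatch means the
    // segment is corrupt and should be discarded and re-fetched.
    static boolean isValid(byte[] segment, long expectedCrc) {
        CRC32 crc = new CRC32();
        crc.update(segment, 0, segment.length);
        return crc.getValue() == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] data = "map output segment".getBytes();

        // Checksum computed on the mapper side, shipped with the segment.
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        long expected = crc.getValue();

        System.out.println(isValid(data, expected)); // intact segment
        data[0] ^= 0xFF;                             // simulate corruption in transfer
        System.out.println(isValid(data, expected)); // corrupted segment fails
    }
}
```

The point is that the check happens per segment at fetch time, so a bad segment costs only its own re-fetch, not any reduce-side computation.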

Taking a step back, we need to think about what real problems we are trying to 
address and what approach strikes the right balance between cost and benefit. 
I haven't seen any quantification here. How often do we see corrupted 
intermediate data? When and where does the corruption happen? If the map 
outputs are corrupted, does it happen on the mapper side, the reducer side, or 
during transfer? Without clear answers to these questions, I don't think it is 
wise to put a lot of effort into the current issue. A simple (and 
non-fool-proof) approach may be sufficient.



> Reduce seeks during shuffle, by inline crcs
> -------------------------------------------
>
>                 Key: HADOOP-3514
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3514
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Devaraj Das
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.19.0
>
>
> The number of seeks can be reduced by half in the iFile if we move the crc 
> into the iFile rather than having a separate file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
