[
https://issues.apache.org/jira/browse/HADOOP-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622519#action_12622519
]
Devaraj Das commented on HADOOP-3514:
-------------------------------------
Some comments:
1) Remove the check for RawLocalFileSystem from Merger.java
2) In TaskTracker's doGet method, the explicit mapOutputIn.close call is not
required, since checksumInputStream.close closes the underlying stream anyway
3) The TaskTracker needn't compute the checksum on the data it is sending out
over the socket (it is already doing validation on the data it is reading from
the fs). It could send the raw bytes over the socket followed by the checksum
bytes.
4) The CheckSumInputStream.read needs to read in a loop until it has read len
bytes or reached EOF, since a single read call on the underlying stream may
return fewer bytes than requested.
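
To illustrate point 4, here is a minimal sketch of such a read loop. It is not
taken from the patch: the class ReadLoopSketch, the helper readFully and the
test stream OneByteStream are hypothetical names used only for illustration,
and the real CheckSumInputStream.read would also have to fold the bytes into
its checksum.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoopSketch {

    // Hypothetical helper, not from the patch: keeps calling read() until
    // len bytes have arrived or the stream reports EOF.
    static int readFully(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        if (len == 0) {
            return 0;
        }
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                break; // EOF before len bytes were available
            }
            total += n;
        }
        // Mirror InputStream.read: -1 only when EOF was hit with nothing read.
        return total == 0 ? -1 : total;
    }

    // Test stream that hands back at most one byte per call, to mimic the
    // short reads a socket or buffered file stream may produce.
    static final class OneByteStream extends FilterInputStream {
        OneByteStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 1));
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        byte[] buf = new byte[4];
        InputStream in = new OneByteStream(new ByteArrayInputStream(data));
        System.out.println(readFully(in, buf, 0, 4)); // 4: full request met
        System.out.println(readFully(in, buf, 0, 4)); // 1: EOF after one byte
        System.out.println(readFully(in, buf, 0, 4)); // -1: already at EOF
    }
}
```

Without the loop, the first call above would have returned after a single
byte, and the caller would have seen a short read even though more data was
available.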
> Reduce seeks during shuffle, by inline crcs
> -------------------------------------------
>
> Key: HADOOP-3514
> URL: https://issues.apache.org/jira/browse/HADOOP-3514
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Devaraj Das
> Assignee: Jothi Padmanabhan
> Fix For: 0.19.0
>
> Attachments: hadoop-3514-v1.patch, hadoop-3514-v2.patch,
> hadoop-3514-v3.patch, hadoop-3514-v4.patch, hadoop-3514-v5.patch,
> hadoop-3514-v6.patch, hadoop-3514-v7.patch, hadoop-3514.patch
>
>
> The number of seeks can be reduced by half in the iFile if we move the crc
> into the iFile rather than having a separate file.