[ https://issues.apache.org/jira/browse/HADOOP-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623895#action_12623895 ]
Jothi Padmanabhan commented on HADOOP-3514:
-------------------------------------------
One advantage of keeping the checksum inside the data file is that the
mechanism could be used outside of IFile too. For example, in HADOOP-3638 we
are considering moving the index file from the localFS to RawLocalFileSystem
with the ChecksumInput/Output stream. Moving the checksum into a separate
file external to the data file would tie these streams closely to IFile alone
and would prevent their use elsewhere.
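
To illustrate the idea only, here is a minimal sketch of an inline checksum:
the CRC is appended to the data itself, so a reader can verify the bytes from
a single sequential read without opening a second file. This is not the IFile
format or the code in the attached patches; the class name InlineCrcSketch,
the methods frame() and verify(), and the 4-byte trailer layout are all
assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration; not Hadoop's IFile or its checksum streams.
class InlineCrcSketch {

    // Returns the payload followed by a 4-byte big-endian CRC32 trailer.
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Integer.BYTES)
                .put(payload)
                .putInt((int) crc.getValue()) // CRC32 fits in 32 bits
                .array();
    }

    // Recomputes the CRC over the payload bytes and compares it with the
    // trailer. Returns the payload, or throws if the data is corrupt.
    static byte[] verify(byte[] framed) {
        int payloadLength = framed.length - Integer.BYTES;
        if (payloadLength < 0) {
            throw new IllegalArgumentException("input shorter than CRC trailer");
        }
        CRC32 crc = new CRC32();
        crc.update(framed, 0, payloadLength);
        // Mask to read the stored 32-bit value as an unsigned long.
        long stored = ByteBuffer.wrap(framed, payloadLength, Integer.BYTES)
                .getInt() & 0xFFFFFFFFL;
        if (crc.getValue() != stored) {
            throw new IllegalStateException("checksum mismatch");
        }
        return Arrays.copyOf(framed, payloadLength);
    }
}
```

Because the framing here depends only on the bytes and not on any IFile
structure, the same two methods could wrap an index file or any other stream,
which is the reuse argument made above.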
> Reduce seeks during shuffle, by inline crcs
> -------------------------------------------
>
> Key: HADOOP-3514
> URL: https://issues.apache.org/jira/browse/HADOOP-3514
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0
> Reporter: Devaraj Das
> Assignee: Jothi Padmanabhan
> Fix For: 0.19.0
>
> Attachments: hadoop-3514-v1.patch, hadoop-3514-v2.patch,
> hadoop-3514-v3.patch, hadoop-3514-v4.patch, hadoop-3514-v5.patch,
> hadoop-3514-v6.patch, hadoop-3514-v7.patch, hadoop-3514-v8.patch,
> hadoop-3514.patch
>
>
> The number of seeks can be reduced by half in the iFile if we move the crc
> into the iFile rather than having a separate file.