[ 
https://issues.apache.org/jira/browse/HADOOP-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HADOOP-4649:
----------------------------------

    Attachment: 4649-2.patch

Thanks for the review.

bq. Is there a reason why you do not use IFileInputStream/IFileOutputStream in 
SpillRecord?
Because it's not an IFile? You're right, the java.util.zip.Checked\*Streams and 
the current IFile\*Streams are very close, but SpillRecords have different 
access patterns, average sizes, etc. compared to IFiles, and I could think of 
no particular reason why changes to one should influence the other. If the 
checksum algorithm for IFiles were to change, it would likely be to something 
like Adler32, which is faster than CRC32 but insufficient for small inputs 
like SpillRecords. Most of the modifications already discussed for IFile that 
might go into IFile\*Stream (record count, for example) would be awkward 
adaptations for SpillRecords. Separating the two was one of the motivations 
for the patch. Did you have a use case in mind, or is there another benefit to 
using the IFile\*Streams?
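
For illustration only, here is a minimal sketch of checksumming a small index 
with the java.util.zip.Checked\*Streams and CRC32. SpillIndexSketch, its 
methods, and the layout (a run of longs followed by a CRC32 trailer) are 
hypothetical and are not taken from the patch; only the java.util.zip and 
java.io calls are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

// Hypothetical stand-in for a spill index; NOT the SpillRecord class from the patch.
final class SpillIndexSketch {

  /** Serializes the entries as longs, then appends the CRC32 of those bytes. */
  static byte[] write(long[] entries) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      CheckedOutputStream checked = new CheckedOutputStream(bytes, new CRC32());
      DataOutputStream out = new DataOutputStream(checked);
      for (long entry : entries) {
        out.writeLong(entry);
      }
      // Capture the checksum of the payload before the trailer is written.
      long sum = checked.getChecksum().getValue();
      out.writeLong(sum);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads count entries back and verifies them against the CRC32 trailer. */
  static long[] read(byte[] data, int count) {
    try {
      CheckedInputStream checked =
          new CheckedInputStream(new ByteArrayInputStream(data), new CRC32());
      DataInputStream in = new DataInputStream(checked);
      long[] entries = new long[count];
      for (int i = 0; i < count; i++) {
        entries[i] = in.readLong();
      }
      // Checksum of the payload bytes consumed so far, taken before the trailer.
      long actual = checked.getChecksum().getValue();
      long expected = in.readLong();
      if (actual != expected) {
        throw new IOException("spill index checksum mismatch");
      }
      return entries;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Nothing in this sketch depends on IFile: the checksum algorithm is chosen where 
the stream is constructed, so a change to one format's checksum need not touch 
the other.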

bq. We do not expect this change to have any performance impact, but could you 
verify just in case, if you have not already.
Sure. I'll bounce this if the results are adverse, but I think we can assume 
that any impact isn't measurable.

Attached a patch removing the dead code you identified. The previous 
test/test-patch results remain valid.

> Improve abstraction for spill indices
> -------------------------------------
>
>                 Key: HADOOP-4649
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4649
>             Project: Hadoop Core
>          Issue Type: Improvement
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: 4649-0.patch, 4649-1.patch, 4649-2.patch
>
>
> In support of changing checksum handling as part of the migration to Jetty6, 
> some of the spill code would be easier to reason about with a different 
> abstraction.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
