[
https://issues.apache.org/jira/browse/HADOOP-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun C Murthy updated HADOOP-3366:
----------------------------------
Attachment: ifile.patch
Here is an early version of my creatively titled SequenceFile replacement for
intermediate data in Map-Reduce (map-outputs)... IFile stands for
"Intermediate File" *smile*.
Unfortunately the Writer isn't as tight as it could be: it needs to copy each
key/value into an internal buffer (see HADOOP-3414 for the details).
However, the Reader seems reasonably tight and makes no copies at all. I
chose to use DataInputBuffer as the key/value type in the call to Reader.next
since it plays nicely: it offers an InputStream interface, it can be handed a
raw buffer to work with, and it can be queried to get the raw buffer back
without _any_ copies being made. I'll continue to plug away and would
appreciate feedback.
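
To make the zero-copy idea concrete, here is a rough, self-contained sketch
using only the JDK. It is not the code in ifile.patch: the View class is a
hypothetical stand-in for DataInputBuffer, and the record layout
([int keyLen][int valLen][key bytes][value bytes]) is an assumption made up
for this illustration, not the actual IFile format. The point is only that
next() re-points the caller's key/value holders at the shared buffer instead
of copying bytes out of it.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; names and record layout are assumptions. */
public class ZeroCopyReaderSketch {

    /** Reusable window onto a shared byte[]; hypothetical stand-in for DataInputBuffer. */
    static final class View {
        byte[] data;
        int start;
        int length;

        /** Re-point this view at a region of an existing buffer; no bytes are copied. */
        void reset(byte[] data, int start, int length) {
            this.data = data;
            this.start = start;
            this.length = length;
        }

        /** Stream over the same region; ByteArrayInputStream reads the array in place. */
        ByteArrayInputStream asStream() {
            return new ByteArrayInputStream(data, start, length);
        }

        /** Decoding to a String does allocate; used here only for printing. */
        String asString() {
            return new String(data, start, length, StandardCharsets.UTF_8);
        }
    }

    /** Walks records of the assumed form [int keyLen][int valLen][key][value]. */
    static final class Reader {
        private final byte[] buf;
        private final ByteBuffer ints; // wraps buf (no copy) to decode big-endian ints
        private int pos = 0;

        Reader(byte[] buf) {
            this.buf = buf;
            this.ints = ByteBuffer.wrap(buf);
        }

        /** Points key and value at the next record; returns false at end of buffer. */
        boolean next(View key, View value) {
            if (pos >= buf.length) {
                return false;
            }
            int keyLen = ints.getInt(pos);
            int valLen = ints.getInt(pos + 4);
            pos += 8;
            key.reset(buf, pos, keyLen);
            pos += keyLen;
            value.reset(buf, pos, valLen);
            pos += valLen;
            return true;
        }
    }

    /** Test helper that serializes {key, value} string pairs in the assumed layout. */
    static byte[] write(String[][] records) {
        int total = 0;
        for (String[] r : records) {
            total += 8 + r[0].getBytes(StandardCharsets.UTF_8).length
                       + r[1].getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (String[] r : records) {
            byte[] k = r[0].getBytes(StandardCharsets.UTF_8);
            byte[] v = r[1].getBytes(StandardCharsets.UTF_8);
            out.putInt(k.length).putInt(v.length).put(k).put(v);
        }
        return out.array();
    }

    public static void main(String[] args) {
        byte[] file = write(new String[][] {{"a", "1"}, {"bb", "22"}});
        Reader reader = new Reader(file);
        View key = new View();
        View value = new View();
        while (reader.next(key, value)) {
            System.out.println(key.asString() + " -> " + value.asString());
        }
    }
}
```

Reusing the same two View objects across calls is the design choice being
illustrated: the caller owns the holders, and the reader only updates their
offsets into the one shared buffer.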
> Shuffle/Merge improvements
> --------------------------
>
> Key: HADOOP-3366
> URL: https://issues.apache.org/jira/browse/HADOOP-3366
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.18.0
>
> Attachments: 3366.1.patch, 3366.1.patch, ifile.patch
>
>
> This is intended to be a meta-issue to track various improvements to
> shuffle/merge in the reducer.