[
https://issues.apache.org/jira/browse/HADOOP-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564375#action_12564375
]
Devaraj Das commented on HADOOP-287:
------------------------------------
Chris, the patch only updates SequenceFile.java, so it will not affect the
sort that the Map tasks use. SequenceFile's sort is in some sense deprecated,
although the code is still lying around. Look at
org.apache.hadoop.mapred.MapTask.MapOutputBuffer to get an idea of how the
sorting infrastructure works. You can plug in quicksort easily (look at
org.apache.hadoop.mapred.MergeSorter, which is the default sort implementation).
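
To make the "plug in quicksort" suggestion concrete, here is a minimal,
self-contained sketch of an in-place quicksort over an array of record
indices. It is an illustration only: the class name InPlaceQuickSortSketch and
the method signatures are hypothetical and are not the MergeSorter or
MapOutputBuffer API, and the real map-side sorter works on raw serialized
bytes rather than on Java objects. The point is that only the index array is
permuted, so no auxiliary array is needed, unlike a merge sort.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch; this is NOT Hadoop's MergeSorter or MapOutputBuffer code. */
final class InPlaceQuickSortSketch {

    /**
     * Sorts idx[from, to) in place so that the indices are ordered by the
     * records they point at. The records themselves are never moved.
     */
    static <T> void sort(int[] idx, int from, int to,
                         T[] records, Comparator<? super T> cmp) {
        while (to - from > 1) {
            int p = partition(idx, from, to, records, cmp);
            // Recurse into the smaller side and loop on the larger one,
            // which keeps the recursion depth at O(log n).
            if (p - from < to - (p + 1)) {
                sort(idx, from, p, records, cmp);
                from = p + 1;
            } else {
                sort(idx, p + 1, to, records, cmp);
                to = p;
            }
        }
    }

    /** Lomuto partition using the middle element as the pivot. */
    private static <T> int partition(int[] idx, int from, int to,
                                     T[] records, Comparator<? super T> cmp) {
        int mid = from + (to - from) / 2;
        swap(idx, mid, to - 1);              // park the pivot at the end
        T pivot = records[idx[to - 1]];
        int store = from;
        for (int i = from; i < to - 1; i++) {
            if (cmp.compare(records[idx[i]], pivot) < 0) {
                swap(idx, i, store++);
            }
        }
        swap(idx, store, to - 1);            // move the pivot to its final slot
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        String[] records = {"pear", "apple", "fig", "banana"};
        int[] idx = {0, 1, 2, 3};
        sort(idx, 0, idx.length, records, Comparator.<String>naturalOrder());
        System.out.println(Arrays.toString(idx)); // prints [1, 3, 2, 0]
    }
}
```

Unlike a merge sort, this is not stable and has a quadratic worst case (for
example, on many equal keys with this simple partition scheme), so whether it
is a net win depends on the data being sorted.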
> Speed up SequenceFile sort with memory reduction
> ------------------------------------------------
>
> Key: HADOOP-287
> URL: https://issues.apache.org/jira/browse/HADOOP-287
> Project: Hadoop Core
> Issue Type: Improvement
> Components: io
> Affects Versions: 0.17.0
> Reporter: Benjamin Reed
> Assignee: Doug Cutting
> Attachments: 287-0.patch, 287-1.patch, s.patch, zoom-sort.patch,
> zoom-sort.patch
>
>
> I replaced the merge sort with a quicksort, which yielded an approximately
> 30% improvement in sort time. It also reduced the memory required for
> sorting, because the sort is done in place.