[ 
http://issues.apache.org/jira/browse/HADOOP-287?page=comments#action_12423847 ] 
            
Benjamin Reed commented on HADOOP-287:
--------------------------------------

I have improved it a bit more. It is now guaranteed to use only O(log N) stack 
space, and I eked out a bit more performance. Unfortunately, the 30% 
improvement applies only to the in-memory sort. For your slow disks you need 
my other patch, which reduces the number of times the data hits the disk. 
Unfortunately, that patch no longer applies. I have a newer version that 
removes one more full disk hit, so that should work even better. I'll try to 
create a patch for it today. (Unrelated changes break my patches and take a 
long time to reconcile...)
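
For readers unfamiliar with the stack-space guarantee mentioned above, here is 
a minimal sketch of one common way to get it: recurse only into the smaller 
partition and loop on the larger one. This is an illustration, not the code in 
the attached patches. The class name BoundedStackQuickSort, the int[] input, 
and the middle-element pivot are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical illustration, not the code from the HADOOP-287 patches.
final class BoundedStackQuickSort {

    /** Sorts the array in place in ascending order. */
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sorts a[lo..hi] (inclusive). Only the smaller partition is handled
    // recursively; the larger one is handled by the enclosing loop, so each
    // recursive call covers less than half of its caller's range.
    private static void sort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p - lo < hi - p) {
                sort(a, lo, p - 1);   // left side is smaller: recurse
                lo = p + 1;           // continue with the right side
            } else {
                sort(a, p + 1, hi);   // right side is smaller: recurse
                hi = p - 1;           // continue with the left side
            }
        }
    }

    // Lomuto partition using the middle element as pivot (an assumption).
    // Returns the final index of the pivot.
    private static int partition(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        swap(a, mid, hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

Because every recursive call works on less than half of the range its caller 
was given, the recursion depth is bounded by about log2(N). The worst-case 
running time is still quadratic; only the stack usage is bounded.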

> Speed up SequenceFile sort with memory reduction
> ------------------------------------------------
>
>                 Key: HADOOP-287
>                 URL: http://issues.apache.org/jira/browse/HADOOP-287
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.3.2
>            Reporter: Benjamin Reed
>         Assigned To: Doug Cutting
>         Attachments: s.patch, zoom-sort.patch, zoom-sort.patch
>
>
> I replaced the merge sort with a quick sort and it yielded approx 30% 
> improvement in sort time. It also reduced the memory requirement for sorting 
> because the sort is done in place.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
