Gopal V created MAPREDUCE-4755:
----------------------------------
Summary: Rewrite MapOutputBuffer to use direct buffers & allow
parallel sort+collect
Key: MAPREDUCE-4755
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4755
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 3.0.0
Environment: Ubuntu 12.10 x86_64 (Bulldozer 8-core)
Reporter: Gopal V
Assignee: Gopal V
Attachments: 0001-first-cut-of-MMapOutputBuffer.patch
The MapOutputBuffer was written under a very severe constraint on the
amount of memory it can consume. As a result, the code has to page data in
and out (i.e., spill) as it passes through the map buffers.
With the advent of the java.nio package, there is a fast and portable mmap
alternative to managing your own buffers. The mapped memory lives outside
Java's GC-managed heap and yet provides reasonably fast access to all the data.
The suggestion is that mmap()-backed direct buffers, given enough address
space, can be faster when a spill is involved and simpler than the current
spill logic, because they use the buffer cache to deliver best-effort I/O.
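
To make the idea concrete, here is a minimal sketch of collecting records into a memory-mapped buffer with java.nio. It is not taken from the attached patch. The class name MappedCollectSketch, the helpers mapScratch and collect, and the length-prefixed record layout are all hypothetical and chosen only for illustration. It assumes a scratch file on local disk and a buffer size that fits in the available address space.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the MapOutputBuffer rewrite itself.
public class MappedCollectSketch {

    /** Maps a scratch file read-write; the mapped region lives outside the Java heap. */
    static MappedByteBuffer mapScratch(Path file, int capacity) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // A mapping stays valid after its channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
        }
    }

    /** Appends one record as [keyLen][key][valueLen][value] (assumed layout). */
    static void collect(MappedByteBuffer buf, String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        buf.putInt(k.length).put(k).putInt(v.length).put(v);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("map-output", ".buf");
        file.toFile().deleteOnExit();

        MappedByteBuffer buf = mapScratch(file, 1 << 20); // 1 MiB of mapped space
        collect(buf, "apple", "1");
        collect(buf, "banana", "2");

        // No explicit spill step: the OS writes dirty pages back on its own
        // schedule. force() only requests that the writes reach the device.
        buf.force();
        System.out.println("bytes collected: " + buf.position());
    }
}
```

In this sketch the collect path never decides when to write to disk; paging is left to the operating system, which is the simplification the proposal is after. Whether that is actually faster than the existing spill logic would have to be measured.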