[jira] [Resolved] (MAPREDUCE-4755) Rewrite MapOutputBuffer to use direct buffers & allow parallel sort+collect

Gopal V (JIRA) Mon, 06 Oct 2014 12:40:57 -0700

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Gopal V resolved MAPREDUCE-4755.
--------------------------------
    Resolution: Not a Problem

> Rewrite MapOutputBuffer to use direct buffers & allow parallel sort+collect
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4755
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4755
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>         Environment: Ubuntu 12.10 x86_64 (Bulldozer 8-core)
>            Reporter: Gopal V
>            Assignee: Gopal V
>              Labels: optimization, sort
>
> The MapOutputBuffer was written under a very severe constraint on the 
> amount of memory it can consume. As a result, the code has to page data in 
> and out (i.e., spill) as it passes through the map buffers.
> The java.nio package offers a fast and portable mmap alternative to 
> managing your own buffers. Mapped buffers live outside the Java GC heap yet 
> still provide reasonably fast access to all of the data.
> The suggestion is that mmap()'d direct buffers can be faster when a spill 
> is involved and, given enough address space, simpler than the current spill 
> logic, while relying on the OS buffer cache to deliver best-effort I/O.
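
To illustrate the java.nio mechanism the description above refers to, here is a
minimal sketch of collecting records into a memory-mapped buffer and reading
them back. It is not the MapOutputBuffer code or any proposed patch: the class
MappedCollectSketch, the method collectAndSum, and the fixed-width int records
are all hypothetical, chosen only to keep the example small.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not part of Hadoop.
public class MappedCollectSketch {

    // Writes each value into a memory-mapped region of the given file,
    // then reads the region back and returns the sum of the values.
    static long collectAndSum(Path file, int[] values) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            long size = (long) values.length * Integer.BYTES;
            // The mapping lives outside the Java heap; the OS page cache
            // decides when dirty pages are written to disk.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (int v : values) {
                buf.putInt(v); // "collect" step: append a record
            }
            buf.flip(); // switch from writing to reading
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.getInt();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("collect", ".buf");
        tmp.toFile().deleteOnExit();
        System.out.println(collectAndSum(tmp, new int[] {1, 2, 3}));
    }
}
```

There is no explicit spill path in the sketch: once the region is mapped,
paging data to and from disk is left to the operating system, which is the
simplification the description argues for.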



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
