jaehoon ko created MAPREDUCE-5947:
-------------------------------------

             Summary: Map phase merge can better utilize memory
                 Key: MAPREDUCE-5947
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5947
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: performance
    Affects Versions: 2.4.0
            Reporter: jaehoon ko
            Assignee: jaehoon ko


The map phase merge reads spills from disk and writes intermediate merge 
results back to disk, repeating this over several passes. I think memory could 
hold those intermediate results instead, thereby reducing disk I/O. Because 
kvbuffer is nullified right before the merge, at least io.sort.mb of heap 
should be available.
MAPREDUCE-4511 can be considered an effort to utilize memory better through 
read-ahead, but it leaves the number of disk I/Os unchanged.
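
To make the idea concrete, here is a rough toy model, not Hadoop code. It only 
counts how many MB of intermediate merge output would go to disk, with and 
without a memory budget. The function name, the "merge the smallest segments 
first" policy, and the rule that an intermediate result stays in memory only 
if it fits entirely are all assumptions made for illustration; factor plays 
the role of io.sort.factor and mem_budget_mb the heap freed by kvbuffer.

```python
import heapq


def intermediate_disk_writes(spill_sizes_mb, factor, mem_budget_mb=0):
    """Return MB of intermediate merge output written to disk (toy model).

    The final merge pass is not counted, since its output must reach disk
    in either case. All names and policies here are hypothetical.
    """
    # Min-heap of (size_mb, in_memory) so each pass merges the smallest segments.
    segments = [(size, False) for size in spill_sizes_mb]
    heapq.heapify(segments)
    free_mem = mem_budget_mb
    written = 0

    # Intermediate passes run until one final pass can merge what is left.
    while len(segments) > factor:
        batch = [heapq.heappop(segments) for _ in range(factor)]
        merged_size = sum(size for size, _ in batch)

        # Decide placement before releasing inputs (conservative assumption:
        # inputs and output coexist while the pass is running).
        keep_in_memory = merged_size <= free_mem
        if keep_in_memory:
            free_mem -= merged_size
        else:
            written += merged_size

        # In-memory inputs have been consumed, so their space is released.
        free_mem += sum(size for size, in_mem in batch if in_mem)
        heapq.heappush(segments, (merged_size, keep_in_memory))

    return written


spills = [10] * 12  # twelve 10 MB spills
print(intermediate_disk_writes(spills, factor=4))                     # 120
print(intermediate_disk_writes(spills, factor=4, mem_budget_mb=100))  # 40
```

In this model, a 100 MB budget absorbs two of the three intermediate passes. 
Real savings would depend on spill sizes, the merge factor, and how the actual 
merge chooses segments.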

Please give me your thoughts. I'd like to take up this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
