Ming Chen created MAPREDUCE-5605:
------------------------------------

             Summary: Memory-centric MapReduce aiming to solve the I/O bottleneck
                 Key: MAPREDUCE-5605
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
    Affects Versions: 1.0.1
         Environment: x86-64 Linux/Unix, JDK 7 preferred
            Reporter: Ming Chen
            Assignee: Ming Chen


Memory is a key resource for bridging the performance gap between CPUs and
I/O devices, so our idea is to make maximal use of memory to relieve the I/O
bottleneck. We developed a multi-threaded task execution engine that runs in
a single JVM on a node. Within the engine we implemented a memory scheduling
algorithm that provides global memory management. On top of it we built
further techniques, such as sequential disk access and multi-cache, and
addressed the problem of full garbage collection in the JVM. We ran extensive
experiments comparing our system, Mammoth, against the native Hadoop
platform. The results show that Mammoth reduces job execution time by more
than 40% in typical cases, without requiring any modification to Hadoop
programs. When a system is short of memory, Mammoth improves performance by
up to 4 times, as observed for I/O-intensive applications such as PageRank.
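
To make the idea of global memory management inside one JVM more concrete,
here is a minimal sketch. It is not taken from the Mammoth code: the class and
method names (GlobalMemoryPoolSketch, GlobalMemoryPool, runTasks) are
hypothetical, and a plain counting semaphore stands in for the memory
scheduling algorithm, whose details are not given here. It only shows task
threads in a single JVM drawing buffer space from one shared budget instead
of each task owning a private heap.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from Mammoth.
public class GlobalMemoryPoolSketch {

    /** One node-wide budget, counted in fixed-size blocks, shared by all task threads. */
    static final class GlobalMemoryPool {
        private final Semaphore blocks;
        private final int capacity;

        GlobalMemoryPool(int capacityBlocks) {
            this.capacity = capacityBlocks;
            this.blocks = new Semaphore(capacityBlocks, true); // fair: waiters served FIFO
        }

        /** Blocks the calling task thread until n blocks are free. */
        void acquire(int n) throws InterruptedException {
            if (n > capacity) {
                throw new IllegalArgumentException("request exceeds pool capacity");
            }
            blocks.acquire(n);
        }

        void release(int n) {
            blocks.release(n);
        }
    }

    /** Runs the given number of task threads in this JVM and returns the peak blocks in use. */
    static int runTasks(int capacityBlocks, int tasks, int blocksPerTask)
            throws InterruptedException {
        GlobalMemoryPool pool = new GlobalMemoryPool(capacityBlocks);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(tasks);
        for (int i = 0; i < tasks; i++) {
            executor.submit(() -> {
                try {
                    pool.acquire(blocksPerTask);
                    try {
                        int now = inUse.addAndGet(blocksPerTask);
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(10); // stand-in for map or reduce work on buffered data
                    } finally {
                        inUse.addAndGet(-blocksPerTask);
                        pool.release(blocksPerTask);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        executor.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int peak = runTasks(8, 6, 3);
        System.out.println("peak blocks in use: " + peak + " (budget 8)");
    }
}
```

In this sketch the total buffer space held by all tasks never exceeds the
budget, no matter how many task threads are started. A real scheduler would
also have to decide which task gets memory first and when to spill to disk,
which this example leaves out.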



--
This message was sent by Atlassian JIRA
(v6.1#6144)
