[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Chen updated MAPREDUCE-5605:
---------------------------------

    Attachment: TextOutputFormat.java
                TaskTrackerStatus.java
                TaskTrackerInstrumentation.java
                TaskTrackerAction.java
                TaskTracker.java
                TaskStatus.java
                TaskScheduler.java
                TaskRunner.java
                TaskReport.java
                TaskMemoryManagerThread.java
                TaskLogsTruncater.java
                TaskLogServlet.java
                TaskLogAppender.java
                TaskLog.java
                TaskInProgress.java
                Task.java
                SpillScheduler.java
                SequenceFileOutputFormat.java
                RunningJob.java
                RoundQueue.java
                ReinitTrackerAction.java
                ReduceTaskStatus.java
                ReduceTaskRunner.java
                ReduceTask.java
                ReduceRamManager.java
                RecordReader.java
                RawKeyValueIterator.java
                RawHistoryFileServlet.java
                RawBufferedOutputStream.java
                RamManager.java
                Partitioner.java
                OutputLogFilter.java
                OutputFormat.java
                OutputCommitter.java
                OutputCollector.java
                Operation.java
                MergeSorter.java
                Merger.java
                MemoryElement.java
                MapTaskStatus.java
                MapTaskRunner.java
                MapTaskCompletionEventsUpdate.java
                MapTask.java
                MapRunner.java
                MapRamManager.java
                MapOutputFile.java
                JvmTask.java
                JvmManager.java
                JVMId.java
                JobTaskRunner.java
                JobConf.java
                IFile.java
                DefaultJvmMemoryManager.java
                ChildRamManager.java
                Child.java
                CachePool.java
                CacheOutputStream.java
                CacheFile.java

> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5605
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 1.0.1
>         Environment: x86-64 Linux/Unix
> jdk7 preferred
>            Reporter: Ming Chen
>            Assignee: Ming Chen
>         Attachments: JobTaskRunner.java, JvmManager.java, JvmTask.java, 
> MapOutputFile.java, MapRamManager.java, MapRunner.java, MapTask.java, 
> MapTaskCompletionEventsUpdate.java, MapTaskRunner.java, MapTaskStatus.java, 
> MemoryElement.java, MergeSorter.java, Merger.java, Operation.java, 
> OutputCollector.java, OutputCommitter.java, OutputFormat.java, 
> OutputLogFilter.java, Partitioner.java, RamManager.java, 
> RawBufferedOutputStream.java, RawHistoryFileServlet.java, 
> RawKeyValueIterator.java, RecordReader.java, ReduceRamManager.java, 
> ReduceTask.java, ReduceTaskRunner.java, ReduceTaskStatus.java, 
> ReinitTrackerAction.java, RoundQueue.java, RunningJob.java, 
> SequenceFileOutputFormat.java, SpillScheduler.java, Task.java, 
> TaskInProgress.java, TaskLog.java, TaskLogAppender.java, TaskLogServlet.java, 
> TaskLogsTruncater.java, TaskMemoryManagerThread.java, TaskReport.java, 
> TaskRunner.java, TaskScheduler.java, TaskStatus.java, TaskTracker.java, 
> TaskTrackerAction.java, TaskTrackerInstrumentation.java, 
> TaskTrackerStatus.java, TextOutputFormat.java
>
>
> Memory is an important resource for bridging the gap between CPUs and I/O 
> devices, so the idea is to maximize memory usage to relieve the I/O 
> bottleneck. We developed a multi-threaded task execution engine that runs 
> in a single JVM on a node. In the execution engine, we implemented a 
> memory scheduling algorithm to realize global memory management. Building 
> on it, we developed techniques such as sequential disk access and 
> multi-cache, and solved the problem of full garbage collection in the JVM. 
> We conducted extensive experiments comparing against the native Hadoop 
> platform. The results show that our system, Mammoth, can reduce job 
> execution time by more than 40% in typical cases, without requiring any 
> modifications to Hadoop programs. When a system is short of memory, 
> Mammoth can improve performance by up to 4 times, as observed for 
> I/O-intensive applications such as PageRank.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
