[
https://issues.apache.org/jira/browse/MAPREDUCE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818459#comment-13818459
]
Ming Chen commented on MAPREDUCE-5605:
--------------------------------------
Hi Lijie,
Thanks for the questions. We did face these problems when designing the system.
Here are my opinions on them.
bq. Multi-threads vs. Multi-processes.
For any task (Map or Reduce) in hadoop-1.0.1, the memory it can use must be
carefully specified in the configuration files by the administrator. Each
task's memory quota is fixed at runtime and tasks cannot coordinate with each
other, so memory usage is inefficient and some of the I/O is unnecessary. For
example, early in a MapReduce job's lifetime most of the memory configured for
Reduce tasks is unused while the Map tasks still need to spill; once the Map
tasks are done, the Reduce tasks cannot use the memory configured for Map
tasks either. Another case is that the intermediate data is sometimes skewed,
for example with 'grep'. In our design, the dataflow of Map and Reduce tasks
remains unchanged; we just coordinate the memory usage between them and do our
best to maximize memory-usage efficiency.
These parameters are critical to the system's performance, but the intermediate
data size is hard to know up front. Rather than working very hard to predict
the intermediate data size and fixing the configuration in advance, it is much
better to collect runtime information and coordinate memory dynamically, as in
the sketch below.
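To illustrate the idea only (this is a minimal sketch, not code from the patch; the class name GlobalMemoryPool and its methods are hypothetical), all task threads inside the execution-engine JVM could share one memory pool and reserve buffer space at runtime instead of relying on fixed per-task quotas:
{code:java}
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical global memory pool shared by all map/reduce task threads
 * running inside one execution-engine JVM. Tasks request buffer space at
 * runtime instead of relying on fixed per-task quotas from the config files.
 */
public class GlobalMemoryPool {
    private final long capacityBytes;   // total memory budget of the engine
    private long usedBytes = 0;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();

    public GlobalMemoryPool(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Block until the requested amount can be granted, then reserve it. */
    public void acquire(long bytes) throws InterruptedException {
        lock.lock();
        try {
            while (usedBytes + bytes > capacityBytes) {
                released.await();       // wait for another task to give memory back
            }
            usedBytes += bytes;
        } finally {
            lock.unlock();
        }
    }

    /** Return memory to the pool, e.g. when a map task finishes spilling. */
    public void release(long bytes) {
        lock.lock();
        try {
            usedBytes = Math.max(0, usedBytes - bytes);
            released.signalAll();       // wake up task threads waiting for buffers
        } finally {
            lock.unlock();
        }
    }
}
{code}
With such a pool, the memory freed by finished Map tasks becomes immediately available to Reduce tasks in the same JVM, which is exactly the coordination a per-process, fixed-quota configuration cannot give you.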
bq. Fault-tolerance
Firstly, the intermediate data is separated per task in the system, so when you
want to kill a discarded task, you just need to kill the corresponding thread
and recycle the memory allocated to it (see the sketch below). Secondly, for
fault tolerance, we still spill each Map task's outputs to disk.
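A hedged sketch of that kill path, building on the hypothetical GlobalMemoryPool above (InJvmTaskAttempt is also a made-up name, not a class in the patch):
{code:java}
/**
 * Hypothetical wrapper around one in-JVM task attempt. Killing a discarded
 * (e.g. speculative) attempt only needs to interrupt its thread and return
 * its buffer to the shared pool; spill files already written to disk by
 * completed map tasks are kept for fault tolerance.
 */
public class InJvmTaskAttempt {
    private final Thread worker;
    private final GlobalMemoryPool pool;
    private final long reservedBytes;

    public InJvmTaskAttempt(Thread worker, GlobalMemoryPool pool, long reservedBytes) {
        this.worker = worker;
        this.pool = pool;
        this.reservedBytes = reservedBytes;
    }

    /** Discard this attempt without touching any other task's data. */
    public void kill() {
        worker.interrupt();          // stop the task thread
        pool.release(reservedBytes); // recycle its memory for other tasks
        // on-disk map outputs are left untouched
    }
}
{code}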
bq. Idempotent
Because each task's data is separated, the final result is easy to guarantee.
bq. Trade-off of memory usage and disk usage
This point is very important, and one of our main aims is to minimize disk
I/O. When disk I/O does happen, it is scheduled by the unified I/O scheduler,
which performs asynchronous, sequential I/O to minimize I/O wait; a sketch
follows.
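Again only an illustrative sketch of the asynchronous, sequential-write idea (UnifiedIoScheduler and SpillRequest are hypothetical names, not the classes in the patch): task threads enqueue spill requests and return immediately, while a single writer thread drains the queue so the disk sees one sequential write at a time.
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical unified I/O scheduler: task threads submit spill requests
 * asynchronously; a single writer thread drains the queue in FIFO order so
 * writes from many tasks are not interleaved on disk.
 */
public class UnifiedIoScheduler implements Runnable {
    /** One buffered spill request from a task. */
    public static final class SpillRequest {
        final Path target;
        final byte[] data;
        public SpillRequest(Path target, byte[] data) {
            this.target = target;
            this.data = data;
        }
    }

    private final BlockingQueue<SpillRequest> queue = new LinkedBlockingQueue<>();

    /** Called by task threads; does not block on disk. */
    public void submit(SpillRequest request) {
        queue.add(request);
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                SpillRequest req = queue.take();           // next request, FIFO
                try (OutputStream out = Files.newOutputStream(req.target)) {
                    out.write(req.data);                   // one sequential write
                } catch (IOException e) {
                    // a real engine would report this failure back to the task
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();            // shut down cleanly
        }
    }
}
{code}
Starting the scheduler is just "new Thread(new UnifiedIoScheduler()).start()"; the point of the design is that the writer thread, not the individual tasks, decides when and in what order spills hit the disk.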
> Memory-centric MapReduce aiming to solve the I/O bottleneck
> -----------------------------------------------------------
>
> Key: MAPREDUCE-5605
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5605
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 1.0.1
> Environment: x86-64 Linux/Unix
> 64-bit jdk7 preferred
> Reporter: Ming Chen
> Assignee: Ming Chen
> Fix For: 1.0.1
>
> Attachments: MAPREDUCE-5605-v1.patch,
> hadoop-core-1.0.1-mammoth-0.9.0.jar
>
>
> Memory is a very important resource to bridge the gap between CPUs and I/O
> devices. So the idea is to maximize the usage of memory to solve the problem
> of I/O bottleneck. We developed a multi-threaded task execution engine, which
> runs in a single JVM on a node. In the execution engine, we have implemented
> the algorithm of memory scheduling to realize global memory management, based
> on which we further developed techniques such as sequential disk access and
> multi-cache, and solved the problem of full garbage collection in the JVM.
> The benchmark results show that it can get impressive improvement in typical
> cases. When a system is relatively short of memory (e.g., HPC,
> small- and medium-size enterprises), the improvement will be even more
> impressive.
--
This message was sent by Atlassian JIRA
(v6.1#6144)