Improving Map -> Reduce performance and Task JVM reuse
------------------------------------------------------

         Key: HADOOP-249
         URL: http://issues.apache.org/jira/browse/HADOOP-249
     Project: Hadoop
        Type: Improvement

    Versions: 0.3    
    Reporter: Benjamin Reed


These patches are really just to make Hadoop start trotting. It is still at 
least an order of magnitude slower than it should be, but I think these patches 
are a good start.

I've created two patches for clarity. They are not independent, but could 
easily be made so.

The disk-zoom patch is a performance trifecta: less disk IO, less disk space, 
less CPU, and overall a tremendous improvement. The patch is based on the 
following observation: every piece of data from a map hits the disk once on the 
mapper, and 3 (+plus sorting) times on the reducer. Further, the entire input 
for the reduce step is sorted together maximizing the sort time. This patch 
causes:

1)  the mapper to sort the relatively small fragments at the mapper which 
causes two hits to the disk, but they are smaller files.
2) the reducer copies the map output and may merge (if more than 100 outputs 
are present) with a couple of other outputs at copy time. No sorting is done 
since the map outputs are sorted.
3) the reducer  will merge the map outputs on the fly in memory at reduce time.

I'm attaching the performance graph (with just the disk-zoom patch) to show the 
results. This benchmark uses a random input and null output to remove any DFS 
performance influences. The cluster of 49 machines I was running on had limited 
disk space, so I was only able to run to a certain size on unmodified Hadoop. 
With the patch we use 1/3 the amount of disk space.

The second patch allows the task tracker to reuse processes to avoid the 
over-head of starting the JVM. While JVM startup is relatively fast, restarting 
a Task causes disk IO and DFS operations that have a negative impact on the 
rest of the system. When a Task finishes, rather than exiting, it reads the 
next task to run from stdin. We still isolate the Task runtime from 
TaskTracker, but we only pay the startup penalty once.

This second patch also fixes two performance issues not related to JVM reuse. 
(The reuse just makes the problems glaring.) First, the JobTracker counts all 
jobs not just the running jobs to decide the load on a tracker. Second, the 
TaskTracker should really ask for a new Task as soon as one finishes rather 
than wait the 10 secs.

I've been benchmarking the code alot, but I don't have access to a really good 
cluster to try the code out on, so please treat it as experimental. I would 
love to feedback.

There is another obvious thing to change: ReduceTasks should start after the 
first batch of MapTasks complete, so that 1) they have something to do, and 2) 
they are running on the fastest machines.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Reply via email to