[ https://issues.apache.org/jira/browse/HADOOP-249 ]
Devaraj Das updated HADOOP-249:
-------------------------------

    Release Note:
Jobs can enable task JVMs to be reused via the job config mapred.job.reuse.jvm.num.tasks. If this is 1 (the default), JVMs are not reused (one task per JVM). If it is -1, there is no limit to the number of tasks (of the same job) a JVM can run. Values greater than 1 can also be specified. A corresponding JobConf API has also been added: setNumTasksToExecutePerJvm.

> Improving Map -> Reduce performance and Task JVM reuse
> ------------------------------------------------------
>
>                 Key: HADOOP-249
>                 URL: https://issues.apache.org/jira/browse/HADOOP-249
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.3.0
>            Reporter: Benjamin Reed
>            Assignee: Devaraj Das
>             Fix For: 0.19.0
>
>         Attachments: 249-3.patch, 249-after-review.patch, 249-final.patch, 249-with-jvmID.patch, 249.1.patch, 249.2.patch, disk_zoom.patch, image001.png, task_zoom.patch
>
>
> These patches are really just to make Hadoop start trotting. It is still at least an order of magnitude slower than it should be, but I think these patches are a good start.
> I've created two patches for clarity. They are not independent, but they could easily be made so.
> The disk-zoom patch is a performance trifecta: less disk IO, less disk space, less CPU, and overall a tremendous improvement. The patch is based on the following observation: every piece of data from a map hits the disk once on the mapper, and three times (plus sorting) on the reducer. Further, the entire input for the reduce step is sorted together, maximizing the sort time. This patch causes:
> 1) the mapper to sort the relatively small fragments locally, which still causes two hits to the disk, but on smaller files;
> 2) the reducer to copy the map output and possibly merge it (if more than 100 outputs are present) with a couple of other outputs at copy time. No sorting is done, since the map outputs are already sorted;
> 3) the reducer to merge the map outputs on the fly in memory at reduce time.
> I'm attaching the performance graph (with just the disk-zoom patch) to show the results. This benchmark uses a random input and a null output to remove any DFS performance influences. The cluster of 49 machines I was running on had limited disk space, so I was only able to run up to a certain size on unmodified Hadoop. With the patch we use 1/3 the disk space.
> The second patch allows the task tracker to reuse processes to avoid the overhead of starting the JVM. While JVM startup is relatively fast, restarting a Task causes disk IO and DFS operations that have a negative impact on the rest of the system. When a Task finishes, rather than exiting, it reads the next task to run from stdin. We still isolate the Task runtime from the TaskTracker, but we only pay the startup penalty once.
> This second patch also fixes two performance issues not related to JVM reuse. (The reuse just makes the problems glaring.) First, the JobTracker counts all jobs, not just the running jobs, to decide the load on a tracker. Second, the TaskTracker should really ask for a new Task as soon as one finishes rather than waiting the 10 seconds.
> I've been benchmarking the code a lot, but I don't have access to a really good cluster to try the code out on, so please treat it as experimental. I would love feedback.
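
A minimal sketch of the reuse loop described above: instead of exiting after one task, the child JVM keeps reading the next task to run from stdin until the TaskTracker closes the pipe. The class and helper names (TaskChildLoop, runTask) are hypothetical stand-ins, not the actual patch code.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class TaskChildLoop {
        public static void main(String[] args) throws IOException {
            // The TaskTracker launches this child once; instead of exiting
            // after one task, the child keeps reading task IDs from stdin.
            BufferedReader stdin =
                    new BufferedReader(new InputStreamReader(System.in));
            String taskId;
            // EOF on the pipe means the TaskTracker has no more tasks for us.
            while ((taskId = stdin.readLine()) != null) {
                runTask(taskId);
            }
        }

        // Hypothetical stand-in: the real patch would look up the task
        // definition, run the map or reduce code, and report status back.
        private static void runTask(String taskId) {
            System.err.println("running task " + taskId);
        }
    }

The point of the design is that the TaskTracker still gets process isolation from the task code, but pays the JVM startup cost only once per sequence of tasks.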
> There is another obvious thing to change: ReduceTasks should start after the first batch of MapTasks completes, so that 1) they have something to do, and 2) they are running on the fastest machines.
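
For reference, a minimal driver sketch of the release-note API above: setNumTasksToExecutePerJvm and mapred.job.reuse.jvm.num.tasks come from the release note, while the class name, job name, and the rest of the driver boilerplate are assumed for illustration.

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class JvmReuseDriver {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(JvmReuseDriver.class);
            conf.setJobName("jvm-reuse-example"); // assumed job name

            // Let each JVM run up to 10 tasks of this job; the two lines
            // below are equivalent ways of setting the same property.
            conf.setNumTasksToExecutePerJvm(10);
            // conf.setInt("mapred.job.reuse.jvm.num.tasks", 10);

            // Per the release note: -1 removes the limit entirely, and
            // 1 (the default) keeps the old one-task-per-JVM behavior.

            JobClient.runJob(conf); // input/output/mapper setup omitted
        }
    }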