[ https://issues.apache.org/jira/browse/MAPREDUCE-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004052#comment-13004052 ]

Greg Roelofs commented on MAPREDUCE-1220:
-----------------------------------------

Status: basically functional; I believe all otherwise-passing unit tests still 
pass. Unfortunately, because of the duration over which patches were committed 
(and intervening commits), there's no easy way (that I'm aware of) to merge 
everything back into one patch. I'm currently working on the "MR v2" version 
(see MAPREDUCE-279), which is much less hackish and shares very little with the 
version above. I'm not sure this version has a future, but the patches are here 
if anyone is interested.

Known bugs:

 - "Re-localization" is missing. Specifically, because all subtasks run in the 
same JVM, and Java doesn't have chdir(), there's no clean way to isolate them 
from each other. If any but the last sub-MapTask does something obnoxious 
(e.g., deletes a distcache symlink or creates a file that another subtask 
wants to create), things will break.  Obviously this is a problem for an 
optimization that's supposed to be (mostly) transparent to users.
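
Since Java has no chdir(), one possible mitigation (purely a sketch, not part of any patch here; all names are illustrative assumptions) would be to resolve each sub-MapTask's file accesses against a private, attempt-specific subdirectory rather than the shared JVM working directory:

```java
import java.io.File;

// Hypothetical helper: give each sub-MapTask its own scratch directory so
// file creations/deletions don't collide across subtasks sharing one JVM.
// This only helps code that routes all path lookups through resolve(); code
// that touches the shared cwd directly would still conflict.
public class SubtaskWorkDir {
    private final File root;

    public SubtaskWorkDir(File jobWorkDir, String attemptId) {
        // e.g. <jobWorkDir>/attempt_..._m_000003_0/
        this.root = new File(jobWorkDir, attemptId);
        this.root.mkdirs();
    }

    /** Resolve a task-relative path inside this subtask's private directory. */
    public File resolve(String relativePath) {
        return new File(root, relativePath);
    }
}
```

This dodges collisions on file creation but still wouldn't protect a shared distcache symlink that one subtask deletes out from under the others.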

 - Progress is still broken, apparently. Everything seemed to check out when I 
had gobs of debugging in there, but it doesn't make it to the UI (including the 
client) as frequently as it should. No clue what broke.

 - The max-input-size decision criterion (in JobInProgress) should check the 
default block size (if appropriate) for the actual input filesystem, not use a 
hardcoded HDFS config that's not necessarily available to tasktracker nodes 
anyway.
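
The shape of that fix, sketched in plain Java (illustrative only, not the patch's actual code; in real Hadoop the per-filesystem value would come from something like FileSystem.getDefaultBlockSize() for the job's input path, but here it's just a parameter):

```java
// Sketch of the uber-decision input-size check.  The threshold is the default
// block size reported by the job's *actual* input filesystem (HDFS, local,
// S3, ...), not a hardcoded dfs.block.size that may not even be configured on
// tasktracker nodes.  Names and the exact criterion are assumptions.
public class UberDecision {
    /**
     * @param totalInputBytes  combined size of all input splits
     * @param inputFsBlockSize default block size of the input filesystem
     * @param maxMaps          max number of maps allowed in an uber-task
     */
    public static boolean smallEnoughToUberize(long totalInputBytes,
                                               long inputFsBlockSize,
                                               int maxMaps) {
        // Small job: all input fits within maxMaps blocks of the input FS.
        return totalInputBytes <= inputFsBlockSize * (long) maxMaps;
    }
}
```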

 - The UI changes are incomplete, and some links 404 or error out in some 
cases. Fundamentally, masquerading an UberTask as a ReduceTask while still 
exposing it to the user in some places is awkward, and there are a _lot_ of 
JSP pages to handle.

There are also some cleanup items (test and potentially enable reduce-only 
case; fix memory criterion in uber-decision for map-only [and reduce-only] 
cases; clean up TaskStatus mess; instead of renaming file.out to map_#.out, 
always use attemptID.out; etc.).  However, those kind of pale in comparison to 
the overall intrusive grubbiness of the patch. :-/

> Implement an in-cluster LocalJobRunner
> --------------------------------------
>
>                 Key: MAPREDUCE-1220
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1220
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: client, jobtracker
>            Reporter: Arun C Murthy
>            Assignee: Greg Roelofs
>         Attachments: MAPREDUCE-1220_yhadoop20.patch, 
> MR-1220.v1.trunk-hadoop-common.Progress-dumper.patch.txt, 
> MR-1220.v10e-v11c-v12b.ytrunk-hadoop-mapreduce.delta.patch.txt, 
> MR-1220.v13.ytrunk-hadoop-mapreduce.delta.patch.txt, 
> MR-1220.v14b.ytrunk-hadoop-mapreduce.delta.patch.txt, 
> MR-1220.v15.ytrunk-hadoop-mapreduce.delta.patch.txt, 
> MR-1220.v2.trunk-hadoop-mapreduce.patch.txt, 
> MR-1220.v6.ytrunk-hadoop-mapreduce.patch.txt, 
> MR-1220.v7.ytrunk-hadoop-mapreduce.delta.patch.txt, 
> MR-1220.v8b.ytrunk-hadoop-mapreduce.delta.patch.txt, 
> MR-1220.v9c.ytrunk-hadoop-mapreduce.delta.patch.txt
>
>
> Currently, very small map-reduce jobs suffer from latency issues due to 
> overheads in Hadoop Map-Reduce such as scheduling, JVM startup, etc. We've 
> periodically tried to optimize all parts of the framework to achieve lower 
> latencies.
> I'd like to turn the problem around a little bit. I propose we allow very 
> small jobs to run as a single-task job with multiple maps and reduces, i.e., 
> similar to our current implementation of the LocalJobRunner. Thus, under 
> certain conditions (maybe user-set configuration, or if the input data is 
> small, i.e., less than a DFS blocksize) we could launch a special task which 
> would run all the maps serially, followed by the reduces. This would help 
> small jobs achieve significantly smaller latencies, thanks to less 
> scheduling overhead, no per-task JVM startup, no shuffle over the network, 
> etc. This would be a huge benefit, especially on large clusters, for small 
> Hive/Pig queries.
> Thoughts?
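
The execution order proposed above (all maps run serially, then the reduce, inside one task, with no network shuffle) can be sketched as a toy in-memory word count. This is purely illustrative, not the patch's actual code:

```java
import java.util.*;

// Toy sketch of LocalJobRunner-style "uber" execution: each input split is a
// sub-MapTask run one after another, and the reduce then consumes the
// in-memory intermediate data directly, so no shuffle crosses the network.
public class UberSketch {
    public static Map<String, Integer> run(List<String> inputSplits) {
        // "Map phase": run each split's map work serially.
        Map<String, List<Integer>> intermediate = new TreeMap<>();
        for (String split : inputSplits) {
            for (String word : split.split("\\s+")) {
                intermediate.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }
        // "Reduce phase": sum each word's counts straight from memory.
        Map<String, Integer> result = new TreeMap<>();
        intermediate.forEach((word, ones) ->
            result.put(word, ones.stream().mapToInt(Integer::intValue).sum()));
        return result;
    }
}
```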

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
