Chris,

I have filed MAPREDUCE-3376
<https://issues.apache.org/jira/browse/MAPREDUCE-3376> for this issue.  I have
no idea when or if I will get around to fixing it.  It looks like a fairly
simple change, perhaps even a one- or two-line change, but reproducing the issue
and testing that I actually fixed it is likely to take a fair amount of time.
So if you, Chris, or someone else wants to take a crack at it, feel free.
Chris, I don't know whether you are able to post the code that is causing this
issue, but if you can, that would probably allow me to get around to posting a
fix a lot faster.

Thanks,

Bobby Evans

On 11/7/11 10:53 AM, "Robert Evans" <ev...@yahoo-inc.com> wrote:

OK, I found the problem: line 1148 of Task.java, in the OldCombinerRunner
class.  If your combiner is part of the old mapred API, then the reporter is
always the NULL reporter, and there is nothing we can do about it without a
code update.  However, if you use the new mapreduce API (your combiner extends
org.apache.hadoop.mapreduce.Reducer), then it looks like it will do the right
thing, but I have not tested this.

It appears to be an issue even in the latest release of 0.20.205, and possibly 
Trunk.  I am not sure if the code is still used in Trunk or not, but I suspect 
that it is.

Please file a JIRA about the issue in the MAPREDUCE project, or if you want me
to, I can.  In the short term, I would suggest that you switch to using the new
API for your combiner if you can.

--Bobby Evans

On 11/4/11 6:13 PM, "Christopher Egner" <ceg...@apple.com> wrote:

I'm using CDH3u0 and streaming, so this is hadoop-0.20.2 at patch level 923.21 
(cf https://ccp.cloudera.com/display/DOC/Downloading+CDH+Releases).

I modified the streaming code to confirm that it calls progress when I ask it
to, and to see which Reporter class is actually being used.  It's the
Task.TaskReporter class for map and reduce but the Reporter.NULL class for 
combine (both map-side and reduce-side combines).  It appears to be the mapred 
layer (as opposed to streaming) that sets the reporter, so this should affect 
non-streaming jobs as well.

Chris

On Nov 4, 2011, at 9:11 AM, Robert Evans wrote:

A change went into 0.20.205
(https://issues.apache.org/jira/browse/MAPREDUCE-2187) so that progress is
reported automatically after a certain number of inputs to the combiner.  I
looked through the code
for 0.20.205 and from what I can see the CombineOutputCollector should be 
getting an instance of TaskReporter.  What version of Hadoop are you running?  
Are you using the old APIs in the mapred package or the newer APIs in the 
mapreduce java package?

--Bobby Evans

On 11/4/11 1:20 AM, "Christopher Egner" <ceg...@apple.com> wrote:

Hi all,

Let me preface this with my understanding of how tasks work.

If a task takes a long time (default 10min) and demonstrates no progress, the 
task tracker will decide the process is hung, kill it, and start a new attempt. 
 Normally, one uses a Reporter instance's progress method to provide progress 
updates and avoid this. For a streaming mapper, the Reporter class is 
org.apache.hadoop.mapred.Task$TaskReporter and this works well.  Streaming is 
even set up to take progress, status, and counter updates from stderr, which is 
really cool.
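
As a rough sketch of the stderr protocol mentioned above: a streaming script
can write lines of the form reporter:status:<message> and
reporter:counter:<group>,<counter>,<amount> to stderr.  The helper names
(report_status, report_counter, run), the counter group "MyJob", and the
every-1000-records interval are made up for illustration and are not part of
Hadoop.

```python
import sys


def report_status(message, stream=None):
    """Write a status update in the form Hadoop streaming reads from stderr."""
    stream = stream if stream is not None else sys.stderr
    stream.write("reporter:status:%s\n" % message)
    stream.flush()


def report_counter(group, counter, amount, stream=None):
    """Write a counter increment in the streaming stderr format."""
    stream = stream if stream is not None else sys.stderr
    stream.write("reporter:counter:%s,%s,%d\n" % (group, counter, amount))
    stream.flush()


def run(lines, out=None, err=None, every=1000):
    """Identity mapper that reports every `every` records (arbitrary choice)."""
    out = out if out is not None else sys.stdout
    count = 0
    for line in lines:
        out.write(line)
        count += 1
        if count % every == 0:
            report_status("processed %d records" % count, err)
            # "MyJob" and "records" are hypothetical counter names.
            report_counter("MyJob", "records", every, err)
    return count
```

In an actual streaming script one would call run(sys.stdin) at the bottom of
the file and pass the script as the -mapper.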

However, for combiner tasks, the class is org.apache.hadoop.mapred.Reporter$1,
the first anonymous class in Reporter.java.  That is the Reporter.NULL
instance, which ignores all updates.  So even if a combiner task is updating its reporter
in accordance with docs (see postscript), its updates are ignored and it dies 
at 10 minutes.  Or one sets mapred.task.timeout very high, allowing truly hung 
tasks to go unrecognised for much longer.
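
To make the failure mode concrete, here is a toy model in Python.  It is not
Hadoop code: NullReporter, TrackingReporter, and simulate are hypothetical
names, and the watchdog is simplified to a per-second check against the
10-minute (600 s) default described above.

```python
class NullReporter:
    """Toy stand-in for a reporter that silently drops every update."""

    def __init__(self):
        self.last_progress = 0  # never changes

    def progress(self, now):
        pass  # update is ignored


class TrackingReporter:
    """Toy stand-in for a reporter that records the latest progress time."""

    def __init__(self):
        self.last_progress = 0

    def progress(self, now):
        self.last_progress = now


def simulate(reporter, duration=900, interval=60, timeout=600):
    """Run a fake task for `duration` seconds that calls progress() every
    `interval` seconds.  Return the second at which a watchdog with the
    given `timeout` would declare it hung, or None if it survives."""
    for now in range(1, duration + 1):
        if now % interval == 0:
            reporter.progress(now)
        if now - reporter.last_progress >= timeout:
            return now
    return None
```

With these defaults, simulate(NullReporter()) returns 600 even though the task
calls progress() every minute, while simulate(TrackingReporter()) returns None.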

At least this is what I've been able to put together from reading code and 
searching the web for docs (except hadoop jira which has been down for a while 
- my bad luck).

So am I understanding this correctly?  Are there plans to change this?  Or are
there reasons that combiners can't have normal reporters associated with them?

Thanks for any help,
Chris

http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Reporter
http://www.cloudera.com/blog/2009/05/10-mapreduce-tips/ (cf tip 7)
http://hadoop.apache.org/common/docs/r0.18.3/streaming.html#How+do+I+update+counters+in+streaming+applications%3F
http://hadoop.apache.org/common/docs/r0.20.0/mapred-default.html (cf mapred.task.timeout)



