[ 
https://issues.apache.org/jira/browse/HADOOP-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588113#action_12588113
 ] 

Chris Douglas commented on HADOOP-3229:
---------------------------------------

bq. If a slow-running mapper or reducer only outputs one tiny record every 
minute or so...

Shouldn't the progress update from TrackedRecordReader take care of that? I 
suppose there could be a map that reads a single record and emits several 
small records over several minutes... OK, sold. For most cases, that sounds 
like a dead/dying node that probably _should_ get killed, but even if that were 
true, this would be the wrong way to address it.
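
To make the gap concrete, here is a rough sketch, not the actual mapred code. 
TrackingReader is a hypothetical stand-in for a reader that flags progress on 
each record read, and the getAndSet(false) call simulates a heartbeat that 
polls and clears the flag after each emit. None of these names come from the 
Hadoop source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class ReaderProgressGap {

    /** Hypothetical reader wrapper: flags progress on every record read. */
    public static class TrackingReader implements Iterator<String> {
        private final Iterator<String> in;
        private final AtomicBoolean progressFlag;

        public TrackingReader(Iterator<String> in, AtomicBoolean progressFlag) {
            this.in = in;
            this.progressFlag = progressFlag;
        }

        @Override
        public boolean hasNext() {
            return in.hasNext();
        }

        @Override
        public String next() {
            progressFlag.set(true); // consuming an entry counts as progress
            return in.next();
        }
    }

    /**
     * Reads every input record and "emits" emitsPerRecord outputs for each.
     * Returns what a simulated heartbeat saw after each emit.
     */
    public static List<Boolean> run(List<String> input, int emitsPerRecord) {
        AtomicBoolean flag = new AtomicBoolean(false);
        TrackingReader reader = new TrackingReader(input.iterator(), flag);
        List<Boolean> heartbeats = new ArrayList<>();
        while (reader.hasNext()) {
            reader.next(); // sets the flag once per input record
            for (int i = 0; i < emitsPerRecord; i++) {
                // A small record would be emitted here; in this sketch,
                // emission does not touch the flag.
                heartbeats.add(flag.getAndSet(false)); // poll and clear
            }
        }
        return heartbeats;
    }

    public static void main(String[] args) {
        // One input record, three emits: prints [true, false, false]
        System.out.println(run(Arrays.asList("only-record"), 3));
    }
}
```

Only the first heartbeat sees progress; the later emits are invisible unless 
the collector also sets the flag.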

bq. that's the contract we've advertised in the past, that either consuming or 
emitting an entry counted as task progress

Setting the flag on collect from the map, as above, I agree. We're missing a 
case by only following collect(), though: should we signal progress while we're 
spilling to disk, without a combiner? Passing the Reporter to fs::create 
would address tasks both with and without a combiner, right?
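
Roughly what I have in mind, as a self-contained sketch rather than a patch. 
ProgressSignal, ProgressFlag and BufferingCollector are made-up names standing 
in for the Reporter and the map-side collector, and an in-memory list stands in 
for the spill file. The point is only that progress is signalled both on 
collect() and on each write during the spill.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class CollectorProgressSketch {

    /** Hypothetical stand-in for the progress() half of a Reporter. */
    public interface ProgressSignal {
        void progress();
    }

    /** Flag that a heartbeat thread could poll and clear. */
    public static class ProgressFlag implements ProgressSignal {
        private final AtomicBoolean flag = new AtomicBoolean(false);

        @Override
        public void progress() {
            flag.set(true);
        }

        /** Returns true if progress was signalled since the last poll. */
        public boolean getAndClear() {
            return flag.getAndSet(false);
        }
    }

    /** Hypothetical collector: buffers records and spills them in bulk. */
    public static class BufferingCollector {
        private final List<String> buffer = new ArrayList<>();
        private final List<String> spillFile = new ArrayList<>(); // stand-in for disk
        private final int spillThreshold;
        private final ProgressSignal reporter;

        public BufferingCollector(int spillThreshold, ProgressSignal reporter) {
            this.spillThreshold = spillThreshold;
            this.reporter = reporter;
        }

        /** Emitting a record counts as progress. */
        public void collect(String key, String value) {
            reporter.progress();
            buffer.add(key + "\t" + value);
            if (buffer.size() >= spillThreshold) {
                spill();
            }
        }

        /** Spills whatever is still buffered, e.g. at the end of the map. */
        public void flush() {
            spill();
        }

        private void spill() {
            for (String rec : buffer) {
                spillFile.add(rec);  // stand-in for a write to the spill file
                reporter.progress(); // signal on every write, combiner or not
            }
            buffer.clear();
        }

        public int spilledCount() {
            return spillFile.size();
        }
    }

    public static void main(String[] args) {
        ProgressFlag flag = new ProgressFlag();
        BufferingCollector out = new BufferingCollector(10, flag);
        out.collect("k", "v");
        System.out.println("after collect: " + flag.getAndClear());
        out.flush();
        System.out.println("after spill:   " + flag.getAndClear());
    }
}
```

Because the signal sits on the write path itself, it fires whether or not a 
combiner runs before the records reach the file.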

> Map OutputCollector does not report progress on writes
> ------------------------------------------------------
>
>                 Key: HADOOP-3229
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3229
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>             Fix For: 0.17.0
>
>         Attachments: 3229-0.patch, HADOOP-3229.patch
>
>
> It seems that the collector implementation used during the map phase does not 
> report progress on writing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
