[
https://issues.apache.org/jira/browse/HADOOP-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588113#action_12588113
]
Chris Douglas commented on HADOOP-3229:
---------------------------------------
bq. If a slow-running mapper or reducer only outputs one tiny record every
minute or so...
Shouldn't the progress update from TrackedRecordReader take care of that? I
suppose there could be a map that reads a single record and emits several
small records over several minutes... OK, sold. In most cases, that sounds
like a dead/dying node that probably _should_ get killed, but even if that
were true, this would be the wrong way to address it.
bq. that's the contract we've advertised in the past, that either consuming or
emitting an entry counted as task progress
Setting the flag on collect from the map, as above, I agree. We're missing a
case by only following collect(), though: should we signal progress while we're
spilling to disk, without a combiner? Passing the Reporter to fs::create
would address tasks both with and without a combiner, right?
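To make the two cases concrete, here is a minimal, self-contained Java sketch.
It is not the patch and does not use the real Hadoop classes: Progress,
Collector, ProgressFlag, ReportingCollector and spill() are all hypothetical
stand-ins, with Progress playing the role of the Reporter. It only illustrates
the idea of setting a progress flag on every collect() and on every record
written during a spill.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class ProgressOnCollectSketch {

    /** Hypothetical stand-in for the Reporter: one "I am alive" callback. */
    interface Progress {
        void progress();
    }

    /** Hypothetical stand-in for the map-side output collector. */
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    /** Progress flag that a heartbeat thread could poll and clear. */
    static final class ProgressFlag implements Progress {
        private final AtomicBoolean dirty = new AtomicBoolean(false);

        @Override
        public void progress() {
            dirty.set(true);
        }

        /** Returns whether progress was signalled since the last call. */
        boolean getAndReset() {
            return dirty.getAndSet(false);
        }
    }

    /** Case 1: every collect() counts as task progress. */
    static final class ReportingCollector<K, V> implements Collector<K, V> {
        private final Collector<K, V> delegate;
        private final Progress reporter;

        ReportingCollector(Collector<K, V> delegate, Progress reporter) {
            this.delegate = delegate;
            this.reporter = reporter;
        }

        @Override
        public void collect(K key, V value) {
            delegate.collect(key, value);
            reporter.progress(); // emitting a record sets the flag
        }
    }

    /**
     * Case 2: a spill with no combiner never calls collect(), so the
     * write path itself signals progress. Returns the characters "written".
     */
    static int spill(List<String> records, Progress reporter) {
        int written = 0;
        for (String record : records) {
            written += record.length(); // placeholder for the real disk write
            reporter.progress();
        }
        return written;
    }

    public static void main(String[] args) {
        ProgressFlag flag = new ProgressFlag();
        List<String> out = new ArrayList<>();
        Collector<String, String> collector =
            new ReportingCollector<>((k, v) -> out.add(k + "=" + v), flag);

        collector.collect("key", "value");
        System.out.println("progress after collect: " + flag.getAndReset());

        spill(out, flag);
        System.out.println("progress after spill: " + flag.getAndReset());
    }
}
```

Handing the same Progress object to the write path is what lets one mechanism
cover tasks with and without a combiner in this sketch; whether fs::create is
the right place for that in the real code is the open question above.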
> Map OutputCollector does not report progress on writes
> ------------------------------------------------------
>
> Key: HADOOP-3229
> URL: https://issues.apache.org/jira/browse/HADOOP-3229
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Environment: all
> Reporter: Alejandro Abdelnur
> Fix For: 0.17.0
>
> Attachments: 3229-0.patch, HADOOP-3229.patch
>
>
> It seems that the collector implementation used during the map phase does not
> report progress on writing.