[ http://issues.apache.org/jira/browse/HADOOP-318?page=comments#action_12418086 ]
Doug Cutting commented on HADOOP-318:
-------------------------------------

Sigh, I also don't see a compatible way to make this change. So we'll have to upgrade some Nutch InputFormat implementations to define the new method. Could you please construct a patch for Nutch too? That would make my life easier. Thanks.

> Progress in writing a DFS file does not count towards Job progress and can make the task timeout
> ------------------------------------------------------------------------------------------------
>
>          Key: HADOOP-318
>          URL: http://issues.apache.org/jira/browse/HADOOP-318
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Versions: 0.3.2
>  Environment: all, but especially on big busy clusters
>     Reporter: Milind Bhandarkar
>     Assignee: Milind Bhandarkar
>      Fix For: 0.4.0
>  Attachments: hadoop-datanode-allocation.patch, hadoop-latency-new.patch, hadoop-latency.patch
>
> When a task writes to a DFS file, depending on how busy the cluster is, it can
> time out after 10 minutes by default, because progress in writing the DFS file
> does not count as progress of the task. The solution (patch is forthcoming) is
> to provide a way for DFSOutputStream to call back to the reporter to report
> task progress.
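
For readers who have not seen the patch, the callback idea in the description could look roughly like the sketch below. It is an illustration only, not the attached patch: `ProgressCallback`, `ProgressReportingOutputStream`, `CountingCallback` and the `chunkSize` parameter are hypothetical names, and a plain `ByteArrayOutputStream` stands in for the DFS stream. The assumption is that the stream signals the task's reporter each time a chunk of bytes has been written, so that a slow write still counts as task progress.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ProgressSketch {

    /** Hypothetical callback; the task's reporter would sit behind this. */
    interface ProgressCallback {
        void progress();
    }

    /**
     * Hypothetical stand-in for DFSOutputStream: forwards bytes to the
     * wrapped stream and signals progress after every chunkSize bytes.
     */
    static class ProgressReportingOutputStream extends OutputStream {
        private final OutputStream out;
        private final ProgressCallback callback;
        private final int chunkSize;
        private int bytesSinceReport = 0;

        ProgressReportingOutputStream(OutputStream out, ProgressCallback callback, int chunkSize) {
            this.out = out;
            this.callback = callback;
            this.chunkSize = chunkSize;
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
            bytesSinceReport++;
            if (bytesSinceReport >= chunkSize) {
                // A full chunk went out: tell the task it is still alive.
                bytesSinceReport = 0;
                callback.progress();
            }
        }

        @Override
        public void flush() throws IOException {
            out.flush();
        }

        @Override
        public void close() throws IOException {
            out.close();
        }
    }

    /** Test double that only counts how often progress was signalled. */
    static class CountingCallback implements ProgressCallback {
        int calls = 0;

        @Override
        public void progress() {
            calls++;
        }
    }

    public static void main(String[] args) throws IOException {
        CountingCallback reporter = new CountingCallback();
        try (OutputStream out = new ProgressReportingOutputStream(
                new ByteArrayOutputStream(), reporter, 4)) {
            out.write(new byte[10]); // 10 bytes with a 4-byte chunk size
        }
        System.out.println("progress calls: " + reporter.calls);
    }
}
```

In a real task the callback would reset the task's timeout clock rather than increment a counter. Passing it in through the constructor keeps the stream independent of any particular reporter class.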
