[jira] Updated: (HADOOP-318) Progress in writing a DFS file does not count towards Job progress and can make the task timeout

Milind Bhandarkar (JIRA) Tue, 27 Jun 2006 10:40:05 -0700

     [ http://issues.apache.org/jira/browse/HADOOP-318?page=all ]


Milind Bhandarkar updated HADOOP-318:
-------------------------------------

    Attachment: hadoop-datanode-allocation.patch

This is an updated patch for this issue. With it, tasks no longer fail with 
"task reported no progress for 600 seconds" while they are in fact making 
progress. It is actually a datanode allocation patch. Each datanode sends 
additional load data to the namenode, indicating how many blocks it is 
currently writing or reading. When choosing datanodes for a new block, the 
namenode takes this load into consideration and discards datanodes whose load 
is more than twice the average.

This is in addition to the requirement that the datanode has enough space to 
store min_num_blocks.
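
The selection rule described above can be sketched roughly as follows. This is 
only an illustration, not code from the attached patch: the class 
LoadAwareChooser, the NodeInfo type, its fields, and the eligible() method are 
hypothetical names, and the average is assumed to be the plain mean of the 
reported loads over all candidate datanodes.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadAwareChooser {

    /** Hypothetical per-datanode report; not the actual Hadoop class. */
    public static final class NodeInfo {
        public final String name;
        public final long remainingBytes;   // free space reported by the datanode
        public final int activeTransfers;   // blocks currently being written or read

        public NodeInfo(String name, long remainingBytes, int activeTransfers) {
            this.name = name;
            this.remainingBytes = remainingBytes;
            this.activeTransfers = activeTransfers;
        }
    }

    /**
     * Returns the datanodes that have room for minNumBlocks blocks and whose
     * load is at most twice the average load of all candidates.
     */
    public static List<NodeInfo> eligible(List<NodeInfo> nodes,
                                          long blockSize,
                                          int minNumBlocks) {
        List<NodeInfo> result = new ArrayList<>();
        if (nodes.isEmpty()) {
            return result;
        }

        // Assumption: "average" is the mean load over every candidate node.
        long totalLoad = 0;
        for (NodeInfo n : nodes) {
            totalLoad += n.activeTransfers;
        }
        double averageLoad = (double) totalLoad / nodes.size();
        long requiredBytes = blockSize * minNumBlocks;

        for (NodeInfo n : nodes) {
            boolean hasSpace = n.remainingBytes >= requiredBytes;
            boolean notOverloaded = n.activeTransfers <= 2.0 * averageLoad;
            if (hasSpace && notOverloaded) {
                result.add(n);
            }
        }
        return result;
    }
}
```

In this sketch the space check and the load check are independent filters, so 
a lightly loaded datanode is still skipped if it cannot hold min_num_blocks.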

With this patch, I never see the "no progress for 600 seconds, killing task" 
error. As a result, on my 240-node cluster, the randomwriter time went down 
from 3997 seconds to 2404 seconds.

This patch includes the file-writing progress patch as well, so please discard 
the two patches I submitted earlier.

> Progress in writing a DFS file does not count towards Job progress and can 
> make the task timeout
> ------------------------------------------------------------------------------------------------
>
>          Key: HADOOP-318
>          URL: http://issues.apache.org/jira/browse/HADOOP-318
>      Project: Hadoop
>         Type: Bug
>
>   Components: mapred
>     Versions: 0.3.2
>  Environment: all, but especially on big busy clusters
>     Reporter: Milind Bhandarkar
>     Assignee: Milind Bhandarkar
>      Fix For: 0.4.0
>  Attachments: hadoop-datanode-allocation.patch, hadoop-latency-new.patch, 
> hadoop-latency.patch
>
> When a task writes to a DFS file, depending on how busy the cluster is, it 
> can time out after 10 minutes by default, because progress made in writing 
> the DFS file does not count as progress of the task. The solution (patch is 
> forthcoming) is to provide a way for DFSOutputStream to call back to the 
> reporter to report task progress.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira