    [ http://issues.apache.org/jira/browse/HADOOP-370?page=comments#action_12422378 ]

Doug Cutting commented on HADOOP-370:
-------------------------------------
Yes, let's cache the "good dirs". If a drive goes offline or becomes unwritable while a node is running, then we should start emitting warnings, but we should not warn more than once for drives that are offline or unwritable at startup.

> TaskTracker startup fails if any mapred.local.dir entries don't exist
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-370
>                 URL: http://issues.apache.org/jira/browse/HADOOP-370
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>         Environment: ~30 node cluster, various size/number of disks, CPUs, memory
>            Reporter: Bryan Pendleton
>         Attachments: fix-freespace-tasktracker-failure.txt
>
> This appears to have been introduced with the "check for enough free space" before startup.
> It's debatable how best to fix this bug. I will submit a patch that ignores directories for which the DF utility fails. This lets me keep my cluster running (the number of drives varies across nodes, so mapred.local.dir contains entries for drives that aren't present on every cluster node), but a cleaner solution is probably better. I'd lean towards checking for existence and ignoring a dir that doesn't exist, rather than depending on DF to fail, since DF can fail for other reasons without meaning you're out of disk space. I argue that a TaskTracker should start up if *all* directories in the list that *can be written to* have enough space. Otherwise, a single failed drive per cluster machine means no work ever gets done.
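The behavior described above (filter mapred.local.dir down to the entries that exist and are writable, cache those "good dirs", warn once about dirs that are bad at startup, warn on each recheck about dirs that go bad while running, and apply the free-space check only to the usable set) could look roughly like the sketch below. This is a minimal illustration under stated assumptions, not the attached patch: the LocalDirChecker class name and its methods are invented for this example, and java.io.File.getUsableSpace() (Java 6+) stands in for Hadoop's DF utility so the example is self-contained.

import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: cache the "good" mapred.local.dir entries at
// startup and apply the free-space check only to those.
public class LocalDirChecker {

  private final List<String> goodDirs = new ArrayList<String>();
  private final Set<String> warnedAtStartup = new HashSet<String>();

  public LocalDirChecker(String[] configuredDirs) {
    for (String dirName : configuredDirs) {
      File dir = new File(dirName);
      if (dir.isDirectory() && dir.canWrite()) {
        goodDirs.add(dirName);   // cache the good dirs
      } else {
        warnOnce(dirName);       // bad at startup: warn only once
      }
    }
  }

  // Startup succeeds if every *usable* dir has enough space;
  // requiring all *configured* dirs would mean one failed drive per
  // machine stops all work.
  public boolean hasEnoughSpace(long minSpace) {
    for (String dirName : goodDirs) {
      // getUsableSpace() stands in for the DF utility here; we no
      // longer depend on DF failing to detect a missing drive.
      if (new File(dirName).getUsableSpace() < minSpace) {
        return false;
      }
    }
    return !goodDirs.isEmpty();  // at least one usable dir required
  }

  // Called periodically while the node is running: a previously good
  // dir that goes offline or unwritable starts emitting warnings.
  public void recheck() {
    for (String dirName : goodDirs) {
      File dir = new File(dirName);
      if (!dir.isDirectory() || !dir.canWrite()) {
        System.err.println("WARN: local dir " + dirName
            + " is no longer usable");
      }
    }
  }

  private void warnOnce(String dirName) {
    if (warnedAtStartup.add(dirName)) {
      System.err.println("WARN: ignoring unusable local dir " + dirName);
    }
  }
}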
