[ http://issues.apache.org/jira/browse/HADOOP-370?page=comments#action_12422206 ]

Bryan Pendleton commented on HADOOP-370:
----------------------------------------

Yeah, that sounds like a better approach. I'd be happy to implement that in the patch instead, modulo a dangling issue:

Should "good dirs" (ie, the new return value for checkLocalDirs) be cached? Implication: after initialization, no further checking for writability of a directory, and the directory list can only get smaller during an instance of a daemon. The alternative is, as I'm seeing with my current patch, a lot of extraneous log output that isn't really valuable.

> TaskTracker startup fails if any mapred.local.dir entries don't exist
> ---------------------------------------------------------------------
>
>          Key: HADOOP-370
>          URL: http://issues.apache.org/jira/browse/HADOOP-370
>      Project: Hadoop
>   Issue Type: Bug
>   Components: mapred
>  Environment: ~30 node cluster, various size/number of disks, CPUs, memory
>     Reporter: Bryan Pendleton
>  Attachments: fix-freespace-tasktracker-failure.txt
>
>
> This appears to have been introduced with the "check for enough free space"
> before startup.
> It's debatable how best to fix this bug. I will submit a patch which ignores
> directories for which the DF utility fails. This is letting me continue
> operation on my cluster (where the number of drives varies, so there are
> entries in mapred.local.dir for drives that aren't on all cluster nodes), but
> a cleaner solution is probably better. I'd lean towards "check for
> existence", and ignore the dir if it doesn't - but don't depend on DF to
> fail, since DF could fail for other reasons without meaning you're out of
> disk space. I argue that a TaskTracker should start up if *all* directories
> that *can be written to* in the list have enough space. Otherwise, a failed
> drive per cluster machine means no work ever gets done.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
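
For reference, here is a rough sketch of the "cached good dirs" idea from the comment above. It is not the attached patch and not Hadoop's actual checkLocalDirs; the class LocalDirFilter and its methods are hypothetical names used only for illustration. It assumes "good" means "exists, is a directory, and is writable", checked with plain java.io.File rather than by relying on DF failing.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: filters the configured local dirs
// once and caches the result for the lifetime of the daemon instance.
final class LocalDirFilter {
    private final List<String> goodDirs;

    // The check runs only here, at initialization, so unusable directories
    // are logged once instead of on every later lookup.
    LocalDirFilter(List<String> configuredDirs) {
        this.goodDirs = Collections.unmodifiableList(filterUsable(configuredDirs));
    }

    // Keeps only entries that exist, are directories, and are writable.
    static List<String> filterUsable(List<String> configuredDirs) {
        List<String> good = new ArrayList<>();
        for (String path : configuredDirs) {
            File dir = new File(path);
            if (dir.isDirectory() && dir.canWrite()) {
                good.add(path);
            } else {
                System.err.println("Ignoring unusable local dir: " + path);
            }
        }
        return good;
    }

    // Returns the cached list; directories are not re-checked after startup.
    List<String> getGoodDirs() {
        return goodDirs;
    }

    // The startup rule argued for in the issue, under this sketch's
    // assumptions: proceed if at least one usable directory remains.
    boolean hasUsableDir() {
        return !goodDirs.isEmpty();
    }
}
```

The trade-off is the one raised in the comment: with the cached list, a directory that becomes unwritable later is not noticed until the daemon restarts, while the uncached alternative re-checks (and re-logs) on every call. A free-space check would still have to run separately over the surviving directories.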
