[
https://issues.apache.org/jira/browse/MAPREDUCE-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125749#comment-13125749
]
Ravi Gummadi commented on MAPREDUCE-2850:
-----------------------------------------
>> In prepareDirToFail it says the file is set to perms "000" but
>> File#createNewFile uses the default perms (eg 644 with umask 022), so it
>> should still be accessible right?
No, the comment was wrong. I replace the directory with a regular file of the
same name, so DiskChecker.checkDirs() fails when it tries to mkdir that name,
and this is reported as a disk failure. Updating the comment accordingly.
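A rough, standalone illustration of that approach (not the code in the patch)
is sketched below. The class name DirToFailSketch and the helper isUsableDir
are hypothetical. isUsableDir is only a simplified stand-in for the kind of
check a disk checker performs, and it is not Hadoop's DiskChecker
implementation.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class DirToFailSketch {

    // Hypothetical helper: replace the directory with a regular file of the
    // same name, so that a later mkdir on that path cannot succeed.
    static void prepareDirToFail(File dir) throws IOException {
        deleteRecursively(dir);
        if (!dir.createNewFile()) {
            throw new IOException("Could not create file " + dir);
        }
    }

    // Delete a file or a directory tree.
    static void deleteRecursively(File f) throws IOException {
        File[] children = f.listFiles(); // null when f is not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        if (f.exists() && !f.delete()) {
            throw new IOException("Could not delete " + f);
        }
    }

    // Assumption: a simplified stand-in for a directory health check.
    // The path must be (or become, via mkdirs) a readable, writable directory.
    static boolean isUsableDir(File dir) {
        return (dir.isDirectory() || dir.mkdirs())
                && dir.canRead()
                && dir.canWrite();
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("local-dir").toFile();
        System.out.println("before: usable = " + isUsableDir(dir));
        prepareDirToFail(dir);
        System.out.println("after:  usable = " + isUsableDir(dir));
        dir.delete(); // clean up the placeholder file
    }
}
```

Because the path is now occupied by a regular file, permissions do not matter:
mkdirs() cannot create a directory there, so the check fails even when the
file itself has default (e.g. 644) permissions.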
>> If you don't want waitForDiskHealthCheck to always wait 10s at a time, it
>> seems you can lower DISK_HEALTH_CHECK_INTERVAL to e.g. 1s.
OK. Changing to 1 sec.
>> Would also be good to test startup with a failed directory. Feel free to
>> punt this to MAPREDUCE-2921.
That would need a handle into MiniMRCluster.TaskTrackerRunner(), which means
some refactoring and exposing a new API in MiniMRCluster. So I am not doing it
as part of the current JIRA.
> TaskTracker disk failure handling (MR-2413) has no test coverage
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-2850
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2850
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
> Assignee: Ravi Gummadi
> Attachments: MR2850.v0.patch
>
>
> MR-2413 doesn't have any test coverage, e.g. a test that the TT can survive
> a disk failure.