[jira] [Updated] (MAPREDUCE-3121) NodeManager should handle disk-failures

Vinod Kumar Vavilapalli (Updated) (JIRA) Mon, 07 Nov 2011 22:13:20 -0800

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vinod Kumar Vavilapalli updated MAPREDUCE-3121:
-----------------------------------------------

    Status: Open  (was: Patch Available)

My comments on the patch.
----
*Major:*
DiskHealthCheckerService
 - Let us not use the configuration property to read the dir-list anywhere. I 
already see a couple of places where this can be trouble-some. Make sure that 
YarnConfiguration.NM_LOCAL_DIRS and YarnConfiguration.NM_LOG_DIRS aren't read 
anywhere after the initial NM-start.
   -- After this, you should definitely drop 
DiskHealthCheckerService.updateDirsInConfiguration().
 - ResourceLocalizationService.localDirsSelector doesn't look at healthy disks, 
same for ShuffleHandler.
 - Let's have a single API to return local-dirs either Array or a list. Let the 
caller take care of any conversions they need. Ditto about 
getLocalDirsString()/getLogDirsString().
 - Same as above for the LocalStorage class and getNumDirs() methods in there.
 - Instead of setting the disksInfo in NodePage, we can set it directly in the 
health-report via NodeStatusUpdater. That way, RM will also get the report.

bq. IMO, one approach we could take is to do a check at the top level. For 
example, ContainersLaunch as per the patch does a sanity check on available 
disks and bails if there is an issue. It could create a snapshot of available 
disks at this point.
Agreed, we should check at ContainersLauncherImpl level before launching every 
container and send out a container-failed event in case all disks have gone bad.
----
*Minor:*
DiskHealthCheckerService
 - DisksHealthMonitorExecutor can be static and final.
 - Drop the constructor DiskHealthCheckerService(Configuration conf) and use 
init() life-cycle directly.
 - init(Configuration conf): call super.init() at the end so that it can set 
the service-state correctly. Same for start(), stop()
 - Declare diskHealthChecker as a DisksHealthMonitorExecutor instead of 
TimerTask so as to avoid unnecessary casting in get{Log|Local}Dirs()
 - LocalStorage is way too general. Let's rename LocalStorage to 
LocalDirectories?

Let's have multiple local and log-dirs by default in MiniYarnCluster.

TestDiskFailures:
 - No need for hard-coded 2 secs sleep, MiniYarnCluster already has code to 
wait for starting.
 - waitForDiskHealthCheck() is error-prone. You can set an inline 
DiskHealthChecker and invoke explicit checking of disks.
----
*Miscellaneous:*
Like I summarized in my first comment, there are more pending items. Let me 
know if you want to tackle them separately:
 - We need a configuration for the max-percentage of disks that can fail 
without shutting down NM.
 - If all or a major percentage of disks turn bad at startup, NodeManager 
should either fail to start or get started, and report the RM that it is bad 
and needs operator intervention.
 - If all or a major percentage of disks turn bad during runtime, NodeManager 
can update the RM and ask for a halt in the scheduling.

                
> NodeManager should handle disk-failures
> ---------------------------------------
>
>                 Key: MAPREDUCE-3121
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3121
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Ravi Gummadi
>             Fix For: 0.23.1
>
>         Attachments: 3121.patch
>
>
> This is akin to MAPREDUCE-2413 but for YARN's NodeManager. We want to 
> minimize the impact of transient/permanent disk failures on containers. With 
> larger number of disks per node, the ability to continue to run containers on 
> other disks is crucial.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-3121) NodeManager should handle disk-failures

Reply via email to