[ https://issues.apache.org/jira/browse/HADOOP-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-1605:
-------------------------------------

    Attachment: watcher.patch

This patch implements a separate process called the Watcher. It currently monitors only the NameNode. If the NameNode dies, the Watcher restarts it. A test case for this code is being written.

One question is whether this Watcher process is helpful in real deployments. It may well be that a cluster administrator wants to watch a variety of services, Hadoop being only one of them. In that case, the sysadmin would probably prefer to integrate the Hadoop servers into that existing framework. Most operating systems have software that can watch over a set of services.

> Automatic namenode restart when it encounters an error situation
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1605
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1605
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.15.0
>
>         Attachments: namenodeRestart4.patch, watcher.patch
>
>
> The namenode dies when it encounters an unexpected Runtime Exception.
> Instead, it can catch exceptions, clear all its internal data structures,
> and restart. This was attempted in HADOOP-1486 earlier.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
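
For readers unfamiliar with the idea, the Watcher described in the comment above (a separate process that restarts the NameNode when it dies) can be sketched roughly as follows. This is not the code in watcher.patch. The class name `WatcherSketch`, the `Service` interface, the restart cap, the fixed delay between restarts, and the rule that a clean exit (code 0) is not restarted are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of a supervising process; not taken from watcher.patch.
public class WatcherSketch {

    /** Something the watcher can start and block on; returns the exit code. */
    interface Service {
        int runUntilExit() throws IOException, InterruptedException;
    }

    /** Wraps a command line as a Service that runs it as a child process. */
    static Service fromCommand(List<String> command) {
        return () -> new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    /**
     * Starts the service and restarts it whenever it exits abnormally.
     * Assumed policy: exit code 0 means a deliberate shutdown (no restart),
     * and the watcher gives up after maxRestarts restarts.
     * Returns how many times the service was started.
     */
    static int watch(Service service, int maxRestarts, long delayMillis)
            throws InterruptedException {
        int starts = 0;
        while (true) {
            int exitCode;
            try {
                starts++;
                exitCode = service.runUntilExit();
            } catch (IOException e) {
                exitCode = -1; // treat a failed launch like a crash
            }
            if (exitCode == 0) {
                return starts; // clean shutdown, stop watching
            }
            if (starts > maxRestarts) {
                return starts; // too many failures, give up
            }
            Thread.sleep(delayMillis); // brief pause before restarting
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java WatcherSketch <command> [args...]");
            return;
        }
        // The command to supervise is supplied by the caller.
        int starts = watch(fromCommand(List.of(args)), 5, 1000L);
        System.out.println("service was started " + starts + " time(s)");
    }
}
```

As the comment itself points out, a deployment that already runs a general service-monitoring framework would likely delegate this loop to that framework instead of a Hadoop-specific process.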