[
https://issues.apache.org/jira/browse/HADOOP-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643246#action_12643246
]
Steve Loughran commented on HADOOP-4532:
----------------------------------------
Intercepting System.exit() calls stops the JVM from exiting; the namenode now
transitions to the FAILED state instead, as the log below shows.
http://jira.smartfrog.org/jira/browse/SFOS-1016
[sf-startdaemon-debug] 08/10/28 16:18:21 [Thread-305] INFO common.Storage : Image file of size 93 saved in 0 seconds.
[sf-startdaemon-debug] 08/10/28 16:18:21 [Thread-305] ERROR namenode.FSNamesystem : FSNamesystem initialization failed.
[sf-startdaemon-debug] java.nio.channels.ClosedByInterruptException
[sf-startdaemon-debug] at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
[sf-startdaemon-debug] at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:317)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSEditLog$EditLogFileOutputStream.<init>(FSEditLog.java:128)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSEditLog.createEditLogFile(FSEditLog.java:343)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1030)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:165)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.NameNode.innerStart(NameNode.java:226)
[sf-startdaemon-debug] at org.apache.hadoop.util.Service.start(Service.java:188)
[sf-startdaemon-debug] at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.innerDeploy(HadoopServiceImpl.java:479)
[sf-startdaemon-debug] at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.access$000(HadoopServiceImpl.java:46)
[sf-startdaemon-debug] at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl$ServiceDeployerThread.execute(HadoopServiceImpl.java:628)
[sf-startdaemon-debug] at org.smartfrog.sfcore.utils.SmartFrogThread.run(SmartFrogThread.java:279)
[sf-startdaemon-debug] at org.smartfrog.sfcore.utils.WorkflowThread.run(WorkflowThread.java:117)
[sf-startdaemon-debug] 08/10/28 16:18:21 [Thread-305] INFO namenode.NameNode : State change: NameNode is now FAILED
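For anyone wanting to reproduce the exception at the top of that trace without a namenode, here is a minimal sketch. It assumes nothing about Hadoop: the class name InterruptedChannelDemo and the temp file are made up for illustration, and it only shows that calling FileChannel.size() on a thread whose interrupt flag is already set closes the channel and raises ClosedByInterruptException, as the trace suggests.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Hadoop or SmartFrog.
class InterruptedChannelDemo {

    /** Returns true if FileChannel.size() failed with ClosedByInterruptException. */
    static boolean sizeFailsWhenInterrupted() {
        try {
            Path tmp = Files.createTempFile("editlog-demo", ".tmp");
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                // Stand-in for a teardown interrupting the thread that is starting up.
                Thread.currentThread().interrupt();
                channel.size();
                return false;
            } catch (ClosedByInterruptException e) {
                return true;
            } finally {
                // Clear the interrupt flag so the caller is not affected, then clean up.
                Thread.interrupted();
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ClosedByInterruptException raised: " + sizeFailsWhenInterrupted());
    }
}
```

Because ClosedByInterruptException is an IOException, any generic I/O error handler will treat the interrupt as a storage failure unless it checks for this type first.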
> Interrupting the namenode thread triggers System.exit()
> -------------------------------------------------------
>
> Key: HADOOP-4532
> URL: https://issues.apache.org/jira/browse/HADOOP-4532
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.20.0
> Reporter: Steve Loughran
> Priority: Minor
>
> My service setup/teardown tests are managing to trigger system exits in the
> namenode, which seems overkill.
> 1. Interrupting the thread that is starting the namesystem up raises a
> java.nio.channels.ClosedByInterruptException.
> 2. This is caught in FSImage.rollFSImage, and handed off to processIOError.
> 3. This triggers a call to Runtime.getRuntime().exit(-1), with the message
> "All storage directories are inaccessible."
> Stack trace to follow. Exiting the JVM is somewhat overkill; if someone has
> interrupted the thread it is (presumably) because they want to stop the
> namenode, which may not imply they want to kill the JVM at the same time.
> Certainly JUnit does not expect it.
> Some possibilities:
> - ClosedByInterruptException gets handled differently, as some form of
> shutdown request
> - Calls to system exit are factored out into something that can have its
> behaviour changed by policy options to throw a RuntimeException instead.
> Hosting a Namenode under a security manager that blocks System.exit() is the
> simplest workaround, but it means that what would have been a straight exit
> now gets turned into an exception, so callers may be surprised by what
> happens.
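The second possibility in the description (factoring exit calls out behind a policy switch) could look roughly like the sketch below. This is an illustration only, not code from Hadoop: ExitPolicy, ExitException, setExitJvm and terminate are hypothetical names, and the default is assumed to be the existing exit-the-JVM behaviour.

```java
// Hypothetical helper; the names here are assumptions, not existing Hadoop APIs.
class ExitPolicy {

    /** Thrown in place of a JVM exit when the policy forbids exiting. */
    static class ExitException extends RuntimeException {
        final int status;

        ExitException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    // Assumed default: keep the current behaviour and exit the JVM.
    private static volatile boolean exitJvm = true;

    /** Test harnesses and hosting containers would call this with false. */
    static void setExitJvm(boolean allowed) {
        exitJvm = allowed;
    }

    /** Single choke point to call wherever the code currently exits directly. */
    static void terminate(int status, String message) {
        if (exitJvm) {
            System.err.println(message);
            Runtime.getRuntime().exit(status);
        }
        throw new ExitException(status, message);
    }
}
```

A JUnit setup method or a container hosting the namenode would call ExitPolicy.setExitJvm(false) once, after which terminate(-1, "All storage directories are inaccessible.") surfaces as an ExitException carrying the status code rather than killing the process.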