[
https://issues.apache.org/jira/browse/HADOOP-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643194#action_12643194
]
Steve Loughran commented on HADOOP-4532:
----------------------------------------
Stack trace: FSImage does not like to be interrupted.
[sf-startdaemon-debug] 08/10/28 12:50:22 [Thread-305] ERROR common.Storage : Cannot write file /tmp/hadoop/dfs/name
[sf-startdaemon-debug] java.nio.channels.ClosedByInterruptException
[sf-startdaemon-debug] at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
[sf-startdaemon-debug] at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:271)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.write(Storage.java:268)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.write(Storage.java:244)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSImage.rollFSImage(FSImage.java:1316)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1034)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:165)
[sf-startdaemon-debug] at org.apache.hadoop.hdfs.server.namenode.NameNode.innerStart(NameNode.java:226)
[sf-startdaemon-debug] at org.apache.hadoop.util.Service.start(Service.java:188)
[sf-startdaemon-debug] at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.innerDeploy(HadoopServiceImpl.java:479)
[sf-startdaemon-debug] at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.access$000(HadoopServiceImpl.java:46)
[sf-startdaemon-debug] at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl$ServiceDeployerThread.execute(HadoopServiceImpl.java:628)
> Interrupting the namenode thread triggers System.exit()
> -------------------------------------------------------
>
> Key: HADOOP-4532
> URL: https://issues.apache.org/jira/browse/HADOOP-4532
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.20.0
> Reporter: Steve Loughran
> Priority: Minor
>
> My service setup/teardown tests are managing to trigger system exits in the
> namenode, which seems overkill.
> 1. Interrupting the thread that is starting the namesystem up raises a
> java.nio.channels.ClosedByInterruptException.
> 2. This is caught in FSImage.rollFSImage, and handed off to processIOError
> 3. This triggers a call to Runtime.getRuntime().exit(-1); "All storage
> directories are inaccessible.".
> Stack trace to follow. Exiting the JVM is somewhat overkill; if someone has
> interrupted the thread, it is (presumably) because they want to stop the
> namenode, which may not imply they want to kill the JVM at the same time.
> Certainly JUnit does not expect it.
> Some possibilities:
> - ClosedByInterruptException gets handled differently, as some form of
> shutdown request
> - Calls to System.exit are factored out into something that can have its
> behaviour changed by policy options to throw a RuntimeException instead.
> Hosting a Namenode in a security manager that blocks off System.exit() is the
> simplest workaround; this is fairly easy to set up, but it means that what
> would be a straight exit now gets turned into an exception, so callers may be
> surprised by what happens.
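The second possibility above, factoring exit calls out behind a policy hook, could look roughly like this. This is a hypothetical sketch, not existing Hadoop code; the names `ExitPolicy`, `ExitException`, `disableSystemExit` and `terminate` are invented for illustration:

```java
// Hypothetical sketch: route all exit calls through one helper whose
// behaviour can be changed by policy, so embedded deployments (tests,
// SmartFrog-hosted services) get an exception instead of a JVM exit.
public class ExitPolicy {
    /** Thrown instead of exiting when System.exit is disabled. */
    public static class ExitException extends RuntimeException {
        public final int status;
        public ExitException(int status, String msg) {
            super(msg);
            this.status = status;
        }
    }

    private static volatile boolean exitDisabled = false;

    /** Called once by an embedding host or test harness. */
    public static void disableSystemExit() {
        exitDisabled = true;
    }

    /** Call sites use this instead of Runtime.getRuntime().exit(status). */
    public static void terminate(int status, String msg) {
        if (exitDisabled) {
            throw new ExitException(status, msg);
        }
        Runtime.getRuntime().exit(status);
    }
}
```

With this in place, a call such as processIOError's `Runtime.getRuntime().exit(-1)` would become `ExitPolicy.terminate(-1, "All storage directories are inaccessible.")`, and a JUnit harness could disable exits up front and assert on the exception instead.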
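The security-manager workaround described above can be sketched as follows. It assumes a JVM where installing a SecurityManager is still permitted (standard at the time of this report; later JDKs deprecate it); the class name `NoExitSecurityManager` is invented for illustration:

```java
import java.security.Permission;

// Sketch of the workaround: a SecurityManager whose checkExit vetoes
// System.exit() by throwing a SecurityException, so an embedded
// Namenode cannot take the whole JVM down with it.
public class NoExitSecurityManager extends SecurityManager {
    @Override
    public void checkExit(int status) {
        // Turn the would-be JVM exit into an exception the caller can catch.
        throw new SecurityException("Blocked System.exit(" + status + ")");
    }

    @Override
    public void checkPermission(Permission perm) {
        // Permit everything else; only exits are of interest here.
    }
}
```

Install it with `System.setSecurityManager(new NoExitSecurityManager())` before starting the Namenode; as noted above, callers must then be prepared for a SecurityException where a straight exit used to happen.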
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.