[ https://issues.apache.org/jira/browse/HDFS-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267248#comment-13267248 ]
amith commented on HDFS-3347:
-----------------------------
Hi,
The problem is with transitionToActive for the NameNode.
Analysis:
In the following code,
{code}
try {
editLogStream.write(op);
} catch (IOException ex) {
// All journals failed, it is handled in logSync.
}
{code}
editLogStream is found to be null!
I traced the problem to transitionToActive of the NN,
in
{code}
void startActiveServices() throws IOException {
  LOG.info("Starting services required for active state");
  writeLock();
  try {
    FSEditLog editLog = dir.fsImage.getEditLog();
    if (!editLog.isOpenForWrite()) {
      // During startup, we're already open for write during initialization.
      // The IAE thrown here causes the transition failure.
      editLog.initJournalsForWrite();
      // ...
      // editLogStream is initialised here; this is skipped because of the IAE.
      dir.fsImage.editLog.openForWrite();
    }
    // ...
}
{code}
FSEditLog#startLogSegment() populates the 'editLogStream' variable; when the
NN starts up directly as active, this is done during startup itself.
Here editLog.initJournalsForWrite() can fail, throwing:
{code}
java.lang.IllegalArgumentException: No class configured for bkpr
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.getJournalClass(FSEditLog.java:1204)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1218)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:242)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournalsForWrite(FSEditLog.java:210)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:596)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1246)
    at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
    at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1178)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:981)
    at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:80)
    at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:2827)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:428)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:905)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1684)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1682)
{code}
This exception caused the initialization to fail!
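To illustrate how an exception of this kind can arise, here is a minimal sketch of resolving a journal class from the scheme of an edits URI. It is not the Hadoop code: the class JournalLookupSketch, the method journalClassFor, the plain Map standing in for the configuration, the example URI, and the class name com.example.HypotheticalJournalManager are all hypothetical, and the key prefix is only my assumption about how the per-scheme setting is named.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of a scheme-to-class lookup; not Hadoop code. */
class JournalLookupSketch {
    // Assumed key prefix, used here for illustration only.
    static final String PLUGIN_PREFIX = "dfs.namenode.edits.journal-plugin";

    /** Returns the class name configured for the URI's scheme, or throws IAE. */
    static String journalClassFor(URI uri, Map<String, String> conf) {
        String scheme = uri.getScheme();
        String className = conf.get(PLUGIN_PREFIX + "." + scheme);
        if (className == null) {
            // Same shape of failure as in the trace above.
            throw new IllegalArgumentException("No class configured for " + scheme);
        }
        return className;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        URI uri = URI.create("bkpr://host:2181/ledgers"); // hypothetical URI

        try {
            journalClassFor(uri, conf); // nothing configured for "bkpr"
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Once a class is registered for the scheme, the lookup succeeds.
        conf.put(PLUGIN_PREFIX + ".bkpr", "com.example.HypotheticalJournalManager");
        System.out.println(journalClassFor(uri, conf));
    }
}
```

In this sketch the lookup fails only because no entry exists for the scheme of the URI, which matches the message in the trace.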
If I check the status of the NN using the command
./hdfs haadmin -getServiceState nn1
active
the reply is active!
Any write operation that logs to the edits now causes the NPE.
I feel that if a transition fails due to an exception, the NN should remain in its
old state (here, standby) :)
Please correct me if I am wrong.
> NullPointerException When trying to log to editstreams
> ------------------------------------------------------
>
> Key: HDFS-3347
> URL: https://issues.apache.org/jira/browse/HDFS-3347
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Environment: HDFS
> Reporter: amith
> Assignee: amith
>
> When I try to create a file, I get an exception:
> {code}
> 2012-05-02 17:42:55,768 DEBUG hdfs.StateChange (NameNodeRpcServer.java:create(402)) - *DIR* NameNode.create: file /a._COPYING_ for DFSClient_NONMAPREDUCE_1515782500_1 at 10.18.40.95
> 2012-05-02 17:42:55,770 DEBUG hdfs.StateChange (FSNamesystem.java:startFileInternal(1547)) - DIR* NameSystem.startFile: src=/a._COPYING_, holder=DFSClient_NONMAPREDUCE_1515782500_1, clientMachine=10.18.40.95, createParent=true, replication=1, createFlag=[CREATE, OVERWRITE]
> 2012-05-02 17:42:55,778 WARN ipc.Server (Server.java:run(1701)) - IPC Server handler 1 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.18.40.95:37973: error: java.lang.NullPointerException
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logEdit(FSEditLog.java:348)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logGenerationStamp(FSEditLog.java:755)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStamp(FSNamesystem.java:4357)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1621)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1509)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:409)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:200)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42590)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:428)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:905)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1684)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1682)
> {code}
> I am analysing this and will provide the details soon.