[ 
https://issues.apache.org/jira/browse/HDFS-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097735#comment-17097735
 ] 

Konstantin Shvachko commented on HDFS-15323:
--------------------------------------------

This is the exception it throws:
{noformat:nowrap}
2020-04-30 02:07:19,087 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
java.lang.IllegalStateException: Cannot start writing at txid 449161330348 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@68063923
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:321)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1148)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1768)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:60)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1627)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1513)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:112)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:5409)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:622)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1027)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2373)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2369)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2369)
2020-04-30 02:07:19,100 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-04-30 02:07:19,101 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 
************************************************************/
{noformat}
The exception means that the SBN had not finished catching up to the final
state of the journal: the journal contains more transactions than the SBN
consumed while catching up.
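
To illustrate the condition behind this exception, here is a minimal Java
sketch. It is not the Hadoop source: the class {{OpenForWriteSketch}}, its
method signature, and the journal's last txid (449161330400) are hypothetical.
Only the txid 449161330348 and the message text are taken from the log above.

```java
/** Hypothetical, simplified model of the failed check; not the Hadoop source. */
public class OpenForWriteSketch {

    /**
     * Returns the txid at which writing would start.
     *
     * @param lastAppliedTxId last transaction the standby consumed (assumed input)
     * @param lastJournalTxId last transaction present in the journal (assumed input)
     * @throws IllegalStateException if the journal still holds unread transactions
     */
    static long openForWrite(long lastAppliedTxId, long lastJournalTxId) {
        long nextTxId = lastAppliedTxId + 1;
        // If the journal already contains nextTxId or anything later, the
        // standby has not consumed everything and must not start writing.
        if (lastJournalTxId >= nextTxId) {
            throw new IllegalStateException(
                "Cannot start writing at txid " + nextTxId
                + " when there is a stream available for read");
        }
        return nextTxId;
    }

    public static void main(String[] args) {
        // Fully caught up: the standby consumed everything in the journal.
        System.out.println(openForWrite(100L, 100L));

        // Behind the journal: the end txid 449161330400 is a made-up value.
        try {
            openForWrite(449161330347L, 449161330400L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the transition succeeds only when the standby has applied every
transaction the journal holds, which is the state the SBN had not reached here.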

> StandbyNode fails transition to active due to insufficient transaction tailing
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-15323
>                 URL: https://issues.apache.org/jira/browse/HDFS-15323
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, qjm
>    Affects Versions: 2.7.7
>            Reporter: Konstantin Shvachko
>            Priority: Major
>
> StandbyNode is asked to {{transitionToActive()}}. If it has fallen too far
> behind in tailing journal transactions (from QJM), it can crash with
> {{IllegalStateException}}.


