[ https://issues.apache.org/jira/browse/HDFS-15348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103796#comment-17103796 ]
xuzq commented on HDFS-15348:
-----------------------------

Maybe we can turn off `dfs.ha.tail-edits.in-progress` when the Standby transitions to Active, and turn it back on when the Active transitions to Standby.

> [SBN Read] IllegalStateException happened when doing failover
> -------------------------------------------------------------
>
>                 Key: HDFS-15348
>                 URL: https://issues.apache.org/jira/browse/HDFS-15348
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: xuzq
>            Priority: Major
>
> The Standby shuts down during failover and throws an IllegalStateException.
> `getJournaledEdits` returns at most `dfs.ha.tail-edits.qjm.rpc.max-txns` edits,
> so `catchupDuringFailover()` fails to replay all edits.
>
> The `streams.isEmpty()` check in `FSEditLog#openForWrite` then throws this exception.
> The exception looks like:
>
> {code:java}
> 2020-05-10 09:20:02,235 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode IPC Server handler 763 on 8022: Error encountered requiring NN shutdown. Shutting down immediately.
> java.lang.IllegalStateException: Cannot start writing at txid 173922195318 when there is a stream available for read: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@47b73995
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:320)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1352)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1890)
>         at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>         at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
>         at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1763)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1605)
> {code}
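
The failure described in the quoted report can be illustrated with a small toy model. This is only a sketch and not Hadoop code: `ToyJournal`, `catchUpOnce`, `catchUpFully` and the simplified `openForWrite` are hypothetical names invented here, and the numbers (12 edits, a cap of 5) are arbitrary. It assumes that catch-up during failover performs a single capped fetch, which is how the report describes the problem. The draining variant is included only to show that the check passes once no unread edits remain. It is not the change proposed in the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reported failure. None of these names are Hadoop APIs. */
class FailoverCatchupSketch {

    /** Hypothetical stand-in for a journal holding committed txids 1..lastTxId. */
    static final class ToyJournal {
        private final long lastTxId;

        ToyJournal(long lastTxId) {
            this.lastTxId = lastTxId;
        }

        /** Returns at most maxTxns txids starting at fromTxId, like a capped RPC fetch. */
        List<Long> fetch(long fromTxId, int maxTxns) {
            List<Long> batch = new ArrayList<>();
            for (long t = fromTxId; t <= lastTxId && batch.size() < maxTxns; t++) {
                batch.add(t);
            }
            return batch;
        }
    }

    /** Performs one capped fetch and returns the last txid applied. */
    static long catchUpOnce(ToyJournal journal, long lastApplied, int maxTxns) {
        List<Long> batch = journal.fetch(lastApplied + 1, maxTxns);
        return batch.isEmpty() ? lastApplied : batch.get(batch.size() - 1);
    }

    /** Repeats capped fetches until the journal has nothing left to read. */
    static long catchUpFully(ToyJournal journal, long lastApplied, int maxTxns) {
        while (true) {
            long next = catchUpOnce(journal, lastApplied, maxTxns);
            if (next == lastApplied) {
                return lastApplied;
            }
            lastApplied = next;
        }
    }

    /** Simplified analogue of the streams.isEmpty() check: refuse to write while edits are unread. */
    static void openForWrite(ToyJournal journal, long lastApplied) {
        if (!journal.fetch(lastApplied + 1, 1).isEmpty()) {
            throw new IllegalStateException("Cannot start writing at txid " + (lastApplied + 1)
                    + " when there is a stream available for read");
        }
    }

    public static void main(String[] args) {
        ToyJournal journal = new ToyJournal(12); // 12 committed edits (arbitrary)
        int maxTxns = 5;                         // per-fetch cap (arbitrary)

        long afterOnce = catchUpOnce(journal, 0, maxTxns);
        try {
            openForWrite(journal, afterOnce);
        } catch (IllegalStateException e) {
            System.out.println("single pass: " + e.getMessage());
        }

        long afterLoop = catchUpFully(journal, 0, maxTxns);
        openForWrite(journal, afterLoop); // no unread edits remain, so this does not throw
        System.out.println("looped catch-up reached txid " + afterLoop);
    }
}
```

In this model a single capped pass applies txids 1 to 5 and leaves 6 to 12 unread, so the write check fails at txid 6. After a full drain the check passes at txid 12.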