[
https://issues.apache.org/jira/browse/HDFS-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280956#comment-13280956
]
Ivan Kelly commented on HDFS-3452:
----------------------------------
Moved to HDFS.
I've thought about this a bit more. Your patch is a good start, but it actually
does more than we need in some parts. Really, the purpose of the locking is to
ensure that we do not add new entries without having read all previous entries.
Locking on the creation of inprogress znodes should be enough to ensure this.
Fencing should take care of any other cases.
startLogSegment should work as follows:
# Get the version (V) and content (C) of the writePermissions znode. C is the path of an inprogress znode (Z1), or null.
# If Z1 still exists, throw an exception. Otherwise proceed.
# Create the new inprogress znode (Z2) and its ledger.
# Write Z2 into the writePermissions znode, using V as the expected version so the update fails if anyone else has written in between.
finalizeLogSegment should read the writePermissions znode and clear it if its content matches the inprogress znode being finalized.
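In rough code (a sketch only; the znode handling, naming and error handling here are illustrative assumptions, not taken from your patch):
{code}
// Rough sketch only: assumes the writePermissions znode already exists (e.g. created
// when the journal is formatted) and stores the inprogress path as UTF-8 text.
import java.io.IOException;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

class StartLogSegmentSketch {
  private final ZooKeeper zkc;
  private final String writePermissionsPath;

  StartLogSegmentSketch(ZooKeeper zkc, String writePermissionsPath) {
    this.zkc = zkc;
    this.writePermissionsPath = writePermissionsPath;
  }

  // z2 is the path of the new inprogress znode to create for this segment.
  void startLogSegment(String z2)
      throws IOException, KeeperException, InterruptedException {
    // 1. Get the version (V) and content (C) of the writePermissions znode.
    Stat stat = new Stat();
    byte[] data = zkc.getData(writePermissionsPath, false, stat);
    int v = stat.getVersion();
    String z1 = (data == null || data.length == 0) ? null : new String(data, "UTF-8");

    // 2. If the recorded inprogress znode (Z1) still exists, refuse to start.
    if (z1 != null && zkc.exists(z1, false) != null) {
      throw new IOException("Inprogress node " + z1 + " already exists");
    }

    // 3. Create the new inprogress znode (Z2); ledger creation is elided here.
    zkc.create(z2, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

    // 4. Record Z2 in writePermissions, conditioned on version V. A BadVersionException
    //    here means another writer updated the znode between steps 1 and 4.
    zkc.setData(writePermissionsPath, z2.getBytes("UTF-8"), v);
  }

  // finalizeLogSegment: clear writePermissions only if it still points at the
  // segment being finalized.
  void finalizeLogSegment(String inprogressZnode)
      throws IOException, KeeperException, InterruptedException {
    Stat stat = new Stat();
    byte[] data = zkc.getData(writePermissionsPath, false, stat);
    String current = (data == null || data.length == 0) ? null : new String(data, "UTF-8");
    if (inprogressZnode.equals(current)) {
      zkc.setData(writePermissionsPath, new byte[0], stat.getVersion());
    }
  }
}
{code}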
So,
a) I think WritePermission should be called something more like
CurrentInprogress.
b) The interface should be something like
{code}
public interface CurrentInprogress {
  String readCurrent() throws IOException;       // returns the current inprogress znode path, or null
  void updateCurrent(String path) throws IOException;
  void clearCurrent() throws IOException;
}
{code}
c) This only ever needs to be used in startLogSegment. #clearCurrent is really
optional, but it's there for completeness.
d) #checkPermission is unnecessary. If something else has opened another
inprogress znode while we are writing, it should have closed the ledger we were
writing to, thereby fencing it and stopping any further writes from us.
e) The actual data stored in the znode should include a version number, a
hostname and then the path. This will make debugging easier (a rough sketch of
such a record follows after this list).
f) You have some tabs.
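For (e), something like the following record could be stored; the field separator and field order here are just illustrative assumptions, not from the patch:
{code}
// Illustrative record layout for the CurrentInprogress znode content:
//   <layoutVersion>;<hostname>;<inprogressZnodePath>
// Separator and field order are assumptions for this sketch only.
import java.net.InetAddress;
import java.net.UnknownHostException;

class CurrentInprogressRecord {
  static final int LAYOUT_VERSION = 1; // bump if the record format ever changes

  // Build the string stored by updateCurrent().
  static String encode(String inprogressPath) throws UnknownHostException {
    String hostname = InetAddress.getLocalHost().getHostName();
    return LAYOUT_VERSION + ";" + hostname + ";" + inprogressPath;
  }

  // Extract the inprogress znode path; the layout version and hostname
  // fields are only carried for debugging.
  static String decodePath(String record) {
    String[] fields = record.split(";", 3);
    return fields[2];
  }
}
{code}
readCurrent() would only hand back the path field; the layout version and hostname are purely there so it's obvious from the znode content who last updated it.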
> BKJM: Switch from standby to active fails and NN gets shut down due to delay in clearing of lock
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-3452
> URL: https://issues.apache.org/jira/browse/HDFS-3452
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: suja s
> Assignee: Uma Maheswara Rao G
> Priority: Blocker
> Attachments: BK-253-BKJM.patch
>
>
> Normal switch fails.
> (The BKJournalManager ZK session timeout is 3000 ms and the ZKFC session timeout is 5000 ms. By the time the NN comes to acquire the lock, the previous lock has not yet been released, so lock acquisition fails and the NN gets shut down. Ideally the switch should have succeeded.)
> =============================================================================
> 2012-05-09 20:15:29,732 ERROR org.apache.hadoop.contrib.bkjournal.WriteLock: Failed to acquire lock with /ledgers/lock/lock-0000000007, lock-0000000006 already has it
> 2012-05-09 20:15:29,732 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager@412beeec, stream=null))
> java.io.IOException: Could not acquire lock
> at org.apache.hadoop.contrib.bkjournal.WriteLock.acquire(WriteLock.java:107)
> at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverUnfinalizedSegments(BookKeeperJournalManager.java:406)
> at org.apache.hadoop.hdfs.server.namenode.JournalSet$6.apply(JournalSet.java:551)
> at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:322)
> at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:548)
> at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1134)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:598)
> at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1287)
> at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
> at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
> at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1219)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:978)
> at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
> at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
> 2012-05-09 20:15:29,736 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at HOST-XX-XX-XX-XX/XX.XX.XX.XX
> Scenario:
> Start ZKFCS, NNs
> NN1 is active and NN2 is standby
> Stop NN1. NN2 tries to transition to active and gets shut down