[
https://issues.apache.org/jira/browse/HDFS-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336048#comment-16336048
]
zhangwei edited comment on HDFS-13051 at 1/23/18 5:19 PM:
----------------------------------------------------------
*Yes*
*when secondnode send a rollEditlog rpc call*
*In FSNamesystem below method will be call and acquire writelock*
CheckpointSignature rollEditLog() throws IOException {
checkSuperuserPrivilege();
checkOperation(OperationCategory.JOURNAL);
writeLock();
try {
checkOperation(OperationCategory.JOURNAL);
checkNameNodeSafeMode("Log not rolled");
if (Server.isRpcInvocation())
{ LOG.info("Roll Edit Log from " + Server.getRemoteAddress()); }
return *getFSImage().rollEditLog();*
} finally
{ writeUnlock(); }
}
*finally it calls getFSImage().rollEditLog() -> FSEditlog.rollEditlog()*
*below is the code of* *FSEditlog.rollEditlog()***
synchronized long rollEditLog() throws IOException
{ LOG.info("Rolling edit logs"); endCurrentLogSegment(true); long nextTxId =
getLastWrittenTxId() + 1; startLogSegment(nextTxId, true); assert
curSegmentTxId == nextTxId; return nextTxId; }
*Remember!! This method will acquire FSEditlogAsync object lock.*
*In a case*
*At the same moment another thread concurrently take a RPCEdit log from queue
like below*
FSEditlogAsync run method
@Override
public void run() {
try {
while (true) {
boolean doSync;
Edit edit = *dequeueEdit();*
if (edit != null)
{ // sync if requested by edit log. doSync = *edit.logEdit();*
syncWaitQ.add(edit); }
else
{ // sync when editq runs dry, but have edits pending a sync. doSync =
!syncWaitQ.isEmpty(); }
*edit.logEdit method also need acquire FSEditlogAsync lock*
*as we know rollEditlog method already acquire* *FSEditlogAsync* *lock.*
*Let's go back to*
*FSEditlog.rollEditlog method*
*within rollEditlog method it will call below method*
endCurrentLogSegment(true)
public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
LOG.info("Ending log segment " + curSegmentTxId);
Preconditions.checkState(isSegmentOpen(),
"Bad state: %s", state);
if (writeEndTxn)
{ logEdit(LogSegmentOp.getInstance(cache.get(),
FSEditLogOpCodes.OP_END_LOG_SEGMENT)); }
// always sync to ensure all edits are flushed.
*logSyncAll();*
printStatistics(true)
*the key point is logSyncAll method*
@Override
public void logSyncAll() {
// doesn't actually log anything, just ensures that the queues are
// drained when it returns.
Edit edit = new SyncEdit(this, null){
@Override
public boolean logEdit()
{ return true; }
};
enqueueEdit(edit);
*edit.logSyncWait();*
}
*it will finally call edit.logSyncWait*
@Override
public void logSyncWait() {
synchronized(lock) {
while (!done) {
try
{ lock.wait(10); }
catch (InterruptedException e) {}
}
// only needed by tests that rely on ExitUtil.terminate() since
// normally exceptions terminate the NN.
if (syncEx != null)
{ syncEx.fillInStackTrace(); throw syncEx; }
}
}
*we can see this method will not exit until variable 'done' is true, in other
hand, Rolleditlog will not release the FSEditlogAsync object lock until
variable 'done' is set to ture*
*where does variable done set to true, look back the FSEditlogAsync.run method
again*
*within* *FSEditlogAsync.run method we can see below code segment*
logSync(getLastWrittenTxId());
} catch (RuntimeException ex)
{ syncEx = ex; }
while ((edit = syncWaitQ.poll()) != null)
{ e*dit.logSyncNotify(syncEx);* }
}
}
@Override
public void logSyncNotify(RuntimeException ex) {
synchronized(lock)
{ done = true; syncEx = ex; lock.notifyAll(); }
}
*so only logSyncNotify method will set variable done to ture, but this method
is call at the end of run method. before this code segment is executed, we must
get the FSEditlogAsync object lock firstly to call the edit.logEdit method.*
private abstract static class Edit {
final FSEditLog log;
final FSEditLogOp op;
Edit(FSEditLog log, FSEditLogOp op)
{ this.log = log; this.op = op; }
// return whether edit log wants to sync.
*boolean logEdit() {*
return log.doEditTransaction(op);
}
*synchronized* boolean doEditTransaction(final FSEditLogOp op) {
long start = beginTransaction();
op.setTransactionId(txid);
try
{ editLogStream.write(op); }
catch (IOException ex)
{ // All journals failed, it is handled in logSync. }
finally
{ op.reset(); }
endTransaction(start);
return shouldForceSync();
}
*So the RollEditlog will not release the lock until run method set the variable
done to true.*
*but run method block on the lock which RollediltLog method hold to handle the
edit take from queue.*
*that's the dead lock, it will case namenode crash in any time.*
was (Author: photogamrun):
*Yes*
*when secondnode send a rollEditlog rpc call*
*In FSNamesystem below method will be call and acquire writelock*
CheckpointSignature rollEditLog() throws IOException {
checkSuperuserPrivilege();
checkOperation(OperationCategory.JOURNAL);
writeLock();
try {
checkOperation(OperationCategory.JOURNAL);
checkNameNodeSafeMode("Log not rolled");
if (Server.isRpcInvocation())
{ LOG.info("Roll Edit Log from " + Server.getRemoteAddress()); }
return *getFSImage().rollEditLog();*
} finally
{ writeUnlock(); }
}
*finally it calls getFSImage().rollEditLog() -> FSEditlog.rollEditlog()*
*below is the code of* *FSEditlog.rollEditlog()***
synchronized long rollEditLog() throws IOException
{ LOG.info("Rolling edit logs"); endCurrentLogSegment(true); long nextTxId =
getLastWrittenTxId() + 1; startLogSegment(nextTxId, true); assert
curSegmentTxId == nextTxId; return nextTxId; }
*Remember!! This method will acquire FSEditlogAsync object lock.*
*In a case*
*At the same moment another thread concurrently take a RPCEdit log from queue
like below*
FSEditlogAsync run method
@Override
public void run() {
try {
while (true) {
boolean doSync;
Edit edit = *dequeueEdit();*
if (edit != null)
{ // sync if requested by edit log. doSync = *edit.logEdit();*
syncWaitQ.add(edit); }
else
{ // sync when editq runs dry, but have edits pending a sync. doSync =
!syncWaitQ.isEmpty(); }
*edit.logEdit method also need acquire FSEditlogAsync lock*
*as we know rollEditlog method already acquire* *FSEditlogAsync* *lock.*
*Let's go back to*
*FSEditlog.rollEditlog method*
*within rollEditlog method it will call below method*
endCurrentLogSegment(true)
public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
LOG.info("Ending log segment " + curSegmentTxId);
Preconditions.checkState(isSegmentOpen(),
"Bad state: %s", state);
if (writeEndTxn)
{ logEdit(LogSegmentOp.getInstance(cache.get(),
FSEditLogOpCodes.OP_END_LOG_SEGMENT)); }
// always sync to ensure all edits are flushed.
*logSyncAll();*
printStatistics(true)
*the key point is logSyncAll method*
@Override
public void logSyncAll() {
// doesn't actually log anything, just ensures that the queues are
// drained when it returns.
Edit edit = new SyncEdit(this, null){
@Override
public boolean logEdit()
{ return true; }
};
enqueueEdit(edit);
*edit.logSyncWait();*
}
*it will finally call edit.logSyncWait*
@Override
public void logSyncWait() {
synchronized(lock) {
while (!done) {
try
{ lock.wait(10); }
catch (InterruptedException e) {}
}
// only needed by tests that rely on ExitUtil.terminate() since
// normally exceptions terminate the NN.
if (syncEx != null)
{ syncEx.fillInStackTrace(); throw syncEx; }
}
}
*we can this method will not exit until variable 'done' is true, in other hand,
Rolleditlog will not release the FSEditlogAsync object lock until variable
'done' is set to ture*
*where does variable done set to true, look back the FSEditlogAsync.run method
again*
*within* *FSEditlogAsync.run method we can see below code segment*
logSync(getLastWrittenTxId());
} catch (RuntimeException ex)
{ syncEx = ex; }
while ((edit = syncWaitQ.poll()) != null)
{ e*dit.logSyncNotify(syncEx);* }
}
}
@Override
public void logSyncNotify(RuntimeException ex) {
synchronized(lock)
{ done = true; syncEx = ex; lock.notifyAll(); }
}
*so only logSyncNotify method will set variable done to ture, but this method
is call at the end of run method. before this code segment is executed, we must
get the FSEditlogAsync object lock firstly to call the edit.logEdit method.*
private abstract static class Edit {
final FSEditLog log;
final FSEditLogOp op;
Edit(FSEditLog log, FSEditLogOp op)
{ this.log = log; this.op = op; }
// return whether edit log wants to sync.
*boolean logEdit() {*
return log.doEditTransaction(op);
}
*synchronized* boolean doEditTransaction(final FSEditLogOp op) {
long start = beginTransaction();
op.setTransactionId(txid);
try
{ editLogStream.write(op); }
catch (IOException ex)
{ // All journals failed, it is handled in logSync. }
finally
{ op.reset(); }
endTransaction(start);
return shouldForceSync();
}
*So the RollEditlog will not release the lock until run method set the variable
done to true.*
*but run method block on the lock which RollediltLog method hold to handle the
edit take from queue.*
*that's the dead lock, it will case namenode crash in any time.*
> dead lock occurs when using async editlog while rolleditlog rpc call happen
> ---------------------------------------------------------------------------
>
> Key: HDFS-13051
> URL: https://issues.apache.org/jira/browse/HDFS-13051
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.5
> Reporter: zhangwei
> Priority: Major
>
> When we apply AsyncEditlog in our prod env, in extremely case, deadlock will
> occur and crash the namenode. in the case, rollEditlog rpc call happen while
> AsyncEditlog asynchronized takes edit object from queue and is going to
> acquire AsyncEditlog object lock to executed it's logEdit method.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]