Hi Srini,

Thanks for your review!
On 10-07-23 15:27, Srinivas Eeda wrote:
> thanks for making this patch, it looks good, just a few minor changes
> about comments
>
> On 7/23/2010 5:15 AM, Wengang Wang wrote:
> >In the following situation, an incorrect bit remains in the refmap on
> >the recovery master. The recovery master will eventually fail to purge
> >the lockres because of that incorrect refmap bit.
> >
> >1) node A has no interest in lockres A any longer, so it is purging it.
> >2) the owner of lockres A is node B, so node A is sending a deref
> >message to node B.
> >3) at this time, node B crashes. node C becomes the recovery master and
> >recovers lockres A (because its master is the dead node B).
> >4) node A migrates lockres A to node C with a refbit there.
> >5) node A fails to send the deref message to node B because node B
> >crashed. The failure is ignored, and no further action is taken for
> >lockres A.
> >
> >Normally, re-sending the deref message to the recovery master could fix
> >this. However, ignoring the failure of the deref to the original master
> >and not recovering the lockres to the recovery master has the same
> >effect, and the latter is simpler.
> >
> >Signed-off-by: Wengang Wang <[email protected]>
> >---
> > fs/ocfs2/dlm/dlmrecovery.c |   17 +++++++++++++++--
> > fs/ocfs2/dlm/dlmthread.c   |   28 +++++++++++++++++-----------
> > 2 files changed, 32 insertions(+), 13 deletions(-)
> >
> >diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> >index 9dfaac7..06640f6 100644
> >--- a/fs/ocfs2/dlm/dlmrecovery.c
> >+++ b/fs/ocfs2/dlm/dlmrecovery.c
> >@@ -1997,6 +1997,8 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
> > 	struct list_head *queue;
> > 	struct dlm_lock *lock, *next;
> >+	assert_spin_locked(&dlm->spinlock);
> >+	assert_spin_locked(&res->spinlock);
> > 	res->state |= DLM_LOCK_RES_RECOVERING;
> > 	if (!list_empty(&res->recovering)) {
> > 		mlog(0,
> >@@ -2336,9 +2338,20 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
> > 			/* the wake_up for this will happen when the
> > 			 * RECOVERING flag is dropped later */
> remove above comment as it doesn't seem to be relevant anymore.

OK.

> >-			res->state &= ~DLM_LOCK_RES_DROPPING_REF;
> >+			if (res->state & DLM_LOCK_RES_DROPPING_REF) {
> >+				/*
> >+				 * don't migrate a lockres which is in
> >+				 * progress of dropping ref
> >+				 */
> move this comment to before the if condition

OK.

> >+				mlog(ML_NOTICE, "%.*s ignored for "
> >+				     "migration\n", res->lockname.len,
> >+				     res->lockname.name);
> This information only helps us in diagnosing any related issue and
> is not helpful in normal cases. So it should be 0 instead of
> ML_NOTICE.

I think this happens only in rare cases, so it won't write many such
logs, but it could help in diagnosing problems when something wrong
does take place. mlog(0, ...) helps almost nothing.

> >+				res->state &=
> >+					~DLM_LOCK_RES_DROPPING_REF;
> we don't need to clear this state as dlm_purge_lockres removes it.

OK.

regards,
wengang.

> >+			} else
> >+				dlm_move_lockres_to_recovery_list(dlm,
> >+								  res);
> >-			dlm_move_lockres_to_recovery_list(dlm, res);
> > 		} else if (res->owner == dlm->node_num) {
> > 			dlm_free_dead_locks(dlm, res, dead_node);
> > 			__dlm_lockres_calc_usage(dlm, res);
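P.S. For the record, a rough sketch of how that branch of
dlm_do_local_recovery_cleanup() might end up looking with the review
comments folded in (comment hoisted above the if, state bit left for
dlm_purge_lockres() to clear). This is only an illustration pieced
together from the quoted hunks, not the resubmitted patch; surrounding
context is elided and the log level is still under discussion:

	/*
	 * Don't migrate a lockres that is in the process of dropping
	 * its ref; otherwise the recovery master would end up with a
	 * stale refmap bit for a lockres we are purging anyway.
	 */
	if (res->state & DLM_LOCK_RES_DROPPING_REF) {
		/* ML_NOTICE vs. mlog(0, ...) still being debated */
		mlog(ML_NOTICE, "%.*s ignored for migration\n",
		     res->lockname.len, res->lockname.name);
		/* DLM_LOCK_RES_DROPPING_REF is left set here;
		 * dlm_purge_lockres() clears it */
	} else
		dlm_move_lockres_to_recovery_list(dlm, res);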
