Looks good. Thanks, Junxiao, for fixing this.

Reviewed-by: Srinivas Eeda <srinivas.e...@oracle.com>
On 02/25/2014 11:47 PM, Junxiao Bi wrote:
> This issue was introduced by commit 800deef3, which replaced list_for_each
> with list_for_each_entry. The variable "lock" will point to invalid data if
> the "tmpq" list is empty, and a panic will be triggered because of this.
> Sunil advised reverting it, but the old version was also not right: at the
> end of the outer for loop, that list_for_each_entry will also leave "lock"
> pointing at invalid data, so in the next iteration, if the "tmpq" list is
> empty, "lock" will be stale invalid data and cause the panic. So revert to
> list_for_each and reset "lock" to NULL to fix this issue.
>
> Another concern is that this seems like it cannot happen, because the "tmpq"
> list should not be empty. Let me describe how it can.
>
> Imagine there's a lockres with an EX lock from node 2 in the granted list,
> and an NR lock from node x with convert_type EX in the converting list.
>
> old lock resource owner (node 1):          migration target (node 2):
>
> dlm_empty_lockres() {
>  dlm_pick_migration_target() {
>   pick node 2 as target as its lock is
>   the first one in granted list.
>  }
>  dlm_migrate_lockres() {
>   dlm_mark_lockres_migrating() {
>    res->state |= DLM_LOCK_RES_BLOCK_DIRTY;
>    wait_event(dlm->ast_wq,
>        !dlm_lockres_is_dirty(dlm, res));
>    // after the above code, we can not dirty
>    // lockres any more, so dlm_thread shuffle
>    // list will not run
>                                            downconvert lock from EX to NR
>                                            upconvert lock from NR to EX
>
> <<< migration may schedule out here; then node 2 sends a downconvert
> <<< request to convert the type from EX to NR, then sends an upconvert
> <<< request to convert the type from NR to EX. At this time, the lockres
> <<< granted list is empty, and two locks sit in the converting list:
> <<< node x's upconvert lock followed by node 2's upconvert lock.
>    // will set lockres RES_MIGRATING flag, the following
>    // lock/unlock can not run
>    dlm_lockres_release_ast(dlm, res);
>   }
>
>   dlm_send_one_lockres()
>                                            dlm_process_recovery_data()
>                                             for (i = 0; i < mres->num_locks; i++)
>                                              if (ml->node == dlm->node_num)
>                                               for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
>                                                list_for_each_entry(lock, tmpq, list)
>                                                 if (lock) break; <<< lock is invalid as granted list is empty
>                                               }
>                                               if (lock->ml.node != ml->node)
>                                                BUG() >>> crash here
> }
>
> I see the above locks status from a vmcore of our internal bug.
>
> Signed-off-by: Junxiao Bi <junxiao...@oracle.com>
> Cc: Sunil Mushran <sunil.mush...@gmail.com>
> Cc: Srinivas Eeda <srinivas.e...@oracle.com>
> Cc: <sta...@vger.kernel.org>
> ---
>  fs/ocfs2/dlm/dlmrecovery.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 7035af0..c2dd258 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -1750,13 +1750,13 @@ static int dlm_process_recovery_data(struct dlm_ctxt *dlm,
>  					     struct dlm_migratable_lockres *mres)
>  {
>  	struct dlm_migratable_lock *ml;
> -	struct list_head *queue;
> +	struct list_head *queue, *iter;
>  	struct list_head *tmpq = NULL;
>  	struct dlm_lock *newlock = NULL;
>  	struct dlm_lockstatus *lksb = NULL;
>  	int ret = 0;
>  	int i, j, bad;
> -	struct dlm_lock *lock = NULL;
> +	struct dlm_lock *lock;
>  	u8 from = O2NM_MAX_NODES;
>  	unsigned int added = 0;
>  	__be64 c;
> @@ -1791,14 +1791,16 @@ static int dlm_process_recovery_data(struct dlm_ctxt *dlm,
>  		/* MIGRATION ONLY! */
>  		BUG_ON(!(mres->flags & DLM_MRES_MIGRATION));
>
> +		lock = NULL;
>  		spin_lock(&res->spinlock);
>  		for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
>  			tmpq = dlm_list_idx_to_ptr(res, j);
> -			list_for_each_entry(lock, tmpq, list) {
> -				if (lock->ml.cookie != ml->cookie)
> -					lock = NULL;
> -				else
> +			list_for_each(iter, tmpq) {
> +				lock = list_entry(iter,
> +						  struct dlm_lock, list);
> +				if (lock->ml.cookie == ml->cookie)
>  					break;
> +				lock = NULL;
>  			}
>  			if (lock)
>  				break;

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel