On 2016-7-11 10:07, Joseph Qi wrote:
> On 2016/7/10 18:03, piaojun wrote:
>> We found a BUG() that triggers when a lockres is migrated during deref,
>> as described below. To fix it, we can purge the lockres directly when
>> the other node says it does not have a ref. Additionally, we should purge
>> the lockres if the master goes down, as no one will respond with deref done.
>>
>> Node 1                  Node 2(old master)             Node3(new master)
>> dlm_purge_lockres
>> send deref to N2
>>
>>                         leave domain
>>                         migrate lockres to N3
>>                                                        finish migration
>>                                                        send do assert
>>                                                        master to N1
>>
>> receive do assert msg
>> from N3, but cannot
>> find lockres because
>> DROPPING_REF is set,
>> so the owner is still
>> N2.
>>
>>                         receive deref from N1
>>                         and respond with -EINVAL
>>                         because lockres is migrated
>>
>> BUG when receiving -EINVAL
>> in dlm_drop_lockres_ref
>>
>> Fixes: 842b90b62461d ("ocfs2/dlm: return in progress if master can not clear the refmap bit...")
>> Signed-off-by: Jun Piao <piao...@huawei.com>
> Please use the full patch title.
> The rest looks good.
> 
> Thanks,
> Joseph
> 
Good suggestion. I will fix this in the upcoming [PATCH v2].
Thanks,
Jun Piao
>> ---
>>  fs/ocfs2/dlm/dlmmaster.c |  9 ++++++---
>>  fs/ocfs2/dlm/dlmthread.c | 13 +++++++++++--
>>  2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>> index f72e7ae..8c84641 100644
>> --- a/fs/ocfs2/dlm/dlmmaster.c
>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>> @@ -2276,9 +2276,12 @@ int dlm_drop_lockres_ref(struct dlm_ctxt *dlm, struct dlm_lock_resource *res)
>>              mlog(ML_ERROR, "%s: res %.*s, DEREF to node %u got %d\n",
>>                   dlm->name, namelen, lockname, res->owner, r);
>>              dlm_print_one_lock_resource(res);
>> -            BUG();
>> -    }
>> -    return ret ? ret : r;
>> +            if (r == -ENOMEM)
>> +                    BUG();
>> +    } else
>> +            ret = r;
>> +
>> +    return ret;
>>  }
>>  
>>  int dlm_deref_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
>> diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
>> index 68d239b..ce39722 100644
>> --- a/fs/ocfs2/dlm/dlmthread.c
>> +++ b/fs/ocfs2/dlm/dlmthread.c
>> @@ -175,6 +175,15 @@ static void dlm_purge_lockres(struct dlm_ctxt *dlm,
>>           res->lockname.len, res->lockname.name, master);
>>  
>>      if (!master) {
>> +            if (res->state & DLM_LOCK_RES_DROPPING_REF) {
>> +                    mlog(ML_NOTICE, "%s: res %.*s already in "
>> +                            "DLM_LOCK_RES_DROPPING_REF state\n",
>> +                            dlm->name, res->lockname.len,
>> +                            res->lockname.name);
>> +                    spin_unlock(&res->spinlock);
>> +                    return;
>> +            }
>> +
>>              res->state |= DLM_LOCK_RES_DROPPING_REF;
>>              /* drop spinlock...  retake below */
>>              spin_unlock(&res->spinlock);
>> @@ -203,8 +212,8 @@ static void dlm_purge_lockres(struct dlm_ctxt *dlm,
>>              dlm->purge_count--;
>>      }
>>  
>> -    if (!master && ret != 0) {
>> -            mlog(0, "%s: deref %.*s in progress or master goes down\n",
>> +    if (!master && ret == DLM_DEREF_RESPONSE_INPROG) {
>> +            mlog(0, "%s: deref %.*s in progress\n",
>>                      dlm->name, res->lockname.len, res->lockname.name);
>>              spin_unlock(&res->spinlock);
>>              return;
>>
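
For readers following the diff above, the intended return-value handling can be modeled in userspace roughly as follows. This is only a sketch, not the kernel code: the `model_*` names and `MODEL_DEREF_INPROG` are hypothetical stand-ins, the value 1 is illustrative, `abort()` stands in for BUG(), and it assumes the unchanged code before the hunk checks the send result first.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for DLM_DEREF_RESPONSE_INPROG; the value is illustrative. */
#define MODEL_DEREF_INPROG 1

/*
 * Models the patched tail of dlm_drop_lockres_ref().
 * send_ret: result of sending the deref message (0, or a negative errno
 *           when the master cannot be reached).
 * r:        status returned by the master's deref handler.
 */
static int model_drop_lockres_ref(int send_ret, int r)
{
	int ret = send_ret;

	/* Assumption of this model: the send result is never positive. */
	assert(send_ret <= 0);

	if (ret < 0)
		return ret;	/* send failed, e.g. the master went down */

	if (r < 0) {
		if (r == -ENOMEM)
			abort();	/* stands in for BUG() */
		return ret;	/* e.g. -EINVAL after migration; ret is 0 here */
	}

	return r;	/* 0 (done) or MODEL_DEREF_INPROG */
}

/* Models the patched check in dlm_purge_lockres() on a non-master node. */
static int model_should_purge_now(int ret)
{
	return ret != MODEL_DEREF_INPROG;
}
```

Under this model, an -EINVAL reply or a failed send both lead to an immediate purge, and only an in-progress reply defers the purge until deref done arrives.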

