Mark Fasheh wrote:
> On Tue, Dec 16, 2008 at 03:49:23PM -0800, Sunil Mushran wrote:
>   
>> dlm_get_lock_resource() is supposed to return a lock resource with a proper
>> master. If multiple concurrent threads attempt to lookup the lockres for the
>> same lockid while the lock mastery in underway, one or more threads are 
>> likely
>> to return a lockres without a proper master.
>>
>> This patch makes the threads wait in dlm_get_lock_resource() while the 
>> mastery
>> is underway, ensuring all threads return the lockres with a proper master.
>>
>> This issue is known to be limited to users using the flock() syscall. For all
>> other fs operations, the ocfs2 dlmglue layer serializes the dlm op for each
>> lockid.
>>
>> Patch fixes Novell bz#425491
>> https://bugzilla.novell.com/show_bug.cgi?id=425491
>>
>> Users encountering this bug will see flock() return EINVAL and dmesg have the
>> following error:
>> ERROR: Dlm error "DLM_BADARGS" while calling dlmlock on resource <LOCKID>: 
>> bad api args
>>
>> Reported-by: Coly Li <[email protected]>
>> Signed-off-by: Sunil Mushran <[email protected]>
>> ---
>>  fs/ocfs2/dlm/dlmmaster.c |    9 ++++++++-
>>  1 files changed, 8 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>> index cbf3abe..54e182a 100644
>> --- a/fs/ocfs2/dlm/dlmmaster.c
>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>> @@ -732,14 +732,21 @@ lookup:
>>      if (tmpres) {
>>              int dropping_ref = 0;
>>  
>> +            spin_unlock(&dlm->spinlock);
>> +
>>              spin_lock(&tmpres->spinlock);
>> +            /* We wait for the other thread that is mastering the resource 
>> */
>> +            if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
>> +                    __dlm_wait_on_lockres(tmpres);
>> +                    BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
>> +            }
>>     
>
> Do we have a ref on tmpres here, when we decide to wait on it? Could it get
> destroyed during mastery (for example, due to mastery ending in error) and
> we wake up with a bad pointer?
>   
Yes. __dlm_lookup_lockres_full() takes a reference.


_______________________________________________
Ocfs2-devel mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-devel

Reply via email to