On Wed, Aug 09, 2017 at 05:51:37AM +0000, tsutomu....@toshiba.co.jp wrote:
> If there is a lock resource conflict on multiple nodes, the lock on
> convert queue may not be granted forever.
> 
> EX.)
> grant queue:
>     node0 grmode NL / rqmode IV
>     node1 grmode NL / rqmode IV
> 
> convert queue:
>     node2 grmode NL / rqmode EX
>     node3 grmode PR / rqmode EX
> 
> wait queue:
>     node4 grmode IV / rqmode PR
>     node5 grmode IV / rqmode PR
> 
> When the lock conversion (node PR -> NL) of node 0 is completed, the lock
> of node 2 should be grantable. However, __can_be_granted() returns 0
> because the grmode of the lock on node 3 in convert queue is PR.
> 
> When checking the lock at the head of convert queue, exclude
> queue_conflict() targeting convert queue.

This example doesn't look right.  node2's NL->EX cannot be granted because
it conflicts with the PR lock held by node3.  (The grmode is still valid
when a lock is on the convert queue.)

There are two valid outcomes in the example above: either 1) node3's
PR->EX conversion is granted, or 2) the PR requests from node4 and node5
are granted.  What have you seen the dlm do in this state?  If it does not
grant anything, that would be a bug.

Given the sequence of events you describe, I think the correct outcome
is 1 (granting node3's PR->EX), according to this rule:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/dlm/lock.c#n2429
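
To make the mode reasoning above concrete, here is a minimal standalone
sketch of the compatibility check.  It is not the kernel code: the struct,
the sketch_* helpers and the three arrays are hypothetical simplifications
that keep only grmode/rqmode, and the table is the standard NL..EX
compatibility matrix.  The queues model the example after node0's
conversion to NL has completed.

```c
#include <assert.h>
#include <stddef.h>

/* Mode values as in the DLM API (IV = -1 ... EX = 5). */
enum { IV = -1, NL = 0, CR = 1, CW = 2, PR = 3, PW = 4, EX = 5 };

/* Hypothetical stand-in for struct dlm_lkb: only the two modes. */
struct lkb {
	int grmode;	/* mode currently held */
	int rqmode;	/* mode being requested */
};

/* Standard compatibility table: compat[granted][requested]. */
static const int compat[6][6] = {
	/*         NL CR CW PR PW EX */
	/* NL */ { 1, 1, 1, 1, 1, 1 },
	/* CR */ { 1, 1, 1, 1, 1, 0 },
	/* CW */ { 1, 1, 1, 0, 0, 0 },
	/* PR */ { 1, 1, 0, 1, 0, 0 },
	/* PW */ { 1, 1, 0, 0, 0, 0 },
	/* EX */ { 1, 0, 0, 0, 0, 0 },
};

static int sketch_modes_compat(int gr, int rq)
{
	assert(gr >= IV && gr <= EX && rq >= NL && rq <= EX);
	if (gr == IV)	/* nothing is held, so nothing can conflict */
		return 1;
	return compat[gr][rq];
}

/*
 * Return 1 if any other lock on the queue holds a mode that is
 * incompatible with the mode lkb is requesting, otherwise 0.
 */
static int sketch_queue_conflict(const struct lkb *q, size_t n,
				 const struct lkb *lkb)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (&q[i] == lkb)
			continue;	/* a lock never conflicts with itself */
		if (!sketch_modes_compat(q[i].grmode, lkb->rqmode))
			return 1;
	}
	return 0;
}

/* State from the example, after node0's conversion to NL is done. */
static const struct lkb grantq[]   = { { NL, IV }, { NL, IV } }; /* node0, node1 */
static const struct lkb convertq[] = { { NL, EX }, { PR, EX } }; /* node2, node3 */
static const struct lkb waitq[]    = { { IV, PR }, { IV, PR } }; /* node4, node5 */
```

With these arrays, sketch_queue_conflict(convertq, 2, &convertq[0])
returns 1, because node2's EX request conflicts with the PR that node3
still holds.  The same call for &convertq[1] returns 0, since node2 only
holds NL, and neither node3's EX request nor the PR requests on the wait
queue conflict with the NL locks on the grant queue.  Compatibility alone
therefore allows either outcome; which one the dlm picks depends on the
ordering rules in _can_be_granted(), which this sketch does not model.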


> -     if (queue_conflict(&r->res_convertqueue, lkb))
> +     if (!first_in_list(lkb, &r->res_convertqueue) &&
> +         queue_conflict(&r->res_convertqueue, lkb))
>               return 0;
