On Wed, Aug 09, 2017 at 05:51:37AM +0000, tsutomu....@toshiba.co.jp wrote:
> If there is a lock resource conflict on multiple nodes, the lock on
> convert queue may not be granted forever.
>
> EX.)
> grant queue:
> node0 grmode NL / rqmode IV
> node1 grmode NL / rqmode IV
>
> convert queue:
> node2 grmode NL / rqmode EX
> node3 grmode PR / rqmode EX
>
> wait queue:
> node4 grmode IV / rqmode PR
> node5 grmode IV / rqmode PR
>
> When the lock conversion (node PR -> NL) of node 0 is completed, the lock
> of node 2 should be grantable. However, __can_be_granted() returns 0
> because the grmode of the lock on node 3 in convert queue is PR.
>
> When checking the lock at the head of convert queue, exclude
> queue_conflict() targeting convert queue.

This example doesn't look right.  node2's NL->EX cannot be granted because
it conflicts with the PR lock held by node3.  (The grmode is still valid
when a lock is on the convert queue.)

There are two valid outcomes in the example above, either
1) node3 PR->EX is granted, or
2) node4 and node5 PR requests are granted.

What have you seen the dlm do in this state?  If it does not grant
anything, that would be a bug.

Based on the sequence of events you describe, I think that the correct
outcome would be 1 (granting node3's PR->EX), based on this rule:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/dlm/lock.c#n2429

> -	if (queue_conflict(&r->res_convertqueue, lkb))
> +	if (!first_in_list(lkb, &r->res_convertqueue) &&
> +	    queue_conflict(&r->res_convertqueue, lkb))
>  		return 0;
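
To make the conflict between node2 and node3 concrete, here is a rough
userspace sketch of the mode check.  It is not the kernel code: struct
toy_lock, the compat table and toy_queue_conflict() are names made up for
this illustration, and the table is the usual DLM mode compatibility
matrix as I understand it.  The only point is that a request is tested
against the grmode of every other lock on a queue.

```c
#include <stddef.h>

/* Lock modes in the usual DLM numbering (IV means "no mode"). */
enum { IV = -1, NL = 0, CR = 1, CW = 2, PR = 3, PW = 4, EX = 5 };

/* compat[held + 1][requested + 1] is 1 when the two modes can coexist. */
static const int compat[7][7] = {
	/*        IV NL CR CW PR PW EX */
	/* IV */ { 1, 1, 1, 1, 1, 1, 1 },
	/* NL */ { 1, 1, 1, 1, 1, 1, 1 },
	/* CR */ { 1, 1, 1, 1, 1, 1, 0 },
	/* CW */ { 1, 1, 1, 1, 0, 0, 0 },
	/* PR */ { 1, 1, 1, 0, 1, 0, 0 },
	/* PW */ { 1, 1, 1, 0, 0, 0, 0 },
	/* EX */ { 1, 1, 0, 0, 0, 0, 0 },
};

/* Hypothetical stand-in for a lock; only the fields used in the example. */
struct toy_lock {
	int node;
	int grmode;	/* mode currently held, still valid on the convert queue */
	int rqmode;	/* mode being requested */
};

/*
 * Return 1 if lkb's requested mode is incompatible with the granted mode
 * of any other lock on the queue q[0..n-1], else 0.
 */
static int toy_queue_conflict(const struct toy_lock *q, size_t n,
			      const struct toy_lock *lkb)
{
	for (size_t i = 0; i < n; i++) {
		if (&q[i] == lkb)
			continue;	/* a lock does not conflict with itself */
		if (!compat[q[i].grmode + 1][lkb->rqmode + 1])
			return 1;
	}
	return 0;
}
```

With the convert queue from the example ({node2, NL, EX} followed by
{node3, PR, EX}), this check reports a conflict for node2, because node3
still holds PR, and no conflict for node3, because node2 only holds NL.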