Coly Li wrote:
> Because I am not familiar with the code yet, I thought this was an oops
> triggered by my first modification. Therefore, I chose to use a loop that
> did not trigger the oops.
>
> From your reply, it seems the kernel BUG in __dlm_lockres_drop_inflight_ref
> at dlmmaster.c:680 is another bug? I saw a patch named "ocfs2/dlm: Fix race
> in adding/removing lockres' to/from the tracking list". Is it the fix for
> this bug? If yes, I should learn how you resolved it ;)

The oops in __dlm_lockres_drop_inflight_ref() is different from the
tracking list oops. The two are unrelated.

The inflight_ref oops happened because the "fix" was not taking the ref,
so the count was zero at the time of the drop. And that was because the
fix was applied at the wrong location. Compare my first patch with the
final one and see where the inflight ref is taken.
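
To make the failure mode concrete, here is a minimal user-space sketch of
the take/drop pairing. It is not the ocfs2 code: the model_* names and the
struct are hypothetical, and the assumption is only that the drop path
treats a zero count as a bug. The boolean return stands in for the point
where the kernel would BUG.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the real ocfs2 lockres. */
struct model_lockres {
    unsigned int inflight_locks;  /* stand-in for the inflight ref count */
};

/* Take the inflight ref. This must run before any path that drops it. */
static void model_grab_inflight_ref(struct model_lockres *res)
{
    res->inflight_locks++;
}

/*
 * Drop the inflight ref. Returns false when the count is already zero,
 * which is where the kernel code would hit its BUG (assumption: the
 * drop path asserts a non-zero count).
 */
static bool model_drop_inflight_ref(struct model_lockres *res)
{
    if (res->inflight_locks == 0)
        return false;
    res->inflight_locks--;
    return true;
}
```

A drop without a preceding grab returns false in this model. That is the
situation the misplaced fix created: the drop ran on a path where the ref
had never been taken.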

The tracking list bug has always been there. It was exposed during
forked flock() testing as explained in the patch.

> Here is what I thought; please comment on my mistake.
> The dlm associated with a lockres is protected by dlm->spinlock. If we only
> protect the lockres with lockres->spinlock, there *might* be a possibility
> of dlm->node_num being modified somewhere. Since we have quite a few places
> that compare lockres->owner with dlm->node_num, I suspect that manipulating
> lockres->owner without protecting dlm->owner might be problematic.

dlm->node_num can never be modified. It is the node number which
is fixed for the life of the dlm domain (and more).
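
As a rough illustration of why the comparison needs only the lockres lock,
here is a user-space sketch. The model_* types and functions are
hypothetical, and a pthread mutex stands in for lockres->spinlock. The
node number is written once at creation and only read afterwards, so the
only mutable operand in the comparison is the owner.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of the domain: node_num is fixed at creation. */
struct model_dlm {
    const unsigned char node_num;  /* never written after initialization */
};

/* Hypothetical model of a lock resource. */
struct model_lockres {
    pthread_mutex_t lock;   /* stand-in for lockres->spinlock */
    unsigned char owner;    /* mutable, protected by lock */
};

/* Change the owner under the lockres lock. */
static void model_set_owner(struct model_lockres *res, unsigned char owner)
{
    pthread_mutex_lock(&res->lock);
    res->owner = owner;
    pthread_mutex_unlock(&res->lock);
}

/*
 * Compare owner with node_num. Only res->lock is taken: node_num cannot
 * change, so reading it needs no domain-level lock.
 */
static bool model_res_is_local(const struct model_dlm *dlm,
                               struct model_lockres *res)
{
    bool local;

    pthread_mutex_lock(&res->lock);
    local = (res->owner == dlm->node_num);
    pthread_mutex_unlock(&res->lock);
    return local;
}
```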

Sunil

