Coly Li wrote: > Because I am not familiar with the code yet, I though this is an oops > triggered by my first > modification. Therefore, I choose to use a loop which did not trigger the > oops. > > From your reply, it seems kernel BUG in __dlm_lockres_drop_inflight_ref at > dlmmaster.c:680 is > another bug ? I saw a patch named "ocfs2/dlm: Fix race in adding/removing > lockres' to/from the > tracking list", is it the fix for this bug ? If yes, I should learn how you > resolve it ;)
The oops in __dlm_lockres_drop_inflight_ref() is different that the tracking list oops. No relationship. The inflight_ref oops is because the "fix" was not taking the ref. Hence it was zero during the drop. And that was because the "patch fix" was at the wrong location. See the diff between my first patch and the final one and see where the inflight ref is taken. The tracking list bug has always been there. It was exposed during forked flock() testing as explained in the patch. > Here is how I thought, please comments on my mistake, > The dlm associated with lockres is projected by dlm->spinlock, if we only > protect lockres by > lockres->spinlock, there *might* be possibility to modify dlm->node_num > somewhere. Since we have > quite a few places to compare lockres->owner with dlm->node_num, I suspect > that manipulating on > lockres->owner without protecting dlm->owner might be problematic. dlm->node_num can never be modified. It is the node number which is fixed for the life of the dlm domain (and more). Sunil _______________________________________________ Ocfs2-devel mailing list [email protected] http://oss.oracle.com/mailman/listinfo/ocfs2-devel
