[Linux-HA] limited usefulness of ocf_take_lock()

Ulrich Windl Mon, 28 Nov 2011 01:08:20 -0800

Hi!

I was asked to work around a kernel bug by adding locks to my RA. Reading 
the docs, I found that this is supposed to be done via


ocf_take_lock $LOCKFILE
and
ocf_release_lock_on_exit $LOCKFILE

Out of curiosity I inspected the implementation in SLES11 SP1. Unless I'm 
mistaken, the functions are implemented incorrectly, because:

1) You can have only one lock per RA, no matter what $LOCKFILE you provide. 
This is because the lock is not really the $LOCKFILE itself, but the process 
ID of the shell.

2) the implementation does not guarantee mutual exclusion:

ocf_pidfile_status() is used to query for an unowned lock. ocf_take_lock() in 
turn waits until either the specified lockfile does not exist, or the process 
whose PID is stored in the lockfile has vanished.

Then the PID of the RA's shell is written into the lockfile. Nothing makes the 
check and the write one atomic step, so if no lock exists, multiple processes 
can pass the check and each write its PID.

If you had parallel execution of RAs before, you'll have parallel execution 
even with those "locks".
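
To illustrate the race described above, here is a minimal sketch of a "check, then write" lock. It is not the actual SLES11 code: naive_take_lock, the temporary directory and the added sleep (which only widens the window between check and write) are assumptions made for the demonstration.

```shell
#!/bin/sh
# Illustrative only: mimics a "check, then write" lock, not the real
# ocf-shellfuncs implementation. All names here are hypothetical.
workdir=$(mktemp -d)
LOCKFILE="$workdir/demo.lock"
RESULTS="$workdir/acquired"
: > "$RESULTS"

naive_take_lock() {
    # Step 1: test whether the lock looks free.
    if [ ! -e "$LOCKFILE" ]; then
        sleep 1                     # widen the gap between check and write
        # Step 2: claim the lock by writing an identifier into it.
        echo "$1" > "$LOCKFILE"
        echo "$1" >> "$RESULTS"     # this caller now believes it owns the lock
    fi
}

# Start two callers at (nearly) the same time and wait for both.
naive_take_lock A &
naive_take_lock B &
wait

acquired=$(wc -l < "$RESULTS" | tr -d ' ')
echo "callers that believe they hold the lock: $acquired"
rm -rf "$workdir"
```

With both callers started together, each normally passes the existence check before the other has written the file, so both report that they hold the lock.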

Finally, you can only release the lock using ocf_release_lock_on_exit(). 
Unfortunately that function will only release the last lock passed to it, 
as "trap" does not accumulate the commands you give to it.
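
The "trap" behaviour can be checked in isolation. The sketch below first installs two EXIT traps in a row and then shows one possible way to accumulate the files instead. The helpers release_on_exit and release_all and the variable LOCKS_TO_RELEASE are hypothetical, not part of the OCF shell functions, and the workaround assumes lock file names without whitespace.

```shell
#!/bin/sh
workdir=$(mktemp -d)
touch "$workdir/lock1" "$workdir/lock2" "$workdir/lock3" "$workdir/lock4"

# Two successive EXIT traps: the second one replaces the first.
(
    trap 'rm -f "$workdir/lock1"' EXIT
    trap 'rm -f "$workdir/lock2"' EXIT
    exit 0
)
[ -e "$workdir/lock1" ] && lock1_left=yes || lock1_left=no
[ -e "$workdir/lock2" ] && lock2_left=yes || lock2_left=no

# Hypothetical workaround: collect the names and install a single handler.
# Assumes the lock file names contain no whitespace.
release_all() { rm -f $LOCKS_TO_RELEASE; }
release_on_exit() {
    LOCKS_TO_RELEASE="$LOCKS_TO_RELEASE $1"
    trap release_all EXIT
}
(
    release_on_exit "$workdir/lock3"
    release_on_exit "$workdir/lock4"
    exit 0
)
[ -e "$workdir/lock3" ] && lock3_left=yes || lock3_left=no
[ -e "$workdir/lock4" ] && lock4_left=yes || lock4_left=no

echo "lock1 left behind: $lock1_left, lock2 left behind: $lock2_left"
echo "lock3 left behind: $lock3_left, lock4 left behind: $lock4_left"
rm -rf "$workdir"
```

In the first subshell only lock2 is removed, because the second "trap" call overwrote the handler for lock1. In the second subshell both files are removed, since the single handler works from the accumulated list.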

An approach using flock(1) might work better (untested, just from reading 
the docs):

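lock() {
# Succeeds only if $LOCKFILE did not exist yet; fails if it is already held.
(flock -e 123; ! test -e "$LOCKFILE" && touch "$LOCKFILE") 123> "$MASTERLOCKFILE"
}

unlock() {
# Removes $LOCKFILE under the master lock; fails if it was not there.
(flock -e 124; test -e "$LOCKFILE" && rm "$LOCKFILE") 124> "$MASTERLOCKFILE"
}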
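
A simpler variant, also only a sketch, is to hold the flock(1) lock itself for the whole critical section instead of using it to guard a second file. The wrapper with_lock and the function critical are hypothetical names. File descriptor 9 is used because some shells accept only single-digit descriptors in redirections. The example assumes flock(1) from util-linux is installed.

```shell
#!/bin/sh
workdir=$(mktemp -d)
LOCKFILE="$workdir/ra.lock"
LOG="$workdir/log"
: > "$LOG"

# Run a command while holding an exclusive lock on $LOCKFILE.
# The kernel drops the lock when the subshell exits and fd 9 is closed,
# so no explicit unlock or trap is needed.
with_lock() {
    (
        flock -x 9 || exit 1
        "$@"
    ) 9> "$LOCKFILE"
}

# Stand-in for the work that must not run in parallel.
critical() {
    echo "start $1" >> "$LOG"
    sleep 1
    echo "end $1" >> "$LOG"
}

with_lock critical A &
with_lock critical B &
wait

# If the sections were serialized, "start" and "end" strictly alternate.
order=$(cut -d' ' -f1 "$LOG" | tr '\n' ' ')
echo "event order: $order"
rm -rf "$workdir"
```

Because each caller blocks in flock until the other subshell has exited, the two critical sections cannot overlap, and a crashed holder does not leave a stale lock behind.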

Regards,
Ulrich


