Hi!
I was asked to work around a kernel bug by adding locks to my RA. Reading
the docs, I found that this is supposed to be done via
ocf_take_lock $LOCKFILE
and
ocf_release_lock_on_exit $LOCKFILE
Out of curiosity I inspected the implementation in SLES11 SP1. To me the
functions look improperly implemented (unless I'm wrong) because:
1) You can have only one lock per RA, no matter what $LOCKFILE you provide.
This is because the lock is actually not $LOCKFILE itself but the process ID
of the shell.
2) The implementation does not guarantee mutual exclusion:
ocf_pidfile_status() is used to query for an unowned lock. ocf_take_lock() in
turn waits until either the specified lockfile does not exist or the PID in
the lockfile has vanished.
Then the PID of the RA's shell is written into the lockfile. Checking and
writing are separate steps, so multiple processes can pass the check and
write their PID if no lock exists.
If you had parallel execution of RAs before, you'll have parallel execution
even with those "locks".
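
To make point 2 concrete, here is a small sketch of the check-then-write pattern with the two steps interleaved by hand. It is not the actual ocf-shellfuncs code; the temporary path and the PIDs 1001 and 1002 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical illustration of a check-then-write lock; NOT the real
# ocf_take_lock() code. The PIDs 1001/1002 and the path are invented.
dir=$(mktemp -d)
lockfile="$dir/demo.lock"

# Step 1: both contenders check for the lockfile before either one writes.
a_sees_free=no
b_sees_free=no
[ -e "$lockfile" ] || a_sees_free=yes
[ -e "$lockfile" ] || b_sees_free=yes

# Step 2: both write "their" PID; the second write overwrites the first.
[ "$a_sees_free" = yes ] && echo 1001 > "$lockfile"
[ "$b_sees_free" = yes ] && echo 1002 > "$lockfile"

owner=$(cat "$lockfile")
echo "A believes it took the lock: $a_sees_free"
echo "B believes it took the lock: $b_sees_free"
echo "PID recorded in the lockfile: $owner"
rm -rf "$dir"
```

Both contenders proceed although only one PID ends up in the file, which is why the check and the write would have to be a single atomic step.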
Finally, you can only release the lock using ocf_release_lock_on_exit().
Unfortunately that function will only release the last lock passed to it,
as "trap" does not accumulate the commands you give to it.
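
The trap behaviour can be checked in isolation. The sketch below first shows that a second "trap ... EXIT" replaces the first, then one possible way to accumulate releases behind a single trap. The helper names release_on_exit and run_releases are hypothetical and not part of the OCF shell functions, and the lock names A and B stand in for real lockfiles.

```shell
#!/bin/sh
# 1) A second trap on EXIT replaces the first one; only "B" is released.
replaced=$(sh -c 'trap "echo release A" EXIT; trap "echo release B" EXIT')
echo "with plain trap: $replaced"

# 2) Sketch: collect the names in a variable and install one handler that
#    walks the whole list. Assumes names without whitespace.
accumulated=$(sh -c '
    locks=""
    run_releases() { for l in $locks; do echo "release $l"; done; }
    release_on_exit() {            # hypothetical helper
        locks="$locks $1"
        trap run_releases EXIT
    }
    release_on_exit A
    release_on_exit B
')
echo "with accumulated list:"
echo "$accumulated"
```

In a real RA the echo would be replaced by whatever removes the lock, for example rm -f on the lockfile.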
Maybe an approach using flock(1) instead might be better (untested, just from
reading the docs):
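lock() {
    # Serialize on $MASTERLOCKFILE; fail if $LOCKFILE is already taken.
    # fd 9 is used because some shells only accept single-digit fds.
    (flock -e 9 && test ! -e "$LOCKFILE" && touch "$LOCKFILE") 9> "$MASTERLOCKFILE"
}
unlock() {
    (flock -e 9 && rm -f "$LOCKFILE") 9> "$MASTERLOCKFILE"
}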
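
A simpler variant would be to hold the flock itself for the whole critical section instead of using it to guard a second file. The following is only a sketch under assumptions: it needs flock(1) from util-linux, and the lock path and sleep times are arbitrary. It shows that a second contender is refused while the lock is held and succeeds once the holder has exited.

```shell
#!/bin/sh
# Sketch only: requires flock(1) from util-linux; the path is hypothetical.
dir=$(mktemp -d)
lock="$dir/ra.lock"

# Holder: takes an exclusive lock on fd 9, signals readiness, keeps the lock.
( flock -x 9; : > "$dir/ready"; sleep 3 ) 9> "$lock" &

# Wait until the holder really owns the lock.
while [ ! -e "$dir/ready" ]; do sleep 1; done

# Contender: -n makes flock fail immediately instead of blocking.
if ( flock -n 9 ) 9> "$lock"; then during=acquired; else during=refused; fi
echo "while held: $during"

# Once the holder exits, fd 9 is closed and the lock is released.
wait
if ( flock -n 9 ) 9> "$lock"; then after=acquired; else after=refused; fi
echo "after release: $after"
rm -rf "$dir"
```

Because the kernel drops the lock when the file descriptor is closed, a crashed RA cannot leave a stale lock behind, and no trap is needed for cleanup.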
Regards,
Ulrich