[Linux-HA] Antw: limited usefulness of ocf_take_lock()

Ulrich Windl Mon, 28 Nov 2011 02:10:06 -0800

Hi!

Here is a locking sample (potential replacement functions) that seems to work.
Start the script several times as background processes and watch the output:
-----------snip locks.sh --------------------
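#!/bin/bash
# All lock operations are serialized through flock(1) on this master file.
MASTERLOCKFILE=/tmp/blabla

# lock FILE: take the lock by storing our PID in FILE.
# Fails if FILE names a process that is still alive.
lock() {
    (flock -e 123 &&
        if [ -e "$1" ]; then
            if ! kill -0 "$(<"$1")" > /dev/null 2>&1; then
                # stale lock: the recorded owner is gone, take over
                echo $$ > "$1"
            else
                false
            fi
        else
            echo $$ > "$1"
        fi) 123> "$MASTERLOCKFILE"
}

# unlock FILE: remove FILE (no ownership check); fails if it does not exist.
unlock() {
    (flock -e 124 && test -e "$1" && rm "$1") 124> "$MASTERLOCKFILE"
}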

# "application"
while true
do
    while ! lock /tmp/foobar; do
        echo "waiting for lock $$"
        sleep 0.2
    done
    echo "lock OK $$"
    sleep 0.1
    if unlock /tmp/foobar; then
        echo "unlock OK $$"
    else
        echo "unlock FAIL $$"
    fi
    sleep 0.1
done
---------------------snip--------------------
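
The three cases that lock() distinguishes (free, held by a live process, stale)
can also be checked in a single process. The following is only a sketch under
some assumptions: flock(1) from util-linux is installed, the paths are
throwaway temporary files, file descriptor 9 replaces 123/124 so that shells
limited to single-digit descriptors can run it, and cat replaces $(<file).

```shell
# Sketch: exercise lock()/unlock() in one process (assumes flock(1) exists).
dir=$(mktemp -d)               # throwaway directory, hypothetical paths
MASTERLOCKFILE=$dir/master
LOCK=$dir/foobar

lock() {
    (flock -e 9 &&
        if [ -e "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null; then
            false              # recorded owner is alive: lock is busy
        else
            echo $$ > "$1"     # free or stale: record our PID
        fi) 9> "$MASTERLOCKFILE"
}

unlock() {
    (flock -e 9 && test -e "$1" && rm "$1") 9> "$MASTERLOCKFILE"
}

lock "$LOCK" && first=taken || first=busy       # lock file is absent
lock "$LOCK" && second=taken || second=busy     # owner ($$) is alive

sleep 0 & dead=$!; wait "$dead"                 # PID of a finished process
echo "$dead" > "$LOCK"                          # fake a stale lock
lock "$LOCK" && stale=taken || stale=busy
[ "$(cat "$LOCK")" = "$$" ] && owner=self || owner=other

unlock "$LOCK" && rel1=ok || rel1=fail
unlock "$LOCK" && rel2=ok || rel2=fail          # nothing left to remove
echo "$first $second $stale $owner $rel1 $rel2"
rm -rf "$dir"
```

Note that the lock is not re-entrant: a second lock call from the same process
is refused, because the PID stored in the file is still alive.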

Regards,
Ulrich


>>> "Ulrich Windl" <[email protected]> wrote on 28.11.2011 at
10:07 in message <[email protected]>:
> Hi!
> 
> I was asked to work around a kernel bug by adding locks to my RA. 
> Reading the docs, I found that this is supposed to be done via
> 
> ocf_take_lock $LOCKFILE
> and
> ocf_release_lock_on_exit $LOCKFILE
> 
> Out of curiosity I inspected the implementation in SLES11 SP1. To me the 
> functions seem improperly implemented (unless I'm wrong) because:
> 
> 1) you can have only one lock per RA, no matter what $LOCKFILE you provide. 
> This is because the lock is actually not $LOCKFILE itself, but the process 
> ID of the shell.
> 
> 2) the implementation does not guarantee mutual exclusion:
> 
> ocf_pidfile_status() is used to query for an unowned lock. ocf_take_lock() 
> in turn waits until either the specified lockfile does not exist, or the 
> process whose PID is in the lockfile has vanished.
> 
> Then the PID of the RA's shell is written into the lockfile. As can be seen, 
> multiple processes can do that at the same time if no lock exists, because 
> the check and the write are not one atomic step.
> 
> If you had parallel execution of RAs before, you'll have parallel execution 
> even with those "locks".
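
The race described above can be made visible with a simplified model. This is
a sketch only and NOT the actual ocf-shellfuncs code: lock_is_free, claim_lock,
the path and the PIDs 1001/1002 are hypothetical names and values chosen for
illustration, and the two callers are interleaved by hand instead of running
concurrently.

```shell
# Simplified model of a check-then-write lock (not the real OCF functions).
dir=$(mktemp -d)
LOCKFILE=$dir/demo.lock            # hypothetical path

lock_is_free() {                   # step 1: no lock file, or its PID is dead
    [ ! -e "$1" ] || ! kill -0 "$(cat "$1")" 2>/dev/null
}
claim_lock() {                     # step 2: write a PID into the lock file
    echo "$2" > "$1"
}

# Callers A and B both run step 1 before either of them runs step 2.
lock_is_free "$LOCKFILE" && a_sees=free || a_sees=busy
lock_is_free "$LOCKFILE" && b_sees=free || b_sees=busy
claim_lock "$LOCKFILE" 1001        # A writes its (made-up) PID
claim_lock "$LOCKFILE" 1002        # B overwrites it without noticing A
final_owner=$(cat "$LOCKFILE")
echo "A saw: $a_sees, B saw: $b_sees, file now names: $final_owner"
rm -rf "$dir"
```

Both callers see a free lock and both proceed, so nothing is excluded. Closing
the gap requires the check and the write to happen under one atomic operation.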
> 
> Finally, you can only release the lock using ocf_release_lock_on_exit(). 
> Unfortunately that function will only release the last lock passed to it, 
> as "trap" does not accumulate the commands you give to it.
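
The trap behaviour mentioned above is easy to reproduce. The sketch below runs
two child shells and only echoes what would be released; the file names are
made up, and the release_all helper in the second part is a hypothetical
workaround, not something the OCF functions provide.

```shell
# Part 1: a second "trap ... EXIT" replaces the first handler.
out=$(sh <<'EOF'
trap 'echo "release /tmp/lock.A"' EXIT
trap 'echo "release /tmp/lock.B"' EXIT
EOF
)
echo "$out"

# Part 2: one possible workaround, a single handler walking a list.
out2=$(sh <<'EOF'
held=""
release_all() { for f in $held; do echo "release $f"; done; }
trap release_all EXIT
held="$held /tmp/lock.A"
held="$held /tmp/lock.B"
EOF
)
echo "$out2"
```

In the first part only the handler for lock.B runs at exit. In the second part
the single handler sees both names, because the list is read when the shell
exits rather than when the trap is installed.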
> 
> Maybe an approach using flock(1) instead might be better (untested, just 
> from reading the docs):
> 
> lock() {
> (flock -e 123; test -e $LOCKFILE || touch $LOCKFILE) 123> $MASTERLOCKFILE
> }
> 
> unlock() {
> (flock -e 124; test -e $LOCKFILE && rm $LOCKFILE) 124> $MASTERLOCKFILE
> }
> 
> Regards,
> Ulrich
> 
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected] 
> http://lists.linux-ha.org/mailman/listinfo/linux-ha 
> See also: http://linux-ha.org/ReportingProblems 
> 

 
 

