Darren New wrote:
> 1) Negotiate who is the master:
> 1A) If "Elected" exists, copy its contents to the file named "10.0.0.5"
> and go to phase 3 if it matches my IP address, phase 2 if it doesn't.
> 1B) If "Nominated" exists, copy its contents to the file named
> "10.0.0.5" and go to 1D.
> 1C) Write the string "10.0.0.5" into "10.0.0.5" and into "Nominated".
> (Here, we don't think either of those files exist, so we think we're the
> first to come online. Yes, this is a race condition. See 1F below.)
First, if two machines write to "Nominated" simultaneously, "Nominated"
can get corrupted.
You assume that writing "Nominated" has atomic semantics. You also
assume that "Nominated" has a write barrier which guarantees the
ordering of steps 1C and 1D. You also assume the writes have a known,
fixed ordering on all machines. None of that is guaranteed, so the
following sequence can occur:
    Machine A            Machine B
    Write A to Nom       Write B to Nom
    Write B to Nom       Write A to Nom
    Read Nom, see B      Read Nom, see A
Nothing prevents this without locking. The same pattern can also keep
repeating unless you have something like randomized exponential
backoff.
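For what it's worth, the backoff idea above could look roughly like the
Python sketch below. The names backoff_delays, retry_with_backoff, and
try_once, as well as the base and cap values, are made up for
illustration and are not part of the proposed protocol. Backoff only
makes a repeat collision less likely; it does not replace locking.

```python
import random
import time


def backoff_delays(attempts, base=0.05, cap=5.0, rng=random):
    """Yield one randomized delay (in seconds) per retry attempt."""
    for attempt in range(attempts):
        # The window doubles on every attempt, up to the cap.
        window = min(cap, base * (2 ** attempt))
        # Pick a random point inside the window so two machines that
        # collided once are unlikely to collide again in lockstep.
        yield rng.uniform(0, window)


def retry_with_backoff(try_once, attempts=8, sleep=None):
    """Call try_once() until it returns True or the attempts run out."""
    sleep = sleep or time.sleep
    for delay in backoff_delays(attempts):
        if try_once():
            return True
        sleep(delay)
    return False
```

Here try_once would be whatever hypothetical function performs one
nominate-and-verify round and returns True if this machine's view is
consistent.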
> Obviously, test-and-set is easier when you're talking about local memory
> and stuff. But there's no test-and-set over (say) NFS, and hence locking
> is messy there.
Quit spouting wrong information! Please go read NFS Illustrated. NFS
locking works *JUST FINE*. Only flock() in Linux sucks, quoting the
Linux flock man page:
    NOTES
        flock(2) does not lock files over NFS. Use fcntl(2) instead: that
        does work over NFS, given a sufficiently recent version of Linux
        and a server which supports locking.
So locking does work in Linux. Also note that the BSD systems *DO NOT
HAVE THIS LIMITATION*. flock over NFS works just fine for BSD.
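As a rough illustration of the fcntl(2) route, here is a minimal Python
sketch of a locked test-and-set on "Nominated". Python's fcntl.lockf()
is a wrapper around the fcntl() locking calls. The function name
nominate and the 64-byte read size are my own assumptions, and whether
the lock is honored over NFS still depends on the client and server
supporting locking, as the man page says.

```python
import fcntl
import os


def nominate(path, my_ip):
    """Write my_ip into path only if it is still empty; return the winner."""
    # O_CREAT without O_TRUNC: never clobber what another machine wrote.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Take an exclusive lock via the fcntl() locking calls.
        # This blocks until the lock is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        current = os.read(fd, 64).decode().strip()
        if not current:
            # Nobody has been nominated yet, so claim it while locked.
            os.write(fd, my_ip.encode())
            os.fsync(fd)
            current = my_ip
        return current
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

Every machine calls nominate("Nominated", its_own_ip) and compares the
return value with its own address; the read and the write happen under
the same lock, so step 1C can no longer interleave as shown earlier.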
Don't blame NFS for the fact that the Linux guys don't actually care to
get things right.
But, hey, Linux runs on more and cheaper hardware than BSD and that's
what's important.
Right?
-a
--
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg