Re: Is NFS Locking Reliable?

2009-03-11 Thread perryh
 Our NFS servers for user home directories are on FreeBSD (6.4),
 MacOSX (10.5), Linux (still 2.4 kernel) and Tru64-UNIX boxes; NFS
 clients are mostly Linux (2.6 kernel) and FreeBSD (6.4, 7.0, but
 w/o kernel lockd) systems.

I have seen problems with NFS locking even in completely homogeneous
environments.  With a mix like that, I would not trust it as far as
I could throw a Cray :)

 There are periods of several days without problems, but from time
 to time, on one, two, or several (but not all) clients application
 processes which use locking suddenly hang in kernel mode - namely
 firefox, opera, pine.

Lockups are probably the least of your concerns, at least where
pine is involved.  Dunno what sort of data firefox and opera are
protecting from race conditions, but I suppose pine is being used
for email.  Cases will arise wherein mail mysteriously disappears,
because the client and the delivery agent were both updating the
inbox at the same time.  Often there will be no noticeable symptoms,
except for users wondering what happened to that important message
they were supposed to have gotten (and which the MTA log shows was
in fact delivered).
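
To make the failure mode concrete, here is a minimal sketch of the lost update, assuming an mbox-style inbox that the mail client rewrites in full while the delivery agent appends to it. The file contents and the function name lost_update_demo are made up for illustration; the interleaving is forced by hand rather than raced, and no real pine or MTA code is involved.

```python
import os
import tempfile


def lost_update_demo():
    """Replay, step by step, a client and a delivery agent updating one inbox
    when the lock that should serialize them is not honoured."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("msg1\nmsg2\n")

        # The mail client reads the inbox into memory (say, to expunge msg1).
        with open(path) as f:
            client_view = f.read().splitlines()

        # Meanwhile the delivery agent appends a new message.
        with open(path, "a") as f:
            f.write("msg3\n")

        # The client writes back its stale view, silently discarding msg3.
        client_view.remove("msg1")
        with open(path, "w") as f:
            f.write("".join(line + "\n" for line in client_view))

        with open(path) as f:
            return f.read().splitlines()
    finally:
        os.remove(path)


print(lost_update_demo())
```

The delivery agent's append succeeded, so its log shows a delivery, yet msg3 is absent from the final inbox.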

Never export an inbox read/write if reliability of mail delivery is
needed.  Use IMAP instead.

 It seems to be no specific operating system problem - all
 combinations of clients and servers are involved.

I suspect the reason NFS locking is so troublesome is that it
presents problems which are fundamentally incomputable.  Prior
to restoration of communication, how can any automaton possibly
distinguish between

* a temporary loss of the communication link (but the peer is still
  running and the link will eventually be re-established), and

* the peer has crashed, and will eventually reboot?
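
A toy model of the two cases above, as a sketch only: the scenario functions and guess_peer_state are hypothetical names, and a "probe" stands for any request that either gets a reply or times out. It is not how lockd or statd is implemented; it only shows that both cases hand an observer the same evidence.

```python
from typing import Callable, List, Optional


def link_down_peer_alive(probe_no: int) -> Optional[str]:
    # The peer still holds its lock state, but no packet gets through.
    return None


def peer_crashed(probe_no: int) -> Optional[str]:
    # The peer has lost its lock state and is rebooting; again no reply.
    return None


def observations(scenario: Callable[[int], Optional[str]],
                 probes: int) -> List[Optional[str]]:
    """Everything a remote observer sees: one reply (or None on timeout) per probe."""
    return [scenario(n) for n in range(probes)]


def guess_peer_state(obs: List[Optional[str]]) -> str:
    """Any decision rule can depend only on the observations it is given."""
    return "unknown" if all(reply is None for reply in obs) else "alive"


same = observations(link_down_peer_alive, 3) == observations(peer_crashed, 3)
print(same, guess_peer_state(observations(peer_crashed, 3)))
```

Since the inputs are identical, no rule can answer differently for the two cases until communication is restored, yet the right action differs: keep the locks in one case, discard them in the other.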
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Is NFS Locking Reliable?

2009-03-10 Thread Konrad Heuer


I'd like to ask for your experiences with NFS locking in larger 
environments.


Our experience so far has not been satisfactory. Our NFS servers for user 
home directories are on FreeBSD (6.4), MacOSX (10.5), Linux (still 2.4 
kernel) and Tru64-UNIX boxes; NFS clients are mostly Linux (2.6 kernel) and 
FreeBSD (6.4, 7.0, but w/o kernel lockd) systems.


There are periods of several days without problems, but from time to time, 
on one, two, or several (but not all) clients, application processes that 
use locking suddenly hang in kernel mode - namely firefox, opera, and pine.


It does not seem to be specific to any one operating system - all 
combinations of clients and servers are involved.


There are some indications that our network may be causing problems, 
although not all IP subnets are protected by Cisco firewall modules. But 
there may be other circumstances that lead to sporadic packet loss or 
something similar ...


So, if anyone has similar or other experiences with NFS locking, I'd be 
very interested in reading about them!


Thank you very much in advance!

Konrad Heuer
GWDG, Am Fassberg, 37077 Goettingen, Germany, kheu...@gwdg.de


Re: Is NFS Locking Reliable?

2009-03-10 Thread Andrew Wright




On Tue, 10 Mar 2009, Konrad Heuer wrote:



I'd like to ask for your experiences with NFS locking in larger environments.

Our experiences are not so satisfying. Our NFS servers for user home


This matches my historical experience, especially if you add in
periodically wedged and ignored lock state.


First, it is useful to realize that locking over NFS has, until
version 4, been done outside of NFS itself.  That is, there
is a pair of daemon processes (usually called statd and lockd)
that negotiate the lock outside of the stateless mechanism that
is the NFS data access method up to v3.

My past v3 experience has been that only in the case where you have
exactly the same version of statd and lockd on both sides (on the
client and on the server) is it possible that you _may_ experience
truly reliable locking.  Note that this is only possible with the
same OS at the same revision/patch on both client and server.

NFS v4 is no longer stateless, and manages locks internally, which
I would guess makes things much better, though my experience
with mixed environments under v4 is much more limited.


What version of the NFS protocol are you using?  You can find this
out via /usr/sbin/nfsstat


If you are stuck with a v3 client, my recommendation would be to
turn locking off altogether for that client.  I have found that
this generally works better: the applications desiring the lock
are then at least aware that the lock won't work, rather than
being led up the garden path by a successful return from a call
to lockd that is later not honoured.
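
On the application side, "being aware that the lock won't work" could look roughly like the following sketch, which assumes POSIX advisory locks taken through Python's fcntl.lockf.  The helper name try_lock and the retry numbers are made up for illustration, and which error (if any) a given client returns when locking is turned off is platform-dependent, so the sketch simply lets anything other than ordinary contention reach the caller.

```python
import errno
import fcntl
import tempfile
import time


def try_lock(fd, attempts=5, delay=0.1):
    """Try to take an exclusive advisory lock without blocking.

    Returns True if the lock was obtained and False if it stayed contended.
    Any other error (for example ENOLCK) is re-raised so the caller learns
    that locking is unavailable instead of assuming it is protected.
    """
    for _ in range(attempts):
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as exc:
            if exc.errno not in (errno.EACCES, errno.EAGAIN):
                raise  # locking itself failed; do not pretend otherwise
            time.sleep(delay)  # held by someone else; wait and retry
    return False


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        print(try_lock(f.fileno()))
```

A non-blocking request only bounds the wait for a contended lock; the call may still stall in the kernel if the client is waiting on an unresponsive lockd, which is the hang described at the start of this thread.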

If upgrading all to v4 is possible, it is probably worth a try,
and good luck!


Andrew.
