On Mon, 13 Feb 2012 12:31:53 -0600 John Tang Boyland <[email protected]> wrote:
> This cell has about 40 students on it accessing files on three servers
> using their laptops which probably have firewalls causing them to
> ignore callback requests. Unless the OpenAFS installation process
> opens up 7001 to outside access

What platform are the clients? The Windows installation process I
believe does indeed (try to) open 7001.

> there's basically nothing I can do about this bad behavior.

Well, "nothing" is a bit far. On the fileserver-side, you can restrict
access to those files to only people who use non-broken clients. You can
block clients by IP at the network level if you know what IP the
requests are coming from that do not respond to callback breaks.

From a more general standpoint, you can fix the clients. If they're
ignoring callback breaks, they are going to continue to cause problems
like this, and the clients themselves are likely to see stale/incorrect
data.

> My guess is that the server's threads all get used up waiting for
> callback breaks to be ack'ed and so the fileserver stops responding.
> But is there something more I can do to find out why the freeze is
> happening? Is there some rxdebug command that I can run when a freeze
> happens?

I'm a little confused by this... by 'freeze' you mean everything on the
server is inaccessible? Or just that the write takes over 30 seconds to
do anything, and requests to the same file stall? If the latter, there's
not much you can do about that; we must break callbacks before the write
completes, and if someone is not responding to a callback break, we need
to wait some seconds to ensure we've tried hard enough to inform them.

To see if you're running out of threads, running
'rxdebug <fileserver> -noconn' will tell you how many threads are idle
and how many requests are waiting for a free thread. If you want to see
in general what the fileserver is blocked on, you can look at a core of
the fileserver process. However, if you just think that it's a host
ignoring callback breaks... that seems pretty likely to be all that it
is.

> Is there a simple solution -- like tuning a parameter (more threads?)
> that could make this behavior less common?

If you're using -L or '-p 128', the threads are already the highest they
can go for a 1.4 fileserver. For a 1.6 fileserver you can go to... 256,
was it? But that's not going to help if the problem is unrelated to
running out of threads.

--
Andrew Deason
[email protected]

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
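
As an illustration of the rxdebug thread check described above, here is
a minimal Python sketch that pulls the two relevant numbers out of the
command's output. It assumes rxdebug prints summary lines of the form
"N calls waiting for a thread" and "N threads are idle", and that the
fileserver listens on port 7000; the helper names parse_thread_stats and
query_fileserver are made up for this example and are not part of
OpenAFS.

```python
import re
import subprocess

# Assumed wording of the two summary lines printed by rxdebug.
WAITING_RE = re.compile(r"^(\d+) calls waiting for a thread", re.MULTILINE)
IDLE_RE = re.compile(r"^(\d+) threads are idle", re.MULTILINE)


def parse_thread_stats(output):
    """Return (calls_waiting, idle_threads); None for any line not found."""
    waiting = WAITING_RE.search(output)
    idle = IDLE_RE.search(output)
    return (
        int(waiting.group(1)) if waiting else None,
        int(idle.group(1)) if idle else None,
    )


def query_fileserver(host, port=7000):
    """Run rxdebug against one fileserver and parse its thread summary.

    Port 7000 is assumed here as the usual fileserver port.
    """
    result = subprocess.run(
        ["rxdebug", host, str(port), "-noconn"],
        capture_output=True,
        text=True,
        timeout=30,
        check=True,
    )
    return parse_thread_stats(result.stdout)
```

Calling query_fileserver("<fileserver>") during a freeze and seeing zero
idle threads together with a nonzero count of waiting calls would be
consistent with thread exhaustion. Plenty of idle threads would point
elsewhere, such as a single write stalled on a callback break.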
