On Thu, 2 Dec 2010 13:56:07 -0500 (EST) "Thomas M. Payerle" <[email protected]> wrote:
> I am looking for a way to tune the timeout before failing over to
> another AFS server for replicated volumes, but cannot seem to find any
> suitable runtime parameters to tweak. Do any such parameters exist?

"Sort of". You can do this without recompiling on Linux clients, but you
need to set it at startup.

You can change the timeout by setting /proc/sys/afs/rx_deadtime, but you
would want to set it before the client starts; that is, after you load
the kernel module but before you run afsd. Otherwise this value only
takes effect when a server comes back up after being marked down (or
something like that; I forget the details).

This is also a rather coarse hammer, as this is a timeout value for all
RX network activity in the kernel.

> We have some replicated web servers serving data from replicated RO
> volumes. If one of the servers hosting one of those volumes goes
> down, httpds which were pointing to that server's copy of the volume
> seem to get badly wedged. I think it is because enough requests come
> in during the time it takes for AFS client on web host to realize the
> AFS server is down and move on to a replica that all available threads
> for apache are used, and apache just gets very unhappy.

If the situation never recovers while the fileserver is down, you also
_might_ be hitting a problem that is solved by
<http://gerrit.openafs.org/3339> and <http://gerrit.openafs.org/3340>.
But if you don't want to fiddle with trying patches, that's not really
helpful. Lowering rx_deadtime would also work around that issue, anyway.

-- 
Andrew Deason
[email protected]
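
For illustration, here is a minimal sketch of the step described above: writing a new value to /proc/sys/afs/rx_deadtime once the kernel module is loaded and before afsd runs. The helper name set_rx_deadtime, the example value 20, and the reading of the value as whole seconds are assumptions, not something confirmed in this thread. Treat it as a starting point rather than a tested procedure.

```python
from pathlib import Path

# Path named in the message; it should only exist once the OpenAFS
# kernel module has been loaded.
RX_DEADTIME = Path("/proc/sys/afs/rx_deadtime")


def set_rx_deadtime(value: int, path: Path = RX_DEADTIME) -> int:
    """Write a new RX dead time and return the value read back.

    Hypothetical helper: the unit is assumed to be seconds.
    """
    if value <= 0:
        raise ValueError("value must be a positive integer")
    if not path.exists():
        # Most likely the kernel module is not loaded yet.
        raise FileNotFoundError(f"{path} not found; is the kernel module loaded?")
    path.write_text(f"{value}\n")
    # Read the value back so the caller can confirm the kernel accepted it.
    return int(path.read_text().split()[0])


if __name__ == "__main__":
    # Needs root. Run between loading the module and starting afsd.
    print(set_rx_deadtime(20))
```

Something like this would be called from the client startup script, between the module load and the afsd invocation. Setting it later falls into the "only takes effect when a server comes back up" case mentioned above.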
