salvage knowledge

Stephen Joyce Mon, 31 Jan 2011 09:18:07 -0800

On Mon, 31 Jan 2011, Steve Simmons wrote:

We have seen similar issues. It occurs when there is a given vice partition where lots of clients have registered callbacks but those clients are no longer accessible. Not all the clients have responded when the 1800 second timer goes off, and the fileserver goes down uncleanly.
We have about 235,000 volumes spread across 40 vice partitions. Our 'fix' is a combination of lengthening that timeout to 3600 seconds and keeping our vice partitions no larger than 2TB. Active partitions are spread roughly equally across those 40 partitions. But that's just a stopgap; the longer a server stays up, the more likely it accumulates dead callbacks.

Assuming this is true, isn't this a good argument to keep the weekly server process restarts?


Cheers,
Stephen
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info

Re: [OpenAFS] Re: Need volume state / fileserver / salvage knowledge
