Re: [OpenAFS] Re: Afs User volume servers in VM's

Booker Bense Wed, 26 Oct 2011 10:33:05 -0700

On Wed, 26 Oct 2011, Andrew Deason wrote:

On Wed, 26 Oct 2011 18:41:15 +0200
Stephan Wiesand <[email protected]> wrote:

Booker and me would probably be ok with errors being returned upon
access to a single volume that's being overwhelmed with I/O requests -
if it just wouldn't make the fileserver as a whole grind to a halt and
not service any request any more.


Well, see, it depends on _what_ is causing it to do that, as Jeffrey
said. If the threads are hanging on a lock somewhere in the host package
or Rx or something, this won't help a whole lot since we still have to
go through those layers and we'll still hang on those locks (same thing
for chewing up CPU, or moving memory around, etc). In fact, we'll do so
even more, since we (eventually) have to go through all that at least
twice for the VBUSY case.

The symptom we see is thread exhaustion due to write callbacks
from many clients for a single volume[1]. The problem is
insidious as it's not a gradual failure, because everything works
just fine until you hit a tipping point in the number of batch jobs.

It's often a file that the user isn't even aware they are
opening, but is a small file used by some library they are
using. Sometimes tracking down the file can take significant
effort.

- Booker C. Bense

[1]- I'm not the stuckee when this happens, just an interested
bystander so I may have the details slightly incorrect.


_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
