On 27 Oct 2009, at 13:51, Rainer Toebbicke wrote:
> Simon Wilkinson wrote:
>> All of that is a long-winded way of saying I don't really know
>> what's causing your issue. One key diagnostic question is whether
>> the cache manager continues to operate once it's run out of
>> buffers. If we have a reference count imbalance somewhere, then the
>> machine will never recover, and will report a lack of buffers for
>> every operation it performs. If the cache manager does recover,
>> then it may just mean that we need to look at either having a
>> larger number of buffers, or making our buffer allocation dynamic.
>> Both should be pretty straightforward, for Linux at least.
>> What happens to your clients once they've hit the error?
> In two cases AFS continued to work. In two others, however, all AFS
> access now stops after /afs, and eventually the looong lines with
> 'all buffers lockedall buffers lockedall buffers locked' (you could
> add a "\n" to your patch while you're at it) appear in the syslog.
It wouldn't surprise me if some codepaths tie themselves in knots when
DRead returns NULL - it's a rare enough occurrence (and one which used
to just panic, rather than printing the warning message) that it's
probably not been widely examined. The two clients that never manage to
free their locked buffers are interesting, though - can you get cmdebug
and alt-sysrq-t output from them while they're stuck? (If you could
send that privately, or to RT, rather than the list.)
> I hopefully added a \n, too.
> I'll see if I can crank the 50 up an order of magnitude and track
> the increases. However, this *is* a stress test with about 100
> parallel "jobs" per client, not yet necessarily a leak, and even 25
> simultaneous "FindBlobs" aren't unthinkable.
I suspect that ultimately, we're going to need to make the buffer
structures dynamically allocated, with some kind of high and low
watermark system. Each buffer takes up slightly more than 2k of memory
- so having a large number permanently allocated is a little anti-
social on platforms with limited memory.
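To make the idea concrete, a dynamically sized pool with high and low watermarks could look roughly like the following user-space sketch. Everything here is invented for illustration: the names (dbuf_pool, dbuf_get, dbuf_put) and the 50/500 thresholds are not the OpenAFS API, and a kernel version would need locking and kmalloc rather than malloc.

```c
/* Sketch only: a buffer pool that grows on demand up to a high
 * watermark and shrinks back toward a low watermark on release.
 * Names and thresholds are hypothetical, not OpenAFS code. */
#include <stdlib.h>

#define DBUF_SIZE 2048   /* each buffer is slightly more than 2k, per the thread */
#define DBUF_LOW  50     /* keep at least this many cached (today's fixed count) */
#define DBUF_HIGH 500    /* never allocate more than this */

struct dbuf {
    struct dbuf *next;
    char data[DBUF_SIZE];
};

struct dbuf_pool {
    struct dbuf *free_list;  /* buffers ready for reuse */
    int nfree;               /* length of free_list */
    int ntotal;              /* buffers currently allocated */
};

/* Take a buffer: reuse a cached one, else allocate up to the high
 * watermark; NULL here is the "all buffers locked" case. */
static struct dbuf *dbuf_get(struct dbuf_pool *p)
{
    struct dbuf *b;
    if (p->free_list) {
        b = p->free_list;
        p->free_list = b->next;
        p->nfree--;
        return b;
    }
    if (p->ntotal >= DBUF_HIGH)
        return NULL;
    b = malloc(sizeof(*b));
    if (b)
        p->ntotal++;
    return b;
}

/* Return a buffer: once the cache holds the low-watermark count,
 * release surplus memory back to the system. */
static void dbuf_put(struct dbuf_pool *p, struct dbuf *b)
{
    if (p->ntotal > DBUF_LOW && p->nfree >= DBUF_LOW) {
        free(b);
        p->ntotal--;
    } else {
        b->next = p->free_list;
        p->free_list = b;
        p->nfree++;
    }
}
```

Under a stress test like the one above the pool would grow toward the high watermark instead of failing at a fixed 50, while an idle client would fall back to roughly 100k of buffer memory rather than pinning the worst-case amount permanently.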
S.
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info