On 27 Oct 2009, at 13:51, Rainer Toebbicke wrote:

Simon Wilkinson schrieb:

All of that is a long winded way of saying I don't really know what's causing your issue. One key diagnostic question is whether the cache manager continues to operate once it's run out of buffers. If we have a reference count imbalance somewhere, then the machine will never recover, and will report a lack of buffers for every operation it performs. If the cache manager does recover, then it may just mean that we need to look at either having a larger number of buffers, or making our buffer allocation dynamic. Both should be pretty straightforward, for Linux at least.
What happens to your clients once they've hit the error?

In two cases AFS continued to work. In two others, however, all AFS access now stops after /afs, and eventually the looong lines of 'all buffers lockedall buffers lockedall buffers locked' (you could add a "\n" to your patch while you're at it) appear in the syslog.

It wouldn't surprise me if some codepaths tie themselves in knots when DRead returns NULL - it's a rare enough occurrence (and one which used to just panic, rather than printing the warning message) that it's probably not been widely examined. The two that never manage to free their lockers are interesting, though - can you get cmdebug and alt-sysrq-t output from them while they're stuck? (If you could send that privately, or to RT, rather than to the list.)
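For illustration, the stuck cases smell like a path that takes a buffer reference (or lock) and never drops it on an early error return. A minimal user-space sketch of the balanced pattern - all names here (dirbuf, buf_hold, buf_release, process_entry) are hypothetical, not actual OpenAFS APIs:

```c
#include <stddef.h>

struct dirbuf { int refcount; };

static struct dirbuf *buf_hold(struct dirbuf *b) { b->refcount++; return b; }
static void buf_release(struct dirbuf *b) { b->refcount--; }

/* Returns 0 on success, -1 on error.  The single exit point guarantees
 * the reference taken at the top is dropped on every path - an early
 * "return -1" before the release is exactly the kind of imbalance that
 * would leave a buffer locked forever. */
static int process_entry(struct dirbuf *b, int fail)
{
    int code = 0;

    buf_hold(b);
    if (fail) {
        code = -1;
        goto out;       /* error path still falls through to the release */
    }
    /* ... use the buffer ... */
out:
    buf_release(b);
    return code;
}
```

A path that bypasses the `out:` label leaves `refcount` permanently raised, which would match the "never recovers" symptom.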

I hopefully did add a \n, too.

I'll see if I can crank the 50 up an order of magnitude and track the increases. However, this *is* a stress test with about 100 parallel "jobs" per client, not yet necessarily a leak, and even 25 simultaneous "FindBlobs" aren't unthinkable.

I suspect that ultimately, we're going to need to make the buffer structures dynamically allocated, with some kind of high and low watermark system. Each buffer takes up slightly more than 2k of memory - so having a large number permanently allocated is a little anti-social on platforms with limited memory.
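A minimal user-space sketch of what such a watermark scheme might look like - the names (buf_get, buf_put, buf_pool_init) and the specific watermark values are illustrative assumptions, not OpenAFS's actual structures, and a real kernel version would need locking:

```c
#include <stdlib.h>

#define BUF_SIZE   2048   /* each buffer is a little over 2k, per above */
#define LOW_WATER  4      /* pre-fill the free list to this depth */
#define HIGH_WATER 16     /* never cache more than this many free buffers */

struct buf {
    struct buf *next;
    char data[BUF_SIZE];
};

static struct buf *free_list;
static int nfree;

/* Get a buffer: reuse a cached one if available, otherwise allocate
 * on demand instead of failing with "all buffers locked". */
static struct buf *buf_get(void)
{
    struct buf *b = free_list;
    if (b) {
        free_list = b->next;
        nfree--;
        return b;
    }
    return malloc(sizeof(struct buf));
}

/* Return a buffer: cache it up to HIGH_WATER, then give memory back
 * so idle clients don't pin a large allocation forever. */
static void buf_put(struct buf *b)
{
    if (nfree < HIGH_WATER) {
        b->next = free_list;
        free_list = b;
        nfree++;
    } else {
        free(b);
    }
}

/* Pre-fill the pool to the low watermark at startup. */
static void buf_pool_init(void)
{
    while (nfree < LOW_WATER)
        buf_put(malloc(sizeof(struct buf)));
}
```

The point of the two watermarks is that memory use tracks demand: bursts of 100 parallel jobs grow the pool, and anything above HIGH_WATER is returned to the system once the burst passes.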

S.

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
