Simon Wilkinson wrote:
All of that is a long-winded way of saying I don't really know what's
causing your issue. One key diagnostic question is whether the cache
manager continues to operate once it's run out of buffers. If we have a
reference count imbalance somewhere, then the machine will never
recover, and will report a lack of buffers for every operation it
performs. If the cache manager does recover, then it may just mean that
we need to look at either having a larger number of buffers, or making
our buffer allocation dynamic. Both should be pretty straightforward,
for Linux at least.
What happens to your clients once they've hit the error?
In two cases AFS continued to work. In two others, however, all AFS access now
stops after /afs, and eventually the looong lines with 'all buffers lockedall
buffers lockedall buffers locked' (you could add a "\n" to your patch while
you're at it) appear in the syslog.
I'll see if I can crank the 50 up by an order of magnitude and track the
increases. However, this *is* a stress test with about 100 parallel "jobs" per
client, so it is not necessarily a leak yet, and even 25 simultaneous "FindBlobs"
aren't unthinkable.
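
To make the leak-versus-exhaustion distinction concrete, here is a minimal
sketch in C of a fixed pool of reference-counted buffers. It is not the
OpenAFS code: the names buf_get, buf_put, buf_in_use and NBUF are made up for
illustration, and the pool size of 50 is only assumed to correspond to the
"50" mentioned above. With balanced get/put calls the pool recovers once the
load drops; a single missing put would leave a buffer held forever.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 50  /* assumed pool size, for illustration only */

struct buf {
    int refcount;   /* 0 means the buffer is free to be reused */
};

static struct buf pool[NBUF];

/* Return a free buffer with one reference held, or NULL if all are in use. */
static struct buf *buf_get(void)
{
    for (size_t i = 0; i < NBUF; i++) {
        if (pool[i].refcount == 0) {
            pool[i].refcount = 1;
            return &pool[i];
        }
    }
    return NULL;    /* the "all buffers locked" situation */
}

/* Drop one reference; the buffer becomes reusable when the count reaches 0. */
static void buf_put(struct buf *b)
{
    assert(b->refcount > 0);
    b->refcount--;
}

/* Count the buffers that are currently held by someone. */
static int buf_in_use(void)
{
    int n = 0;
    for (size_t i = 0; i < NBUF; i++)
        if (pool[i].refcount > 0)
            n++;
    return n;
}
```

If buf_in_use() drops back towards zero after the stress test stops, the pool
was merely too small for the load; if it stays pinned at NBUF, some path is
not releasing its reference.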
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155