On Mon, 20 Jun 2011 16:19:01 -0700 Jonathan Nilsson <[email protected]> wrote:
> i suspect that the system kept spawning httpd processes as old ones
> got blocked and eventually it ran out of memory and became
> unresponsive. after a reboot it works fine. so the question is, what
> caused the afs cache manager to respond so slow?
>
> can anyone confirm if they have seen kernel messages like this? how
> can i confirm if the problem is with the client or the server? i see
> no error messages in BosLog, FileLog, or VolserLog on our servers...

If the processes were hanging forever or for a very long time, it's not
likely to be the fault of any server, since the client doesn't wait
around forever for a response. I assume there were no messages about
losing contact with file or vl servers in the client logs around that
time?

It's easier to diagnose this if we know what the rest of the system is
doing when it happens. If you ever catch it doing that, running
'echo t > /proc/sysrq-trigger' will generate a lot of info (some of it
useful) in syslog. Or if you can get the machine to dump core, that's
the most useful thing, but you don't want to just go giving that out to
anybody.

-- 
Andrew Deason
[email protected]

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
