Jim Rees wrote:
>> The interesting thread will probably be the CheckHost thread.
>
> Maybe, and I've got a patch for this I'm testing now. But I think this is a
> different problem. The bug I'm chasing makes all the worker threads hang
> waiting for more space in the callback table. The server eventually
> recovers.
>
> The problem Christopher describes shows many calls waiting for a thread, and
> yet the pstack shows many threads waiting for a call. And the server never
> recovers. Looks like the worker threads aren't waking up, or aren't finding
> the calls when they do.
>
> The CheckHost loop does hog host locks, but only one at a time.

Yeah, the pstack output I have shows the CheckHost thread being idle at
the time, so it might not be that.
Well, if one of my servers yarfs tomorrow (which one probably will),
I'll have a core to examine from a fileserver and libraries built with
"-g", so we might be able to do a bit more research.
-rob
_______________________________________________
OpenAFS-devel mailing list
https://lists.openafs.org/mailman/listinfo/openafs-devel