On Thu, 15 Aug 2013 22:33:13 -0400
[email protected] wrote:

> But my question is: If this returns, how can I track down what is
> *causing* the calls-waiting value to climb? We had over 100
> workstations using AFS at the time, scattered all around campus. I
> did a variety of things to try and pinpoint the culprit, but didn't
> have much luck.

Dan's approach is good if you are just legitimately having too much AFS
activity and the fileserver can't keep up with it. Some alternative
approaches in the same area include looking at the audit log instead of
debug FileLog entries, examining wire dumps, or getting info out of
dtrace if you're on an applicable platform. Those all have slightly
different performance characteristics, but mostly people pick whichever
approach is most convenient for them.

The fileserver does record some other stats as well, but they're not
broken down per client/peer, so they're not as useful here. You can
look around for xstat_fs_test if you want some stats anyway. I believe
there are facilities in the code for breaking this down per-peer (the
rx peer/process stats), but I don't think we have anything to extract
the data.

If those approaches show very little activity, though, you probably
have a thread actually hanging on something else, so you won't see a
lot of traffic. (If you see very little disk, net, and cpu usage at the
same time, that seems pretty likely.) In that case you'd need to look
at a stack trace for the fileserver process, or ideally capture a core
to be examined later.

-- 
Andrew Deason
[email protected]
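
As a rough illustration of the wire-dump approach mentioned above, here is a minimal sketch that tallies packets per client from a text dump of a capture. The line shape it expects (tcpdump-style "IP src.port > dst.port:" text) is an assumption, and the names PACKET, count_clients, and top_clients are made up for this example; the only AFS-specific fact it relies on is that the fileserver listens on UDP port 7000. Adjust the pattern to whatever your capture tool actually prints.

```python
import re
from collections import Counter

# ASSUMPTION: each packet is one line of tcpdump-style text, for example
#   "22:31:07.123456 IP 10.1.2.3.7001 > 10.9.9.9.7000: ..."
# Only IPv4 "addr.port > addr.port:" pairs are recognized.
PACKET = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
)

FILESERVER_PORT = "7000"  # standard AFS fileserver port


def count_clients(lines):
    """Count packets sent to the fileserver port, keyed by source address."""
    counts = Counter()
    for line in lines:
        match = PACKET.search(line)
        if match and match.group("dport") == FILESERVER_PORT:
            counts[match.group("src")] += 1
    return counts


def top_clients(lines, n=10):
    """Return the n busiest source addresses as (address, packets) pairs."""
    return count_clients(lines).most_common(n)
```

Feeding it the lines of a saved dump, e.g. top_clients(open("dump.txt")), gives a quick ranking of which workstations are sending the most traffic. Packet counts are only a proxy for load, so treat the result as a starting point for where to look, not as proof of the culprit.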
