Hi folks,

In the past, we've observed prolonged periods during which one or more of our servers report more than 200 calls waiting for a thread. This happened again this morning and lasted for about four hours. While the server reported the blocked calls, top showed the fileserver pegged at >= 100% CPU, and FileLog (with verbosity increased via SIGTSTP) showed a huge number of SAFS_FetchStatus calls and very little else.
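For anyone wanting to track the waiting-call count over time, here is a minimal sketch. It assumes the figure comes from `rxdebug <server> 7000 -noconns` and that the output contains lines of the form "N calls waiting for a thread" and "N threads are idle"; the exact wording may differ between OpenAFS versions, so the patterns may need adjusting. The function names and the host name in the usage comment are hypothetical.

```python
import re
import subprocess
import time

# Assumed rxdebug output wording; adjust if your version prints it differently.
WAITING_RE = re.compile(r"(\d+) calls waiting for a thread")
IDLE_RE = re.compile(r"(\d+) threads are idle")


def parse_rxdebug(text):
    """Return (waiting_calls, idle_threads); a value is None if its line is absent."""
    waiting = WAITING_RE.search(text)
    idle = IDLE_RE.search(text)
    return (
        int(waiting.group(1)) if waiting else None,
        int(idle.group(1)) if idle else None,
    )


def sample(server, port=7000):
    """Run rxdebug once against the fileserver port and parse its output."""
    result = subprocess.run(
        ["rxdebug", server, str(port), "-noconns"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_rxdebug(result.stdout)


def watch(server, interval=5.0, count=60):
    """Print one timestamped sample every `interval` seconds, `count` times."""
    for _ in range(count):
        waiting, idle = sample(server)
        print(f"{time.strftime('%H:%M:%S')} waiting={waiting} idle={idle}")
        time.sleep(interval)


# Usage (hypothetical host name):
#   watch("fs1.example.org", interval=5.0, count=60)
```

Sampling every few seconds and logging the timestamps should make it easier to tell whether the backlog really drains on a regular cycle or only appears to.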
During this time, I also noticed that the number of blocked calls seemed to oscillate between 0 and ~220 over a period of about 100 seconds (with ~1300 total clients according to the hosts.dump file). This made me wonder whether some component periodically clears the backlog and, if so, whether that period can be easily modified.

This condition tends to coincide with a large number of batch jobs that, unfortunately, must get some of their shared libraries, binaries and configuration/seed files from our AFS cell. We've done as much as we can to limit the amount of data in AFS that these jobs require, but we still observe blocked calls, especially when a large number of jobs spin up at approximately the same time. It's also possible that the jobs are overwhelming the clients' caches, which could conceivably cause extra/spurious calls to the server. Is this a possibility?

If the periodicity of the backlog's level is a red herring, is there something else we might consider?

See below for system details on the file server. The clients all run Linux on 32- and 64-bit machines connected to our servers via gigabit links.

Thanks!

$ uname -a
Linux 2.6.9-42.0.3.ELsmp #1 SMP Thu Oct 5 16:29:37 CDT 2006 x86_64 x86_64 x86_64 GNU/Linux

$ rpm -qa | grep openafs
openafs-1.4.1-0.11.SL.x86_64
openafs-devel-1.4.7-68.SL4.x86_64
kernel-module-openafs-2.6.9-42.0.3.EL-1.4.1-0.11.SL.x86_64
openafs-firstboot-1.2.11-5.SL.noarch
openafs-client-1.4.1-0.11.SL.x86_64
kernel-module-openafs-2.6.9-42.0.3.ELsmp-1.4.1-0.11.SL.x86_64
openafs-server-1.4.1-0.11.SL.x86_64
openafs-krb5-1.4.1-0.11.SL.x86_64
openafs-kpasswd-1.4.1-0.11.SL.x86_64

--
[Will maier]-----------------[[email protected]|http://www.lfod.us/]
