Hi, all- We've been seeing recurring, severe periods during which one of our main fileservers shows large numbers of blocked connections. As far as I can tell, these periods do not correlate with high system load, high network interface utilization, dropped packets, UDP errors, high I/O or any of the other trouble indicators that I'm accustomed to looking for.
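One way to check that lack of correlation more rigorously would be to log the fileserver's Rx thread counters at a fixed interval and line the timestamps up against the load, network and I/O graphs. The following is only a minimal sketch under assumptions: it supposes the blocked connections show up in the "calls waiting for a thread" and "threads are idle" summary lines printed by rxdebug, that the fileserver listens on the usual port 7000, and the function names are made up for illustration.

```python
import re
import subprocess
import time

# Assumed wording of the rxdebug summary lines; adjust if your build differs.
WAITING_RE = re.compile(r"(\d+) calls waiting for a thread")
IDLE_RE = re.compile(r"(\d+) threads are idle")


def parse_rxdebug_summary(text):
    """Pull the waiting-call and idle-thread counts out of rxdebug output.

    A count is None when its line is not present in the text.
    """
    waiting = WAITING_RE.search(text)
    idle = IDLE_RE.search(text)
    return {
        "waiting_calls": int(waiting.group(1)) if waiting else None,
        "idle_threads": int(idle.group(1)) if idle else None,
    }


def sample(host, port=7000):
    """Run rxdebug once against host and return the parsed counters."""
    result = subprocess.run(
        ["rxdebug", "-servers", host, "-port", str(port), "-noconns"],
        capture_output=True,
        text=True,
        timeout=30,
        check=False,
    )
    return parse_rxdebug_summary(result.stdout)


def log_forever(host, interval=60):
    """Print one timestamped sample per interval until interrupted."""
    while True:
        stats = sample(host)
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        print(stamp, stats["waiting_calls"], stats["idle_threads"], flush=True)
        time.sleep(interval)
```

Calling something like `log_forever("fs1.example.org")` (a placeholder host name) and redirecting the output to a file would give a timeline that can be compared with the other indicators for the same period.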
rxdebug shows up to 200-300 blocked connections during these periods, which last up to an hour or so before the problem abates. Since this server hosts several critical volumes, including one in which many $PATH elements live, users notice these disruptions very quickly.
We've tried our best to balance accesses across our three main servers and have moved several very active volumes off the misbehaving server. Since the move, the server handles ~1 million volume accesses per hour; our busiest server (which does not experience this problem) handles nearly three times as many. rxdebug usually shows ~8 thousand active server and client connections on this server.
No events in the FileLog correspond with the blocked connections. I do see regular ProbeUuid failures, but those are benign (right?).
This server has a dual-core 3.00GHz Xeon CPU, 4GB RAM and a 1Gbps network connection. Its vice partitions are stored on a fibre-attached Xserve RAID array.
What other information would help resolve this problem? Is there another aspect of the system that I should examine? What further steps might we take to try to resolve the issue?
Thanks!
--
Will (http://www.lfod.us/)
_______________________________________________
OpenAFS-info mailing list
https://lists.openafs.org/mailman/listinfo/openafs-info
