It seems I was barking up the wrong tree with the previous error, which confused the issue. The ProbeUuid error may have more to do with the problem. Perhaps a better, more complete description (with ideally no ambiguity) is in order.
- Occasionally, when many small files are transferred quickly onto a volume, the server containing the volume will time out on one or more clients. These clients will no longer be able to access the server.
- A "Connection timed out" error is shown in a terminal session on an affected client when attempting to access a volume on the affected server, which has now become inaccessible.
- When a client can no longer access the affected server, the following entry appears for the client system in the affected server's FileLog:
  ProbeUuid failed for host xxx.xxx.xxx.xxx:7001
- Typing 'fs checkserver' on the affected client produces the following error:
  These servers unavailable due to network or server problems: [affected server hostname]
- Some other clients are able to access the server. I believe this may be because the unaffected clients were not accessing the volumes that were under heavy use.
- Shutting down the AFS client, ensuring that the kernel module is removed, and then restarting the AFS client does not allow the affected client to access the affected server.
- Restarting only the fs, volserver, and ptserver services on the affected server does not allow the affected client to access the server. Shutting down and then restarting the AFS service completely on the affected server also has no effect on the affected client.
- Rebooting the affected client computer does allow it to access the affected server.
- The servers are running OpenAFS 1.2.13; the affected client in this case is also running 1.2.13. Older clients have also shown this behavior in the past.
- The firewall allows traffic initiated by the client, and such traffic generally works.

This issue tends to happen every few months. The affected system at this point is my workstation, and the affected server does not contain volumes that I need to access directly. Thus, I'm willing to keep it in its current state until I can determine the cause of the issue.

Does this issue sound familiar to anyone?
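In case it helps anyone compare notes, here is a minimal Python sketch for tallying which clients show up in ProbeUuid failures. It assumes only the FileLog message text quoted above ("ProbeUuid failed for host <address>:<port>"); the function name and regular expression are my own illustration, not part of OpenAFS, and the rest of each log line is ignored.

```python
import re
from collections import Counter

# Assumed pattern, based only on the FileLog entry quoted in this message.
PROBE_FAILED = re.compile(
    r"ProbeUuid failed for host (\d{1,3}(?:\.\d{1,3}){3}):(\d+)"
)

def failed_probe_hosts(lines):
    """Count ProbeUuid failures per client address.

    `lines` is any iterable of log lines (for example, an open file).
    Returns a Counter mapping client IP address -> number of failures.
    """
    counts = Counter()
    for line in lines:
        match = PROBE_FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open FileLog file object to failed_probe_hosts() and printing the result of .most_common() should show whether the failures cluster on the clients that were writing to the busy volumes.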
Regards,
Lester Barrows
Asani Solutions, LLC
Code TI Systems Group
NASA Ames Research Center
