At my institution we run an AFS cell in which some of the fileservers run OpenAFS and the others (most of them) run Transarc AFS. Every now and then (once a month or less) one of our fileservers becomes very slow, and rxdebug $servername 7000 -rxstats shows that the server has 9 connections to the SAME client, which blocks all activity:

Tue Feb 8 14:33:22 NFT 2005
waiting_for_process wp=00009_res=01287_ig=25802
1 192.107.51.29 Port=1434_id=8bb6c9ac/8162d80_R=2288_S=28124
2 192.107.51.29 Port=1434_id=8bb6c9ac/8162d84_R=2288_S=28124
3 192.107.51.29 Port=1434_id=8bb6c9ac/8162d88_R=2288_S=28124
4 192.107.51.29 Port=1434_id=8bb6c9ac/8162d90_R=2288_S=28124
5 192.107.51.29 Port=1434_id=8bb6c9ac/8162d94_R=2288_S=28124
6 192.107.51.29 Port=1434_id=8bb6c9ac/8162d98_R=2288_S=28124
7 192.107.51.29 Port=1434_id=8bb6c9ac/8162da0_R=2288_S=28124
8 192.107.51.29 Port=1434_id=8bb6c9ac/8162da4_R=2288_S=28124
9 192.107.51.29 Port=1434_id=8bb6c9ac/8162da8_R=2288_S=28124

The client is usually an OpenAFS Windows client behind NAT (it also happens with recent 1.3.x versions). We have observed it for certain on Transarc AFS fileservers. Today's case is a Solaris machine running Transarc AFS 3.6 2.32. The only way to end the problem is to disconnect the client completely. If the fileserver is just restarted using bos, the problem comes back within a short time.

When the problem arises, the following messages appear in the FileLog (3-4 times each minute):

..
Tue Feb 8 07:57:36 2005 CB: RCallBackConnectBack failed for c06b331d.1434
Tue Feb 8 07:58:32 2005 CB: Call back connect back failed (in break delayed) for c06b331d.1434
Tue Feb 8 07:58:32 2005 BreakDelayedCallbacks FAILED for host c06b331d which IS UP. Possible network or routing failure.
...

where c06b331d.1434 is the same address as the one obtained from rxdebug, 192.107.51.29.

Searching the web for the keyword BreakDelayedCallbacks, I found a 2001 posting:

https://lists.openafs.org/pipermail/openafs-devel/2001-March/005683.html

which seems to be related to the "BreakDelayedCallbacks" error message and suggests a patch for OpenAFS.

I have tried to describe the problem as best I can, but I do not understand why it arises so rarely and only with NAT clients.

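For anyone who wants to check that a host shown in the FileLog is the same one listed by rxdebug, here is a minimal Python sketch of the conversion. The function names are hypothetical helpers, not part of any AFS tool, and the sketch assumes that the FileLog prints the IPv4 address as 8 hex digits in network byte order followed by a decimal port, which is consistent with the lines quoted above but is not taken from the AFS documentation.

```python
import ipaddress


def hex_host_to_ip(hex_host: str) -> str:
    """Convert an 8-digit hex host, as printed in the FileLog, to dotted-quad form.

    Assumption: the hex digits are the IPv4 address in network byte order.
    """
    if len(hex_host) != 8:
        raise ValueError(f"expected 8 hex digits, got {hex_host!r}")
    return str(ipaddress.IPv4Address(int(hex_host, 16)))


def parse_filelog_peer(peer: str) -> tuple[str, int]:
    """Split a 'hexhost.port' token such as 'c06b331d.1434' into (ip, port).

    Assumption: the port after the dot is decimal, as it matches Port=1434
    in the rxdebug output.
    """
    hex_host, _, port = peer.partition(".")
    return hex_host_to_ip(hex_host), int(port)


if __name__ == "__main__":
    print(hex_host_to_ip("c06b331d"))
    print(parse_filelog_peer("c06b331d.1434"))
```

With the host from the log lines above, c06b331d decodes byte by byte as c0=192, 6b=107, 33=51, 1d=29, which matches the client address reported by rxdebug.
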
The question: has this kind of problem been solved in the current version of OpenAFS, and is the solution to migrate all our fileservers to OpenAFS? Any suggestion or explanation is welcome!

Giovanni

--
Giovanni Bracco
ENEA INFO (Servizio Informatica e Reti)
Via E. Fermi 45
I-00044 Frascati (Roma) Italy
phone 00-39-06-9400-5597
FAX 00-39-06-9400-5735
E-mail [EMAIL PROTECTED]
WWW http://fusfis.frascati.enea.it/~bracco
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
