We had a similar problem with RS6000 clients. It seemed to affect only
the RS6000s, and only on large data transfers (e.g. big files or lots
of little ones). It turned out to be a router problem which went away
when they were upgraded.
        - Woody Kellum

         
         "Derrick J. Brashear" <[EMAIL PROTECTED]>  writes:
         > I posted this to alt.filesystems.afs, but I figured I'd post it here
         > too, in case... 
         > Hi, 
         > 
         > I'm having a problem with connections being timed out; namely, when I 
         > attempt to cd to /afs/club.cc.cmu.edu I get: 
         > afs: Lost contact with volume location server 128.2.232.226 in cell 
         > club.cc.cmu.edu 
         > /afs/club.cc.cmu.edu: Connection timed out. 
         > 
         > Of course, it's not really down: 
         > /usr/local/bin/fs checks -cell club.cc.cmu.edu 
         > afs: volume location server 128.2.232.226 in cell club.cc.cmu.edu is back up 
         > All servers are running. 
         > 
         > From a different machine, I can get there, no problem. The CellServDB 
         > entry for club.cc.cmu.edu is correct; the machine was rebooted with no 
         > change in the problem. I removed the cache and rebooted, still no change. 
         > 
         > I'm really not sure what I'm looking for. Can anyone give me pointers? 
         > 
         > Thanks 
         > -D 
         > 
         > 
         Derrick,
              I've seen something similar in our cell.  The AFS servers have multiple
         network interfaces, but only one "known" address as far as vlserver and/or
         CellServDB are concerned.  The client call to the server goes to the known
         interface, but the response from the server comes via another of its
         interfaces in order to reduce the load on our routers.  The problems occurred
         only on client machines that also had multiple network adapters.  When the
         client heard back from the server via its unadvertised interface, it sent its
         response out the wrong adapter, e.g.
         
         The AFS server is known by its 129.35.xx.xx address, but its default router
         is on its 9.3.xx.xx interface.  The client has a 129.35 interface and a
         9.something interface to a lab ring.  It talks to the server via the 129.35
         interface, but the response packet has a 9.3.xx.xx source address.  When the
         client attempted to respond to this packet, TCP/IP routed it out the
         9.something interface and it never reached the server, hence the
         "Lost contact" message.  Forcing the response out the 129.35 interface on the
         client with a route add -net statement in rc.tcpip cured the problem.
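
The routing decision described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the actual AIX routing code: the interface names (en0, tr0), the prefix lengths, and the concrete server address 9.3.1.10 are all hypothetical stand-ins for the elided xx.xx values, and the lookup is a plain longest-prefix match.

```python
import ipaddress


def pick_interface(routes, dest):
    """Return the interface of the most specific route that covers dest.

    routes is a list of (network, interface_name) pairs; this models a
    simple longest-prefix-match lookup, nothing more.
    """
    addr = ipaddress.ip_address(dest)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    if not matches:
        return None
    # The route with the longest prefix (most specific network) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical client routing table: one adapter on the 129.35 network,
# one adapter on a 9.x lab ring. Names and prefix lengths are assumptions.
routes = [
    (ipaddress.ip_network("129.35.0.0/16"), "en0"),
    (ipaddress.ip_network("9.0.0.0/8"), "tr0"),
]

# Hypothetical source address of the server's reply (its 9.3 interface).
server_reply_source = "9.3.1.10"

# Without an extra route, the client's answer leaves via the lab ring.
before = pick_interface(routes, server_reply_source)

# Equivalent in spirit to the "route add -net" fix: a more specific route
# for the server's 9.3 network that points out the 129.35 adapter.
routes.append((ipaddress.ip_network("9.3.0.0/16"), "en0"))
after = pick_interface(routes, server_reply_source)

print("before fix:", before)
print("after fix:", after)
```

In this model the added route wins only because its prefix is longer than the directly attached 9.x network; the real prefix lengths and gateway for a given site would need to be checked against its own configuration.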
         
         David
         ----------------------------------------------------------------
         David Littlewood                             Tie Line  793-8832
         email:              [EMAIL PROTECTED]    Phone 512-823-8832
         
