Christopher D. Clausen wrote:

> Client (flexo.acm.uiuc.edu) is Mac OS X 10.3 running the 1.4.1 binary 
> release from openafs.org (previous time this happened it was 1.4.1-rc8 
> from openafs.org.)
> 
> Server (alnitak.acm.uiuc.edu) is Solaris 10 SPARC running 1.4.1-rc10 
> (previous time it was running 1.4.1-rc8, I think) that I compiled from 
> source.
> 
> The client has a hardcoded IP of 128.174.251.23, which is on the same 
> non-firewalled subnet as the server.  The server apparently thinks that 
> the client has changed IPs (69.112.249.245), probes to find it, and 
> can't.  The client then marks the server down, which makes all volumes 
> on that server inaccessible.
> 
> Restarting the client had no effect.  I had to restart the fs process on 
> the server to remove the error condition.
> 
> Anyone else seen this happen?  Or have a better solution than restarting 
> the fs process if it happens again?  FileLog is below:
> 
> Thu Apr 27 13:59:13 2006 MultiProbe failed to find new address for host 
> 69.112.249.245:7001
> Thu Apr 27 13:59:20 2006 CB: Call back connect back failed (in break 
> delayed) for Host 69.112.249.245:7001
> Thu Apr 27 13:59:20 2006 BreakDelayedCallbacks FAILED for host 
> 69.112.249.245:7001 which IS UP.  Connection from 128.174.251.23:7001. 
> Possible network or routing failure.
> Thu Apr 27 13:59:20 2006 MultiProbe failed to find new address for host 
> 69.112.249.245:7001
> Thu Apr 27 14:02:20 2006 CB: Call back connect back failed (in break 
> delayed) for Host 69.112.249.245:7001
> Thu Apr 27 14:02:20 2006 BreakDelayedCallbacks FAILED for host 
> 69.112.249.245:7001 which IS UP.  Connection from 128.174.251.23:7001. 
> Possible network or routing failure.
> Thu Apr 27 14:02:20 2006 MultiProbe failed to find new address for host 
> 69.112.249.245:7001
> Thu Apr 27 14:06:56 2006 CB: WhoAreYou failed for 69.112.249.245:7001, 
> error -01
> Thu Apr 27 14:07:03 2006 CB: Call back connect back failed (in break 
> delayed) for Host 69.112.249.245:7001
> Thu Apr 27 14:07:03 2006 BreakDelayedCallbacks FAILED for host 
> 69.112.249.245:7001 which IS UP.  Connection from 128.174.251.23:7001. 
> Possible network or routing failure.
> Thu Apr 27 14:07:03 2006 MultiProbe failed to find new address for host 
> 69.112.249.245:7001
> Thu Apr 27 14:09:05 2006 CB: WhoAreYou failed for 69.112.249.245:7001, 
> error -01
> Thu Apr 27 14:09:12 2006 CB: Call back connect back failed (in break 
> delayed) for Host 69.112.249.245:7001
> Thu Apr 27 14:09:12 2006 BreakDelayedCallbacks FAILED for host 
> 69.112.249.245:7001 which IS UP.  Connection from 128.174.251.23:7001. 
> Possible network or routing failure.
> 
> And yes, I am in the process of upgrading to the 1.4.1 release right now 
> on our servers.
> 
> <<CDC

Please bump the server log level to 125 for a while so we can get more
detail.  I am very interested in the fact that the file server continues
to report the same problem after a client restart.  To the best of my
knowledge, the Mac OS X client does not preserve its UUID between
restarts.  Therefore, the old UUID and the new one should be treated as
separate clients.  The only reason a client seen on two addr:ports
should be considered the same client is if both have the same UUID.
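
To make the rule concrete, here is a minimal Python sketch of UUID-based
client identity.  It is an illustration only, not the file server's
actual host-tracking code: HostTable, HostEntry, observe, and the UUID
strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class HostEntry:
    """One client as the server sees it: a UUID plus every addr:port seen."""
    uuid: str
    endpoints: Set[Endpoint] = field(default_factory=set)


class HostTable:
    """Toy host table keyed by client UUID (hypothetical, not OpenAFS code)."""

    def __init__(self) -> None:
        self._by_uuid: Dict[str, HostEntry] = {}

    def observe(self, uuid: str, endpoint: Endpoint) -> HostEntry:
        """Record a contact; the same UUID always maps to the same entry."""
        entry = self._by_uuid.get(uuid)
        if entry is None:
            # Unknown UUID: treat it as a brand-new client.
            entry = HostEntry(uuid)
            self._by_uuid[uuid] = entry
        entry.endpoints.add(endpoint)
        return entry


table = HostTable()
# Same UUID seen on two addr:ports: one client with two endpoints.
before = table.observe("uuid-before-restart", ("69.112.249.245", 7001))
moved = table.observe("uuid-before-restart", ("128.174.251.23", 7001))
# A restart that generates a fresh UUID yields a separate client,
# even though the address and port are unchanged.
after = table.observe("uuid-after-restart", ("128.174.251.23", 7001))
```

Under this model a restarted client with a fresh UUID would start with
a clean entry, which is why the same failure persisting across a client
restart is surprising.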

Jeffrey Altman
