We've got a user here who's behind a firewall/NAT and having some problems. The 
firewall is likely an aggressive one, as it protects a hospital. He's running the 
stock Ubuntu 9.10 client, which he reports as being OpenAFS version 1.4.11.

The symptom is that he gets periodic hangs when accessing files in AFS. During 
those hangs his /var/log/messages shows a lot of sequences like this:

May 14 11:33:57 minime kernel: [353065.752034] afs: Lost contact with
  file server 141.211.1.127 in cell umich.edu (all multi-homed ip
  addresses down for the server)
May 14 11:33:57 minime kernel: [353065.752039] afs: Lost contact with
  file server 141.211.1.127 in cell umich.edu (all multi-homed ip
  addresses down for the server)
May 14 11:34:13 minime kernel: [353081.773810] afs: file server
  141.211.1.127 in cell umich.edu is back up (multi-homed address; other
  same-host interfaces may still be down)
May 14 11:34:13 minime kernel: [353081.773815] afs: file server
  141.211.1.127 in cell umich.edu is back up (multi-homed address; other
  same-host interfaces may still be down)
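
To put numbers on the hangs, something like the following could pair up the
"Lost contact" and "is back up" messages and report how long each outage
lasted. This is only a rough sketch: it assumes each message is a single line
in /var/log/messages (they are wrapped only in this mail), that the log has no
year in its timestamps (so one is passed in), and the function name outages()
is just something made up for illustration.

```python
import re
from datetime import datetime

# Patterns for the two client messages quoted above.
LOST = re.compile(r"afs: Lost contact with file server (\S+)")
BACK = re.compile(r"afs: file server (\S+) in cell \S+ is back up")


def outages(lines, year=2010):
    """Return (server, start, seconds_down) for each lost/back-up pair."""
    down_since = {}  # server address -> time of first "Lost contact"
    result = []
    for line in lines:
        lost = LOST.search(line)
        back = BACK.search(line)
        if not (lost or back):
            continue
        # Syslog timestamps ("May 14 11:33:57") occupy the first 15 columns.
        ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        if lost:
            # Duplicate "Lost contact" lines keep the earliest timestamp.
            down_since.setdefault(lost.group(1), ts)
        elif back.group(1) in down_since:
            start = down_since.pop(back.group(1))
            result.append((back.group(1), start, (ts - start).total_seconds()))
    return result


if __name__ == "__main__":
    with open("/var/log/messages") as log:
        for server, start, seconds in outages(log):
            print(f"{server} down at {start} for {seconds:.0f}s")
```

If the outages keep starting after a similar stretch of idle time, that would
fit a NAT mapping timing out.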

On the server side we see messages like:

  Fri May 14 11:33:08 2010 CB: ProbeUuid for <addr>:<port> failed -01

where the IP address is the firewall's and the port number is not a standard AFS 
port. The port number also varies widely from one message to the next. These 
messages correspond pretty closely in time with the client-side hangs.

At this point I'm guessing that the NAT box is dropping the mapping between 
internal and external UDP ports. The 1.5.73 release notes mention this issue, 
saying a UDP keepalive was added for just that reason.
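
For anyone not familiar with the idea, a UDP keepalive just means sending a
small datagram often enough that the NAT never considers the mapping idle. The
sketch below shows the general technique only; it is not how OpenAFS does it.
The function name, the 20-second interval, and the one-byte payload are all
made up for illustration, and a real client would have to send something the
file server's Rx layer actually understands.

```python
import socket
import threading


def start_keepalive(sock, peer, interval=20.0, payload=b"\x00"):
    """Send `payload` to `peer` every `interval` seconds until stopped.

    Returns a threading.Event; call .set() on it to stop the keepalive.
    """
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set,
        # so this sends one datagram per interval until asked to stop.
        while not stop.wait(interval):
            sock.sendto(payload, peer)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

The datagrams have to leave from the same local socket (and so the same source
port) as the real traffic, since that is the mapping the NAT is tracking. The
interval also needs to be shorter than the NAT's idle timeout, which we don't
know for this firewall.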

The next step would be to have him try 1.5.74, but before he goes that far I'd 
be interested in hearing from anyone who's seen similar problems and what, if 
anything, fixed them.

Steve