Re: [OpenAFS] 1.6.0pre2 Filelog CB: WhoAreYou failed for host

Dale Pontius Wed, 23 Mar 2011 11:08:07 -0700

On 03/21/2011 03:45 PM, Gémes Géza wrote:

On 2011-03-21 15:24, Jeffrey Altman wrote:

On 3/21/2011 2:42 AM, Gémes Géza wrote:

Hi,


I know this topic has been discussed before, but the conclusion was that
it is caused by NAT.

It is caused by firewalls, routers, network port translators and network
address translators (or any other similar device) that impose a fixed
timeout on the length of time that inbound UDP packets can be received
in response to outbound packets.

If the timeout period is less than the cache manager probe period, it is
likely that this error will be seen.

This is impossible in my case, as the OpenAFS servers are firewalled from
the outside world.
The fileserver has 3 ethernet interfaces:
1: connected to the clients, two IP addresses, one active (the other in
the NetRestrict file)
2: connected to a SAN, no IP addresses
3: connected to other cluster members, IP address in the NetRestrict file
vos listaddrs shows nothing but the correct IP address for the vol and
fileserver.
The FileLog is full of entries like:
CB: WhoAreYou failed for host FILESERVER

I would check the firewall rules on the local machine.

Besides that, all the clients (1.6.0pre2 on Linux, 1.5.78 and 1.6.0pre3
on Windows) are working as expected.
Except one (1.6.0pre2 on Linux) which has two interfaces (one connected
to the fileserver's network and the other in its NetRestrict file).

Same here.  Check the firewall rules on that machine.

None of the computers in question have any firewalls (except some
Windows XP SP3 default firewalls, but there port 7001/UDP is open).
On every one of the Linux computers, iptables -L gives:
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Besides that, everything is connected at layer 2; there are no routers
in between, and the switches are HP ProCurve and Open vSwitch (Xen Cloud
Platform).

Cheers

Geza

I dealt with this several years ago, and a friend helped me out. My
"networking situation" got better, and I haven't needed to do business
this way since, at least not at work.

Your problem isn't with the firewall proper, it's with the masquerading
(NAT) logic. The masquerading logic has a keepalive timer for UDP
associations - so it's a piece of state that's needed even if stateful
firewall logic isn't in place. I opened up a timeout and fixed the
problem. Now to remember where the heck that was... I'm browsing
around in /proc/sys/net/ipv4 and not finding anything at the moment.
There are 3 udp entries, but none look like a timeout. Don't see
anything down in conf/eth0, either. Come to think of it, that may have
been far enough back that I was running kernel-2.4 at the time, and
things have changed.

Still don't see anything, but I just want to get across the idea that a
NAT timeout will exist even without a regular firewall, and that once
upon a time it could be tweaked. It probably still can, if one knows
the magic incantation.
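
As a rough sketch of where one might look: on Linux kernels with the
netfilter connection-tracking module loaded, the UDP timeouts are
usually exposed as sysctl files under /proc/sys/net/netfilter (older
kernels used ip_conntrack_* names under /proc/sys/net/ipv4/netfilter).
Which of these files exist on a given machine is an assumption that
depends on the kernel version and loaded modules. The helper names and
the PROBE_INTERVAL value below are hypothetical; substitute the probe
period that applies to your clients.

```python
from pathlib import Path

# Candidate sysctl files, relative to /proc/sys. Which ones exist depends
# on the kernel version and on whether conntrack is loaded (assumption).
CANDIDATES = (
    "net/netfilter/nf_conntrack_udp_timeout",
    "net/netfilter/nf_conntrack_udp_timeout_stream",
    "net/ipv4/netfilter/ip_conntrack_udp_timeout",
    "net/ipv4/netfilter/ip_conntrack_udp_timeout_stream",
)


def read_udp_timeouts(sys_root="/proc/sys"):
    """Return {relative_path: seconds} for each timeout file that exists."""
    found = {}
    for rel in CANDIDATES:
        path = Path(sys_root) / rel
        if path.is_file():
            found[rel] = int(path.read_text().strip())
    return found


def too_short(timeouts, probe_interval):
    """Return the entries whose timeout is below the probe interval."""
    return sorted(rel for rel, secs in timeouts.items() if secs < probe_interval)


if __name__ == "__main__":
    PROBE_INTERVAL = 600  # hypothetical value in seconds, not an OpenAFS default
    timeouts = read_udp_timeouts()
    if not timeouts:
        print("no conntrack UDP timeout entries found (module not loaded?)")
    short = too_short(timeouts, PROBE_INTERVAL)
    for rel, secs in sorted(timeouts.items()):
        note = "shorter than probe interval" if rel in short else "ok"
        print(f"{rel}: {secs}s ({note})")
```

If no entries are reported, connection tracking is probably not active on
that box and the timeout would have to live on some other device in the
path. If an entry is shorter than the probe period, it can typically be
raised by writing a larger value to the same file as root.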


Dale Pontius

--
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: [email protected]
