On 03/21/2011 03:45 PM, Gémes Géza wrote:
On 2011-03-21 15:24, Jeffrey Altman wrote:
On 3/21/2011 2:42 AM, Gémes Géza wrote:
Hi,
I know this topic has been discussed before, but the conclusion was that
it is caused by NAT.
It is caused by firewalls, routers, network port translators and network
address translators (or any other similar device) that impose a fixed
timeout on the length of time that inbound UDP packets can be received
in response to outbound packets.
If the timeout period is less than the cache manager probe period, it is
likely that this error will be seen.
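The condition described above comes down to a single comparison. The following is a minimal sketch only: the function name and the numbers are hypothetical, chosen for illustration, and are not taken from OpenAFS or from any particular device.

```python
def callback_path_at_risk(device_udp_timeout_s: float, probe_period_s: float) -> bool:
    """Return True when the device drops its UDP state before the next probe.

    device_udp_timeout_s: how long the firewall/NAT device keeps accepting
        inbound UDP replies after the last outbound packet (hypothetical input).
    probe_period_s: interval between cache manager probes (hypothetical input).
    """
    return device_udp_timeout_s < probe_period_s


# Hypothetical numbers, for illustration only.
print(callback_path_at_risk(device_udp_timeout_s=30, probe_period_s=600))   # True
print(callback_path_at_risk(device_udp_timeout_s=900, probe_period_s=600))  # False
```

When the function returns True, the server's callback traffic can arrive after the device has already discarded the mapping, which matches the failure mode described above.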
This is impossible in my case, as the OpenAFS servers are firewalled from
the outside world.
The fileserver has 3 Ethernet interfaces:
1: connected to the clients, two IP addresses, one active (the other in the
NetRestrict file)
2: connected to a SAN, no IP addresses
3: connected to other cluster members, IP address in the NetRestrict file
vos listaddrs shows nothing but the correct IP address for the vol and
fileserver.
The FileLog is full of entries like:
CB: WhoAreYou failed for host FILESERVER
I would check the firewall rules on the local machine.
Besides that, all the clients (1.6.0pre2 on Linux, 1.5.78 and 1.6.0pre3
on Windows) are working as expected.
Except one (1.6.0pre2 on Linux), which has two interfaces (one connected
to the fileserver's network and the other in its NetRestrict file).
Same here. Check the firewall rules on that machine.
None of the computers in question have any firewalls (except some
Windows XP SP3 default firewalls, but there port 7001/UDP is open).
On each of the Linux computers, iptables -L gives:
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
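A listing like the one above can be checked mechanically. Note that iptables -L without a -t option lists only the filter table, so NAT rules (iptables -t nat -L) would not appear in it. The sketch below parses the text of such a listing and reports each chain's policy and rule count; the function name is hypothetical, and it assumes the plain output format shown above.

```python
def summarize_iptables_listing(text: str) -> dict:
    """Map each chain name to (policy, rule_count) from `iptables -L` text."""
    chains = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Chain "):
            # Example header: "Chain INPUT (policy ACCEPT)"
            parts = line.split()
            name = parts[1]
            policy = None
            if len(parts) >= 4 and parts[2] == "(policy":
                policy = parts[3].rstrip(")")
            chains[name] = [policy, 0]
            current = name
        elif line.startswith("target"):
            continue  # column header line, not a rule
        elif current is not None:
            chains[current][1] += 1  # anything else is counted as a rule
    return {name: (policy, count) for name, (policy, count) in chains.items()}


listing = """Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
"""
print(summarize_iptables_listing(listing))
```

For the listing above, every chain comes back with policy ACCEPT and zero rules, which confirms that the filter table is empty but says nothing about the nat table or connection tracking.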
Besides that, everything is connected at layer 2; there are no routers
in between. The switches are HP ProCurve and Open vSwitch (Xen Cloud Platform).
Cheers
Geza
I dealt with this several years ago, and a friend helped me out. My
"networking situation" got better, and I haven't needed to do business
this way since, at least not at work.
Your problem isn't with the firewall proper, it's with the masquerading
(NAT) logic. The masquerading logic has a keepalive timer for UDP
associations - so it's a piece of state that's needed even if stateful
firewall logic isn't in place. I opened up a timeout and fixed the
problem. Now to remember where the heck that was... I'm browsing
around in /proc/sys/net/ipv4 and not finding anything at the moment.
There are 3 udp entries, but none look like a timeout. Don't see
anything down in conf/eth0, either. Come to think of it, that may have
been far enough back that I was running kernel-2.4 at the time, and
things have changed.
Still don't see anything, but I just want to get across the idea that a
NAT timeout will exist even without a regular firewall, and that once
upon a time it could be tweaked. It probably still can, if one knows
the magic incantation.
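For reference, on Linux systems where netfilter connection tracking is loaded, the UDP timeouts are normally exposed as sysctls under /proc/sys/net/netfilter (nf_conntrack_udp_timeout and nf_conntrack_udp_timeout_stream). The sketch below reads whichever of a few candidate files exist. The older ip_conntrack paths are an assumption about earlier kernels, the function name is hypothetical, and none of the files will be present if the conntrack module is not loaded.

```python
from pathlib import Path

# Candidate sysctl files, relative to the proc root. Which of these exist
# depends on the kernel version and on whether connection tracking is
# loaded; the ip_conntrack_* entries are an assumption about older kernels.
CANDIDATES = (
    "sys/net/netfilter/nf_conntrack_udp_timeout",
    "sys/net/netfilter/nf_conntrack_udp_timeout_stream",
    "sys/net/ipv4/netfilter/ip_conntrack_udp_timeout",
    "sys/net/ipv4/netfilter/ip_conntrack_udp_timeout_stream",
)


def read_udp_conntrack_timeouts(proc_root: str = "/proc") -> dict:
    """Return {path: seconds} for every candidate file that can be read."""
    found = {}
    for rel in CANDIDATES:
        path = Path(proc_root) / rel
        try:
            found[str(path)] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # missing, unreadable, or not a plain integer
    return found


if __name__ == "__main__":
    for path, seconds in read_udp_conntrack_timeouts().items():
        print(f"{path}: {seconds} s")
```

An empty result suggests that connection tracking is not active on that machine, in which case a masquerading timeout on that host is unlikely to be the cause. Where a value does exist, it can be raised with sysctl -w, for example on net.netfilter.nf_conntrack_udp_timeout.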
Dale Pontius
--
Dale Pontius
Senior Engineer
IBM Corporation
Phone: (802) 769-6850
Tie-Line: 446-6850
email: [email protected]