On Tuesday 08 November 2005 15.30, Jon Hart wrote:
> On Tue, Nov 08, 2005 at 01:39:21AM +0100, Per-Olov Sjöholm wrote:
> > Hi
> >
> > I have a redundant firewall with CARP. 3.6 STABLE plus all patches from
> > CVS for stable (updated last week). The firewalls have 7 nic ports each:
> > external, internal, pfsync and 4 dmz interfaces. The servers are
> > firewalls, DNS, mailrelay, antivirus, spamkiller, ntp and dhcp for
> > internal hosts. Everything works perfectly, except for the fact that
> > sessions are stalling during transfers of big files. I have tried to
> > remove "aggressive timeouts", "adaptive timeouts" and "scrub" without
> > success. It doesn't matter if the transfer goes over NAT from lan to
> > internet or from a real IP on dmz2 to the internet. We have tried many
> > different protocols such as SSH, amanda and more.
> >
> > Turning on -x loud gives a lot of the below (maybe irrelevant?)
> > --snip--
> > Nov 8 00:49:53 san /bsd: pfsync: ignoring stale update (3) id:
> > 4367413c000b4c76 creatorid: e31b4f22
> > Nov 8 00:49:53 san /bsd: pfsync: ignoring stale update (3) id:
> > 4367413c000b4c75 creatorid: e31b4f22
> > Nov 8 00:49:53 san /bsd: pfsync: ignoring stale update (3) id:
> > --snip--
>
> Do you get these all the time or just when the system is under load?
> For some reason your primary carp host is getting old updates from
> someone else, presumably the other carp machine. Something seems out of
> whack here.

I just enabled "pfctl -x loud" when I had the problem, and the above
messages flooded me. I got 8MB of logs in just a minute or so from them.
I will try to check whether these appear all the time. Any suggestion on
the best and easiest way of doing this?

> > Nothing comes up as blocked in the firewall log when a session is
> > stalling. I have Intel 10/100 (fxp nics) and Soekris lan1641 quad boards
> > (sis nics)
>
> When I read 'sis' I immediately suspected those cards as the problem as
> I know others on the list have had problems with those cards under load
> in the past. I believe this may have been fixed in more recent releases
> though, but don't quote me on it.

It shouldn't be the sis cards. They exist only on the dmz segments:
lan=fxp, internet=fxp, pfsync=xl. And I have these problems from the lan
as well.

> > Don't look too closely at the queuing stuff as it's not complete.
> > The rows from Firewall-1 pf.conf (primary) on the link below.
> > http://www.incedo.org/~sjoholmp/pf/pf.conf
> > (secondary FW have exactly the same pf.conf)
>
> The only comment I have about that ruleset that may be relevant is the
> max states. Even though you've got it commented out it will still
> default to 10k states unless you say otherwise. This may not even be
> relevant because a large transfer should not necessarily drive the
> number of states through the roof. Depends on the method used to
> download, of course.

We are not near that limit. Checked with "pfctl -s state|wc -l" and
"pftop" many times.

> > Any suggestions?
>
> In no particular order...
>
> Figure out why you are getting stale updates from pfsync. Do a simple
> test. Your two carp hosts, ONE other client machine. From the client,
> initiate a connection outbound and ensure that the two pfsync hosts have
> similar (if not identical) state tables.

They have equal state tables. And nothing is blocked in the firewall.

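For anyone repeating the state table comparison, here is a minimal
sketch in Python. It assumes the output of "pfctl -s state" from each
firewall has been saved to a text file (the names fw1.states and
fw2.states are made up for illustration). It treats every state line as
an opaque string and reports the lines that appear on one host only, so
a state caught mid-transition on one side will show up as a mismatch.

```python
import sys


def load_states(text):
    """Return the set of non-empty state lines, with whitespace normalised."""
    return {" ".join(line.split()) for line in text.splitlines() if line.strip()}


def diff_states(primary_text, secondary_text):
    """Return (only_on_primary, only_on_secondary) as sorted lists of lines."""
    primary = load_states(primary_text)
    secondary = load_states(secondary_text)
    return sorted(primary - secondary), sorted(secondary - primary)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python statediff.py fw1.states fw2.states
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        only_primary, only_secondary = diff_states(f1.read(), f2.read())
    for line in only_primary:
        print("primary only:  ", line)
    for line in only_secondary:
        print("secondary only:", line)
```

Both dumps should be taken as close together in time as possible,
otherwise ordinary state churn will dominate the differences.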
>
> When downloading, keep an eye on carp and see if the two hosts are
> flopping between master and slave. If you don't feel like doing this
> manually, use ifstated (may not have been available in 3.6 though).

I will check this...

> Use systat/vmstat to see how the system is acting under load. Looks
> like we are dealing with a 2M pipe so it shouldn't be an issue but worth
> looking at anyway.
>
> Take the second host out of the picture entirely and see if your
> problems persist.

This is the first test I will try.

> -jon

Thanks
Per-Olov
--
GPG keyID: 4DB283CE
GPG fingerprint: 45E8 3D0E DE05 B714 D549 45BC CFB4 BBE9 4DB2 83CE
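
On the question of whether the stale-update messages appear all the time
or only under load: one rough way to check is to count them per minute
and compare against the times of the big transfers. The Python sketch
below assumes the kernel messages end up in a syslog file (the path
/var/log/messages in the usage comment is an assumption and depends on
syslog.conf) and that the lines look like the ones quoted in the
--snip-- above.

```python
import sys
from collections import Counter

# Text taken from the log lines quoted earlier in this mail.
MARKER = "pfsync: ignoring stale update"


def stale_updates_per_minute(lines):
    """Count stale-update messages per 'Mon D HH:MM' bucket, in log order."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        fields = line.split()  # also copes with "Nov  8" (two spaces)
        if len(fields) < 3:
            continue
        month, day, clock = fields[0], fields[1], fields[2]
        counts[f"{month} {day} {clock[:5]}"] += 1  # "00:49:53" -> "00:49"
    return counts


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python stalecount.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for minute, n in stale_updates_per_minute(log).items():
            print(minute, n)
```

A steady count in every minute would point at a constant condition,
while bursts that line up with the transfers would point at load.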
