Hello all

I have recently upgraded a pair of CARPed firewalls from 4.6 to 5.0
(late, I know ...) after almost 2 years of absolutely flawless operation
(ipv4 interfaces only).

I have changed all the nat/rdr rules in pf.conf to the new syntax and have
not changed any other fw/nw setting, at least to my knowledge. I used
sysmerge in the process, carefully, and haven't noticed any fw/nw
related changes in any file. The boxes are rather straightforwardly
configured "plain" firewalls and very close to the default settings.

They have 4 interfaces each, the external (egress, carp0 on em0) one
being connected to the provider's switches (professional gear, Cisco or
the like), the dmz (internal, carp1-3 on em1-3) ones being connected to
a pair of levelone gsw-1641 ("web smart switch", the cheap stuff).

The two fw (fw1=master, and fw2=backup) and switches have been rebooted
multiple times by now.

The problem now is that the CARP master selection leads to weird
results. After rebooting both, I get the following picture:

fw1 (master, advbase 1 advskew 1):
carp0: BACKUP
carp1: MASTER
carp2: MASTER
carp3: BACKUP

ifconfig -g carp
carp: carp demote count 3

fw2 (backup, advbase 1 advskew 10)
carp0: MASTER
carp1: MASTER
carp2: MASTER
carp3: MASTER

ifconfig -g carp
carp: carp demote count 2

I get the following in dmesg on fw1:
carp: carp0 demoted group carp by 1 to 129 (carpdev)
carp: carp1 demoted group carp by 1 to 130 (carpdev)
carp: carp2 demoted group carp by 1 to 131 (carpdev)
carp: carp3 demoted group carp by 1 to 132 (carpdev)
carp: carp2 demoted group carp by -1 to 131 (carpdev)
carp: carp2 demoted group xfer by -1 to 0 (carpdev)
carp: carp0 demoted group carp by -1 to 130 (carpdev)
carp: pfsync0 demoted group carp by 1 to 131 (pfsync bulk start)
carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync bulk start)
carp: carp3 demoted group carp by -1 to 130 (carpdev)
carp: carp3 demoted group mgmt by -1 to 0 (carpdev)
carp: carp1 demoted group carp by -1 to 129 (carpdev)
carp: carp1 demoted group coca by -1 to 0 (carpdev)
carp2: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp: pfsync0 demoted group carp by -1 to 128 (pfsync bulk done)
carp: pfsync0 demoted group pfsync by -1 to 0 (pfsync bulk done)
carp: carp2 demoted group carp by 1 to 129 (> snderrors)
carp: carp1 demoted group carp by 1 to 130 (> snderrors)
carp: carp1 demoted group coca by 1 to 1 (> snderrors)
carp: carp2 demoted group xfer by 1 to 1 (> snderrors)
carp0: state transition: BACKUP -> MASTER
carp3: state transition: BACKUP -> MASTER
carp: carp3 demoted group carp by 1 to 3 (> snderrors)
carp: carp3 demoted group mgmt by 1 to 1 (> snderrors)
carp0: state transition: MASTER -> BACKUP
nd6_na_input: duplicate IP6 address fe80:0008::0200:5eff:fe00:01c8
carp3: state transition: MASTER -> BACKUP


dmesg on fw2 gives this:
carp: carp0 demoted group carp by 1 to 129 (carpdev)
carp: carp1 demoted group carp by 1 to 130 (carpdev)
carp: carp2 demoted group carp by 1 to 131 (carpdev)
carp: carp3 demoted group carp by 1 to 132 (carpdev)
carp: pfsync0 demoted group carp by 1 to 133 (pfsync bulk start)
carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync bulk start)
carp: carp2 demoted group carp by -1 to 132 (carpdev)
carp: carp2 demoted group xfer by -1 to 0 (carpdev)
carp: carp1 demoted group carp by -1 to 131 (carpdev)
carp: carp1 demoted group coca by -1 to 0 (carpdev)
carp: carp0 demoted group carp by -1 to 130 (carpdev)
carp: carp3 demoted group carp by -1 to 129 (carpdev)
carp: carp3 demoted group mgmt by -1 to 0 (carpdev)
carp: pfsync0 demoted group carp by -1 to 128 (pfsync bulk done)
carp: pfsync0 demoted group pfsync by -1 to 0 (pfsync bulk done)
carp2: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp: carp2 demoted group carp by 1 to 129 (> snderrors)
carp: carp1 demoted group carp by 1 to 130 (> snderrors)
carp: carp1 demoted group coca by 1 to 1 (> snderrors)
carp: carp2 demoted group xfer by 1 to 1 (> snderrors)
carp0: state transition: BACKUP -> MASTER
carp3: state transition: BACKUP -> MASTER
carp: carp3 demoted group carp by 1 to 3 (> snderrors)
carp: carp3 demoted group mgmt by 1 to 1 (> snderrors)
carp0: state transition: MASTER -> BACKUP
nd6_na_input: duplicate IP6 address fe80:0008::0200:5eff:fe00:01c8
arp info overwritten for 10.10.10.100 by 00:1e:68:9a:e4:4f on em2
nd6_na_input: duplicate IP6 address fe80:0009::0200:5eff:fe00:01c9
carp3: state transition: MASTER -> BACKUP
nd6_na_input: duplicate IP6 address fe80:000b::0200:5eff:fe00:01ff
nd6_na_input: duplicate IP6 address fe80:000a::0200:5eff:fe00:01d2
carp0: state transition: BACKUP -> MASTER
carp3: state transition: BACKUP -> MASTER
carp: carp3 demoted group carp by -1 to 2 (< snderrors)
carp: carp3 demoted group mgmt by -1 to 0 (< snderrors)
nd6_na_input: duplicate IP6 address fe80:000a::0200:5eff:fe00:01d2
nd6_na_input: duplicate IP6 address fe80:0009::0200:5eff:fe00:01c9
carp0: state transition: MASTER -> BACKUP
nd6_na_input: duplicate IP6 address fe80:0008::0200:5eff:fe00:01c8
nd6_na_input: duplicate IP6 address fe80:000b::0200:5eff:fe00:01ff
carp0: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
nd6_na_input: duplicate IP6 address fe80:0008::0200:5eff:fe00:01c8
nd6_na_input: duplicate IP6 address fe80:000b::0200:5eff:fe00:01ff
carp0: state transition: BACKUP -> MASTER
nd6_na_input: duplicate IP6 address fe80:000a::0200:5eff:fe00:01d2
nd6_na_input: duplicate IP6 address fe80:0009::0200:5eff:fe00:01c9
carp0: state transition: MASTER -> BACKUP
nd6_na_input: duplicate IP6 address fe80:0008::0200:5eff:fe00:01c8
nd6_na_input: duplicate IP6 address fe80:000b::0200:5eff:fe00:01ff
carp0: state transition: BACKUP -> MASTER
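
To make the demotion bookkeeping in these logs easier to follow, here is a
minimal sketch that tallies the net change per interface for group carp and
reports the last counter the kernel printed. The names (DEMOTE_RE, FW1_LOG,
summarize) are made up for illustration; the input is simply the "demoted
group" lines from the fw1 dmesg above.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   carp: carp3 demoted group carp by 1 to 3 (> snderrors)
DEMOTE_RE = re.compile(
    r"carp: (\S+) demoted group (\S+) by (-?\d+) to (\d+) \(([^)]*)\)"
)

# The "demoted group" lines from the fw1 dmesg, in their original order.
FW1_LOG = """\
carp: carp0 demoted group carp by 1 to 129 (carpdev)
carp: carp1 demoted group carp by 1 to 130 (carpdev)
carp: carp2 demoted group carp by 1 to 131 (carpdev)
carp: carp3 demoted group carp by 1 to 132 (carpdev)
carp: carp2 demoted group carp by -1 to 131 (carpdev)
carp: carp2 demoted group xfer by -1 to 0 (carpdev)
carp: carp0 demoted group carp by -1 to 130 (carpdev)
carp: pfsync0 demoted group carp by 1 to 131 (pfsync bulk start)
carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync bulk start)
carp: carp3 demoted group carp by -1 to 130 (carpdev)
carp: carp3 demoted group mgmt by -1 to 0 (carpdev)
carp: carp1 demoted group carp by -1 to 129 (carpdev)
carp: carp1 demoted group coca by -1 to 0 (carpdev)
carp: pfsync0 demoted group carp by -1 to 128 (pfsync bulk done)
carp: pfsync0 demoted group pfsync by -1 to 0 (pfsync bulk done)
carp: carp2 demoted group carp by 1 to 129 (> snderrors)
carp: carp1 demoted group carp by 1 to 130 (> snderrors)
carp: carp1 demoted group coca by 1 to 1 (> snderrors)
carp: carp2 demoted group xfer by 1 to 1 (> snderrors)
carp: carp3 demoted group carp by 1 to 3 (> snderrors)
carp: carp3 demoted group mgmt by 1 to 1 (> snderrors)
"""


def summarize(log_text, group="carp"):
    """Return (net change per source interface, last reported counter)."""
    net = defaultdict(int)
    last_total = None
    for line in log_text.splitlines():
        m = DEMOTE_RE.search(line)
        if not m or m.group(2) != group:
            continue  # not a demotion line, or a different interface group
        net[m.group(1)] += int(m.group(3))
        last_total = int(m.group(4))
    return dict(net), last_total


if __name__ == "__main__":
    net, last_total = summarize(FW1_LOG)
    for source in sorted(net):
        print(f"{source}: net {net[source]:+d}")
    print(f"last reported counter for group carp: {last_total}")
```

Read this way, the only demotions still outstanding on fw1 are the three
"snderrors" ones (carp1, carp2, carp3), which agrees with the demote count of
3 reported by ifconfig -g carp. The drop from 130 to 3 between two consecutive
lines looks like a boot-time offset of 128 being removed in between; that
reading is an assumption, the log itself does not say so.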

Both have this:
net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=3

I can force fw1 to be master with
#ifconfig -g carp -carpdemote 3
but fw2 still remains master on its interfaces. I actually need to take
them down with ifconfig carpX down.

If I set net.inet.carp.log=7, I get lots of the following on both fws,
only for carp1 and carp2, never for carp0 and carp3:
carp2: ip_output failed: 65
carp1: ip_output failed: 65
carp2: ip_output failed: 65
carp1: ip_output failed: 65
carp2: ip_output failed: 65
carp1: ip_output failed: 65
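
For readability: on the BSDs, errno 65 is EHOSTUNREACH ("No route to host").
The numbering differs from Linux, so the sketch below carries its own small
table instead of asking the local system. It counts the failures per
interface and decodes the number; BSD_ERRNO, count_failures and describe are
illustrative names, and the table is deliberately only a subset.

```python
import re
from collections import Counter

# Small subset of BSD errno values (sys/errno.h on the BSDs).
# Hardcoded on purpose: Linux assigns different meanings to these numbers.
BSD_ERRNO = {
    55: "ENOBUFS (No buffer space available)",
    64: "EHOSTDOWN (Host is down)",
    65: "EHOSTUNREACH (No route to host)",
}

# Matches lines such as: carp2: ip_output failed: 65
OUTPUT_FAIL_RE = re.compile(r"^(carp\d+): ip_output failed: (\d+)$")

SAMPLE = """\
carp2: ip_output failed: 65
carp1: ip_output failed: 65
carp2: ip_output failed: 65
carp1: ip_output failed: 65
carp2: ip_output failed: 65
carp1: ip_output failed: 65
"""


def count_failures(log_text):
    """Count ip_output failures per (interface, errno) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        m = OUTPUT_FAIL_RE.match(line.strip())
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


def describe(code):
    """Return a readable name for a BSD errno, if it is in the table."""
    return BSD_ERRNO.get(code, f"errno {code} (not in this table)")


if __name__ == "__main__":
    for (ifname, code), n in sorted(count_failures(SAMPLE).items()):
        print(f"{ifname}: {n} x {describe(code)}")
```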

Having read somewhere that this would hint at pf blocking the outgoing carp
advertisements, I have turned on logging on each and every pf rule, but no
messages about dropped carp packets are logged.

The relevant rules currently read:
pass quick on em3 inet proto pfsync all label "RULE -5 -- ACCEPT "
pass quick on em3 inet proto carp all label "RULE -4 -- ACCEPT "
pass quick on em2 inet proto carp all label "RULE -3 -- ACCEPT "
pass quick on em1 inet proto carp all label "RULE -2 -- ACCEPT "
pass quick on em0 inet proto carp all label "RULE -1 -- ACCEPT "

and, repeated for each fw <IP>:
pass out quick inet proto carp from <IP> to 224.0.0.18 label "RULE 2 -- ACCEPT "

I did a pfctl -d on fw2, which resulted in this in the log:
Jan 11 23:37:00 wall0102 /bsd: carp: carp2 demoted group carp by -1 to 1 (< snderrors)
Jan 11 23:37:00 wall0102 /bsd: carp: carp2 demoted group xfer by -1 to 0 (< snderrors)
Jan 11 23:37:00 wall0102 /bsd: carp: carp1 demoted group carp by -1 to 0 (< snderrors)
Jan 11 23:37:00 wall0102 /bsd: carp: carp1 demoted group coca by -1 to 0 (< snderrors)

and the other errors stopped.

After pfctl -e, they all reappeared:
carp1: ip_output failed: 65
carp3: ip_output failed: 65
carp1: ip_output failed: 65
carp3: ip_output failed: 65
nd6_na_input: duplicate IP6 address fe80:0009::0200:5eff:fe00:01c9
carp1: ip_output failed: 65
carp1: ip_output failed: 65
carp3: ip_output failed: 65
carp: carp3 demoted group carp by 1 to 2 (> snderrors)
carp1: ip_output failed: 65
carp3: ip_output failed: 65
carp: carp3 demoted group mgmt by 1 to 1 (> snderrors)
carp: carp1 demoted group carp by 1 to 2 (> snderrors)
carp1: ip_output failed: 65
carp3: ip_output failed: 65
carp: carp1 demoted group coca by 1 to 1 (> snderrors)
nd6_na_input: duplicate IP6 address fe80:000b::0200:5eff:fe00:01ff


So in conclusion it seems that pf actually is blocking carp. But I
cannot for the life of me see why. Could it be that carp is trying to
send IPv6? The carp interfaces do have IPv6 addresses (automatically
assigned) that seem to correspond to the above "duplicate IP6 address"
messages:
ifconfig carp | grep inet6
        inet6 fe80::200:5eff:fe00:1c8%carp0 prefixlen 64 scopeid 0x8
        inet6 fe80::200:5eff:fe00:1c9%carp1 prefixlen 64 scopeid 0x9
        inet6 fe80::200:5eff:fe00:1d2%carp2 prefixlen 64 scopeid 0xa
        inet6 fe80::200:5eff:fe00:1ff%carp3 prefixlen 64 scopeid 0xb
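
These addresses have the shape of modified EUI-64 link-local addresses built
from the CARP virtual MAC 00:00:5e:00:01:<vhid>. A minimal sketch of that
derivation follows; the function names are illustrative, and the vhids (200,
201, 210, 255) are only inferred from the last byte of each address, they are
not taken from the actual configuration.

```python
import ipaddress


def carp_virtual_mac(vhid):
    """Virtual MAC used by CARP for IPv4: 00:00:5e:00:01:<vhid>."""
    if not 1 <= vhid <= 255:
        raise ValueError("vhid must be in 1..255")
    return bytes([0x00, 0x00, 0x5E, 0x00, 0x01, vhid])


def link_local_from_mac(mac):
    """Build the modified EUI-64 link-local address for a 6-byte MAC."""
    # Flip the universal/local bit of the first byte and insert ff:fe
    # between the two halves of the MAC.
    eui64 = bytes([mac[0] ^ 0x02]) + mac[1:3] + b"\xff\xfe" + mac[3:6]
    # fe80::/64 prefix (8 bytes) followed by the 8-byte interface identifier.
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + eui64)


def vhid_from_link_local(addr):
    """Read the vhid back from the last byte (assumes the layout above)."""
    return ipaddress.IPv6Address(addr).packed[-1]


if __name__ == "__main__":
    for ifname, vhid in [("carp0", 0xC8), ("carp1", 0xC9),
                         ("carp2", 0xD2), ("carp3", 0xFF)]:
        print(ifname, vhid, link_local_from_mac(carp_virtual_mac(vhid)))
```

If that is where the addresses come from, both firewalls would derive the
same link-local address for each carp interface, since they share the virtual
MAC. That might explain the "duplicate IP6 address" messages, but it is a
guess.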


Sorry for the lengthy post, I'm really out of ideas ...
Very grateful for any hint

best
/markus
