Bryan Irvine wrote:
I do believe preempt should be 1 on both servers. Let the advskew
handle which one is primary.

What do you see for output of 'netstat -s -p carp' and 'netstat -s -p pfsync'?

-B

I had already tried it with both servers set to preempt=1, with the same results, but to double-check I did it again. The results are identical to everything I posted previously, except (on the secondary server):

$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=2

Per your request:

(on the primary:)
$  netstat -s -p carp
carp:
       226 packets received (IPv4)
       0 packets received (IPv6)
               0 packets discarded for bad interface
               0 packets discarded for wrong TTL
               0 packets shorter than header
               0 discarded for bad checksums
               0 discarded packets with a bad version
               0 discarded because packet too short
               0 discarded for bad authentication
               226 discarded for unknown vhid
               0 discarded because of a bad address list
       387 packets sent (IPv4)
       0 packets sent (IPv6)
               0 send failed due to mbuf memory error
       1 transition to master

(on the secondary:)
$  netstat -s -p carp
carp:
   335 packets received (IPv4)
   0 packets received (IPv6)
       0 packets discarded for bad interface
       0 packets discarded for wrong TTL
       0 packets shorter than header
       0 discarded for bad checksums
       0 discarded packets with a bad version
       0 discarded because packet too short
       0 discarded for bad authentication
       335 discarded for unknown vhid
       0 discarded because of a bad address list
   236 packets sent (IPv4)
   0 packets sent (IPv6)
       0 send failed due to mbuf memory error
   1 transition to master

This was done after a clean reboot of both servers, followed by accessing the site from an external shell account I have (using lynx). The secondary still responds first, and when it is taken offline (halt -p), the primary does not take over (no answer). The primary only takes over normal duties once the hostname.carp0 file has been renamed on the secondary, the secondary has actually been rebooted, and sh /etc/netstart has been run on the primary. After the secondary was taken offline and sh /etc/netstart was run on the primary, I accessed the site again (the primary was then the only carp node) and ran this from the primary:

$ netstat -s -p carp
carp:
       372 packets received (IPv4)
       0 packets received (IPv6)
               0 packets discarded for bad interface
               0 packets discarded for wrong TTL
               0 packets shorter than header
               0 discarded for bad checksums
               0 discarded packets with a bad version
               0 discarded because packet too short
               0 discarded for bad authentication
               372 discarded for unknown vhid
               0 discarded because of a bad address list
       704 packets sent (IPv4)
       0 packets sent (IPv6)
               0 send failed due to mbuf memory error
       1 transition to master
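
In all three outputs above, the "discarded for unknown vhid" count equals the IPv4 "packets received" count (226/226, 335/335, 372/372), so every CARP packet that arrives is being dropped. A minimal sketch for putting those two counters side by side follows. The function name carp_vhid_summary is made up for illustration, and it only parses text in the format pasted above rather than querying carp itself:

```shell
# Hypothetical helper: summarise two CARP counters from
# `netstat -s -p carp` output supplied on stdin.
# Prints "<IPv4 packets received> <discarded for unknown vhid>".
carp_vhid_summary() {
    awk '
        /packets received \(IPv4\)/  { recv = $1 }   # first field is the count
        /discarded for unknown vhid/ { unk = $1 }
        END { printf "%d %d\n", recv, unk }          # missing counters print as 0
    '
}
```

On a live host this would presumably be run as `netstat -s -p carp | carp_vhid_summary`. If the two numbers are equal on both nodes, it may be worth comparing the vhid set in hostname.carp0 on each server.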

As for output regarding pfsync, all values are zero because I do not use pfsync. It is a single firewall with two web servers internally, not a redundant firewall situation. No changes have been made to the firewall at all.

I'm at my wits' end as to why this doesn't work. It *must* be something wrong with my config, as I just don't believe it's a "bug" in carp. This config is practically straight out of the FAQ, so I'm at a total loss. :(

FWIW, the pf.conf on the firewall uses these values (which normally work fine):
(...)
gw_ext=$ext_ip4 <-- my external IP addy for that web site, I have 5 IPs
gw_int="192.168.0.9" <-- the carp node, or when not using carp, the primary web server
#gw_int="192.168.0.19" <-- for when I manually switch to the secondary server
gw_ports="{ 80, 443 }"
int0_if="xl0"
tcp_flags="flags S/SA modulate state"
(...)
not_private="{ \
   !0.0.0.0/8, \
   !10.0.0.0/8, \
   !127.0.0.0/8, \
   !169.254.0.0/16, \
   !172.16.0.0/12, \
   !192.8.2.0/24, \
   !192.168.0.0/16, \
   !240.0.0.0/4, \
   !255.255.255.255/32 \
}"
(...)
rdr on $ext_if proto tcp from $not_private to $gw_ext port \
       $gw_ports -> $gw_int
(...)
pass in log quick on $ext_if inet proto tcp from $not_private to $gw_int \
   port $gw_ports flags S/SA synproxy state
(...)
pass out quick on $int0_if proto tcp from $not_private to $gw_int \
   port $gw_ports $tcp_flags

The firewall config has worked fine and hasn't been changed in ages, but I can't help but wonder if something there is screwing up carp. Redoing and simplifying the fw rules (using tags) is next on my todo list, but I figured I'd get carp working first before changing a "known good" fw config and adding another change to the mix.

--

-RSM

http://www.erratic.ca
