Re: carp failover problem

2015-01-31 Thread Leclerc, Sebastien
  Will try it during the weekend...
 

After reconnecting the firewalls differently, I got it fixed.
Logically, the connections are the same, but apparently the 5300xl had a hard 
time with its arp table...
Instead of connecting both firewalls directly on the routing switch, I made a 
trunk back to the 2524, and connected the firewalls there.
Within seconds after disconnecting a port or rebooting either firewall, carp 
now handles the failover smoothly!

Thanks!

Sebastien



Re: carp failover problem

2015-01-30 Thread Leclerc, Sebastien
 Rebooted fw2 at 3h02, fw1 kept master state, but had downtime until 3h12
 Rebooted fw1 at 3h15, got downtime until 4h10, fw1 got master state at 3h16, 
 fw2 got backup state at the same time
 

Inspecting further my logs, I see that smtp services were functioning between 
wan and dmz during the downtime period.  Our monitoring is done from the lan, 
so I suspect the 5300xl is causing the problem...
Any thoughts?

Thanks

Sebastien



Re: carp failover problem

2015-01-30 Thread Leclerc, Sebastien
Jan 30, 2015; 8:10am Stuart Henderson wrote :

/etc/hostname.carp0
advskew 0 carpdev em0 carppeer 192.168.3.10 pass secret1 state master
vhid 1 inet 192.0.2.2/28

Maybe unrelated, but it's not usual to set state master like this.

I know, it was not in the config at first, I added it to test.

Also inet should normally be at the start of a line in hostname.if.

Fails miserably if I do it :(
Only aliases get assigned to the interface, and a message indicates that the 
address cannot be assigned to the interface (I don't have the exact message, I 
rebooted after the failure, and it's not in the logs...)

My config was like this :

inet 192.0.2.2/28
advskew 0 carpdev em0 pass secret1 state master vhid 1
alias 192.0.2.3/32

I also tried with this, with the same result :

inet 192.0.2.2/28 advskew 0 carpdev em0 pass secret1 state master vhid 1
alias 192.0.2.3/32

Do things work if you use the default multicast, rather than carppeer?

As you can see above, I removed the carppeer from the config.
I had to add back the addresses manually to the carp interfaces, but then I got 
worse results : fw1 was master on all carp interfaces, but fw2 was backup on 
carp0 and carp2, and master on carp1
So I reverted to my previous configuration.
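
A mixed state like the one described here (backup on carp0 and carp2, master on carp1) can be spotted mechanically from ifconfig output. The following is a minimal sketch only: the sample text for fw2 is hypothetical, written to match the description above and the `carp: STATE carpdev ... vhid ...` line format shown in fw1's output, and the function names are made up for illustration.

```python
import re

# HYPOTHETICAL fw2 output, reduced to the lines that matter here.
FW2_SAMPLE = """\
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
carp: BACKUP carpdev em0 vhid 1 advbase 1 advskew 200
carp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
carp: MASTER carpdev em1 vhid 2 advbase 1 advskew 200
carp2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
carp: BACKUP carpdev em2 vhid 3 advbase 1 advskew 200
"""

IF_HEADER = re.compile(r"^\s*(\w+\d+): flags=")
CARP_LINE = re.compile(r"carp: (\w+) carpdev (\w+) vhid (\d+)")


def carp_states(ifconfig_text):
    """Map each carpN interface to the state printed on its 'carp:' line."""
    states = {}
    current = None
    for line in ifconfig_text.splitlines():
        header = IF_HEADER.match(line)
        if header:
            name = header.group(1)
            # Only remember the interface if it is a carp interface.
            current = name if name.startswith("carp") else None
            continue
        found = CARP_LINE.search(line)
        if found and current:
            states[current] = found.group(1)
    return states


def is_split(states):
    """True when the carp interfaces of one host do not all share a state."""
    return len(set(states.values())) > 1


if __name__ == "__main__":
    states = carp_states(FW2_SAMPLE)
    print(states, "split" if is_split(states) else "consistent")
```

Feeding it the real `ifconfig carp` output from each firewall would show at a glance whether a host is master on some interfaces and backup on others.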

I changed some pf rules yesterday (removed antispoof) and disabled sasyncd, and 
rebooted during the night.
At least in the morning, everything was ok, but inspecting our monitoring 
system, here is what I found :

Rebooted fw2 at 3h02, fw1 kept master state, but had downtime until 3h12
Rebooted fw1 at 3h15, got downtime until 4h10, fw1 got master state at 3h16, 
fw2 got backup state at the same time
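
For scale, the two outages above can be put next to the delay CARP itself would be expected to add. This is a rough sketch: the takeover formula (3 * advbase + advskew / 256 seconds) is my understanding of how a backup times out a silent master and is not confirmed anywhere in this thread, and advbase 1 for fw2 is assumed to be the default.

```python
def minutes(stamp):
    """Convert a '3h02'-style timestamp into minutes since midnight."""
    hours, mins = stamp.split("h")
    return int(hours) * 60 + int(mins)


def downtime(start, end):
    """Length of an outage in minutes, both ends given as '3h02'-style stamps."""
    return minutes(end) - minutes(start)


def master_down_timeout(advbase, advskew):
    """ASSUMPTION: seconds a backup waits without advertisements before
    promoting itself, taken as 3 * advbase + advskew / 256."""
    return 3 * advbase + advskew / 256


if __name__ == "__main__":
    print("fw2 reboot outage:", downtime("3h02", "3h12"), "min")
    print("fw1 reboot outage:", downtime("3h15", "4h10"), "min")
    # fw2 is configured with advskew 200; advbase 1 is assumed.
    print("expected takeover delay on fw2:", master_down_timeout(1, 200), "s")
```

If the assumption holds, CARP accounts for seconds of delay, so outages of many minutes point at something outside the CARP timers.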

Thanks for your help


This mail was missing a few things. dmesg and ifconfig -A output would
be useful for starters (then we don't have to wonder how netstart parsed
your files).

Fw1 :

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33144
priority: 0
groups: lo
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x6
inet 127.0.0.1 netmask 0xff000000
em0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:25:90:f2:6e:9a
priority: 0
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet 192.168.3.9 netmask 0xfffffffc broadcast 192.168.3.11
inet6 fe80::225:90ff:fef2:6e9a%em0 prefixlen 64 scopeid 0x1
em1: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:25:90:f2:6e:9b
priority: 0
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet 192.168.3.1 netmask 0xfffffff8 broadcast 192.168.3.7
inet6 fe80::225:90ff:fef2:6e9b%em1 prefixlen 64 scopeid 0x2
em2: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:25:90:f2:6e:9c
priority: 0
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet 192.168.3.13 netmask 0xfffffffc broadcast 192.168.3.15
inet6 fe80::225:90ff:fef2:6e9c%em2 prefixlen 64 scopeid 0x3
em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:25:90:f2:6e:9d
priority: 0
media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
status: active
inet 192.168.3.17 netmask 0xfffffffc broadcast 192.168.3.19
inet6 fe80::225:90ff:fef2:6e9d%em3 prefixlen 64 scopeid 0x4
enc0: flags=41<UP,RUNNING>
priority: 0
groups: enc
status: active
tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500
priority: 0
groups: tun
status: active
inet 10.233.0.1 --> 10.233.0.2 netmask 0xffffffff
pfsync0: flags=41<UP,RUNNING> mtu 1500
priority: 0
pfsync: syncdev: em3 syncpeer: 192.168.3.18 maxupd: 128 defer: off
groups: carp pfsync
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33144
priority: 0
groups: pflog
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:01
priority: 0
carp: MASTER carpdev em0 vhid 1 advbase 1 advskew 0 carppeer 192.168.3.10
groups: carp egress
status: master
inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x7
inet 192.0.2.2 netmask 0xfffffff0 broadcast 192.0.2.15
inet 192.0.2.3 netmask 0xffffffff
inet 192.0.2.4 netmask 0xffffffff
inet 192.0.2.5 netmask 0xffffffff
carp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:02
priority: 0
carp: MASTER carpdev em1 vhid 2 advbase 1 advskew 0 carppeer 192.168.3.4
groups: carp
status: master
inet6 fe80::200:5eff:fe00:102%carp1 prefixlen 64 scopeid 0x8
inet 192.168.3.6 netmask 0xffffffff
carp2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:03
priority: 0
carp: MASTER carpdev 

Re: carp failover problem

2015-01-30 Thread Christopher Barry
On Fri, 30 Jan 2015 17:18:07 -0500
Leclerc, Sebastien sebastien.lecl...@saint-georges.ca wrote:

 Rebooted fw2 at 3h02, fw1 kept master state, but had downtime until
 3h12 Rebooted fw1 at 3h15, got downtime until 4h10, fw1 got master
 state at 3h16, fw2 got backup state at the same time
 

Inspecting further my logs, I see that smtp services were functioning
between wan and dmz during the downtime period.  Our monitoring is
done from the lan, so I suspect the 5300xl is causing the problem...
Any thoughts?

Thanks

Sebastien


the issue I had with Procurve switches was related to its STP
implementation. strange things were happening while trying to PXE
boot a large number of Linux cluster nodes using gpxe. Swapping out the
switch with a different brand solved the problem, and I never revisited
it.

if you can do a quick test on a different switch, that would at least
rule that out as your issue. if not, try disabling STP and retest.

-C



Re: carp failover problem

2015-01-30 Thread Leclerc, Sebastien
if you can do a quick test on a different switch, that would at least
rule that out as your issue. if not, try disabling STP and retest


That was my guess, using a trunk to link the vlan to an edge switch not 
affected by stp, and connecting the firewalls there.
This way, the 5300xl won't have to detect which port is connected to the 
gateway (the 5300xl is a routing switch for the lan)
Will try it during the weekend...

Sebastien



Re: carp failover problem

2015-01-30 Thread Stuart Henderson
On 2015-01-27, Christopher Barry christopher.r.ba...@gmail.com wrote:
 On Tue, 27 Jan 2015 12:01:37 -0500
 Leclerc, Sebastien sebastien.lecl...@saint-georges.ca wrote:

/etc/hostname.carp0
advskew 0 carpdev em0 carppeer 192.168.3.10 pass secret1 state master
vhid 1 inet 192.0.2.2/28

Maybe unrelated, but it's not usual to set state master like this.
Also inet should normally be at the start of a line in hostname.if.

Do things work if you use the default multicast, rather than carppeer?

This mail was missing a few things. dmesg and ifconfig -A output would
be useful for starters (then we don't have to wonder how netstart parsed
your files).

 Well, it's been many years since I ran carp, so I cannot actually help
 with the carp config, but I can absolutely say that I have experienced a
 lot of unexplainable weirdness with ProCurve switches, so I can
 appreciate your suspicions there. I'll never buy another.

Procurve switches have been working nicely for me in various setups
involving carp etc. I've used various: 2626 2824 2510-24 4200vl 5300zl
2530-24g etc. Not saying it's impossible but other areas seem more likely.



carp failover problem

2015-01-27 Thread Leclerc, Sebastien
Hi,

I have two firewalls in a carp failover setup, but the failover does not work 
as expected...
The problem happens when I reboot the backup firewall (while in backup state).
Just after the reboot, I have these entries in dmesg :

carp0: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
carp1: state transition: MASTER -> BACKUP

Why would there be no mention of carp2?
And no corresponding entries on the master?
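
To make the carp2 question concrete, the dmesg lines can be tallied per interface. This is only a sketch: the sample is the excerpt from this message, and the list of expected interfaces (carp0 to carp2) comes from the configuration further down; the function names are illustrative.

```python
import re
from collections import defaultdict

# Excerpt from the backup firewall's dmesg, as quoted in this message.
DMESG = """\
carp0: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
carp1: state transition: MASTER -> BACKUP
"""

# Accept both "->" and a bare "-" in case the arrow was mangled in transit.
LINE = re.compile(r"^(carp\d+): state transition: (\w+) -+>? (\w+)$")


def transitions(text):
    """Return {interface: [(old_state, new_state), ...]} in log order."""
    seen = defaultdict(list)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            seen[match.group(1)].append((match.group(2), match.group(3)))
    return dict(seen)


def silent_interfaces(text, expected):
    """Interfaces from `expected` that logged no transition at all."""
    return sorted(set(expected) - set(transitions(text)))


if __name__ == "__main__":
    for ifname, moves in transitions(DMESG).items():
        path = [moves[0][0]] + [new for _, new in moves]
        print(ifname, " -> ".join(path))
    print("no transitions logged:",
          silent_interfaces(DMESG, ["carp0", "carp1", "carp2"]))
```

Run against the full dmesg of both firewalls, this would show which interfaces flapped and which never left their initial state.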

States are consistent (all backup on backup, and all master on master), but 
forwarded connections hang, until I force back the master with this :
 sudo ifconfig -g carp carpdemote 128
 sudo ifconfig -g carp -carpdemote 128
Between these two commands, on the backup firewall, I see traffic coming from 
WAN and DMZ, but almost nothing from LAN, so it may be related to the LAN 
switch. I cannot see what the problem is though...

Here is the setup :

On both firewalls :
 - em0 is connected to WAN
 - em1 is connected to LAN
 - em2 is connected to DMZ
 - em3 is interconnected with a crossover cable, used for pfsync and rdist

WAN and DMZ connections are on the same switch, but on different untagged VLANs 
(Procurve 2524)
LAN is on a separate layer 3 switch (Procurve 5300xl)

Another strange behavior :
With tcpdump, on the backup, I can see this traffic :
 - on em1 and em2, I see only carp advertisements to the configured unicast IP 
address and physical MAC address
 - on em3, I see only pfsync packets
 - but on em0, I see carp advertisements, but also a lot of traffic from the 
ISP router's MAC, to the virtual MAC (00:00:5e:00:01:01)
Which situation is normal? (em0 with lots of packets, or em1/em2 with only carp 
advertisements)
The only difference I see :
 - on em0, both firewalls and the ISP router are connected to the switch
 - on em1, both firewalls are connected to the L3 switch, which is also the 
router
 - on em2, there is no router, the firewalls communicate directly with hosts 
connected on the switch
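
The virtual MAC mentioned above is derived from the vhid: the last byte is the vhid and the prefix is fixed at 00:00:5e:00:01. A small sketch of that mapping follows; the helper name is made up, and it only covers the IPv4 case seen in this thread.

```python
def carp_virtual_mac(vhid):
    """Virtual MAC for a CARP group: 00:00:5e:00:01:<vhid as two hex digits>."""
    if not 1 <= vhid <= 255:
        raise ValueError("vhid must be in the range 1..255")
    return "00:00:5e:00:01:%02x" % vhid


if __name__ == "__main__":
    # vhid 1, 2 and 3 are the groups used on carp0, carp1 and carp2.
    for vhid in (1, 2, 3):
        print("vhid", vhid, "->", carp_virtual_mac(vhid))
```

This makes it easy to filter a tcpdump capture for frames addressed to a given carp group rather than to a physical NIC.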


Common configuration (public addresses anonymized, but the network sizes are 
correct) :

/etc/mygate
192.0.2.1

/etc/sysctl.conf
net.inet.carp.preempt=1
net.inet.ip.forwarding=1

/etc/pf.conf (excerpt only)
ext_if  = em0
ext_if_carp = carp0
int_if  = em1
int_if_carp = carp1
dmz_if  = em2
dmz_if_carp = carp2
sync_if = em3
set skip on lo
set skip on $sync_if
pass quick on { $int_if, $ext_if, $dmz_if } inet proto carp keep state (no-sync)


Firewall A (expected to be always master) :
OpenBSD 5.5 (GENERIC.MP) #315: Wed Mar  5 09:37:46 MST 2014
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

/etc/hostname.em0
inet 192.168.3.9/30

/etc/hostname.em1
inet 192.168.3.1/29
!route add 192.168.0.0/16 192.168.3.5
!route add 172.16.0.0/12 192.168.3.5

/etc/hostname.em2
inet 192.168.3.13/30

/etc/hostname.em3
inet 192.168.3.17 255.255.255.252

/etc/hostname.carp0
advskew 0 carpdev em0 carppeer 192.168.3.10 pass secret1 state master vhid 1
inet 192.0.2.2/28
alias 192.0.2.3/32
alias 192.0.2.4/32
alias 192.0.2.5/32

/etc/hostname.carp1
advskew 0 carpdev em1 carppeer 192.168.3.4 pass secret2 state master vhid 2
inet 192.168.3.6/32

/etc/hostname.carp2
advskew 0 carpdev em2 carppeer 192.168.3.14 pass secret3 state master vhid 3
inet 192.0.2.17/28
alias 192.0.2.29/32

/etc/hostname.pfsync0
up
syncdev em3
syncpeer 192.168.3.18


Firewall B (expected to be always backup) :
OpenBSD 5.6 (GENERIC.MP) #5: Thu Dec 11 09:51:08 CET 2014
r...@stable-56-amd64.mtier.org:/binpatchng/work-binpatch56-amd64/src/sys/arch/amd64/compile/GENERIC.MP

/etc/hostname.em0
inet 192.168.3.10/30

/etc/hostname.em1
inet 192.168.3.4/29
!route add 192.168.0.0/16 192.168.3.5
!route add 172.16.0.0/12 192.168.3.5

/etc/hostname.em2
inet 192.168.3.14/30

/etc/hostname.em3
inet 192.168.3.18/30

/etc/hostname.carp0
advskew 200 carpdev em0 carppeer 192.168.3.9 pass secret1 state backup vhid 1
inet 192.0.2.2/28
alias 192.0.2.3/32
alias 192.0.2.4/32
alias 192.0.2.5/32

/etc/hostname.carp1
advskew 200 carpdev em1 carppeer 192.168.3.1 pass secret2 state backup vhid 2
inet 192.168.3.6/32

/etc/hostname.carp2
advskew 200 carpdev em2 carppeer 192.168.3.13 pass secret3 state backup vhid 3
inet 192.0.2.17/28
alias 192.0.2.29/32

/etc/hostname.pfsync0
up
syncdev em3
syncpeer 192.168.3.17
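
Since unicast carppeer only works if each peer address is the other firewall's address on the same link, the two configurations above can be cross-checked. This is a sketch using Python's standard ipaddress module; the addresses are copied from the hostname.emN and hostname.carpN files above, and the helper names are illustrative.

```python
import ipaddress

# interface -> (local address with prefix, configured carppeer)
FW_A = {
    "em0": ("192.168.3.9/30", "192.168.3.10"),
    "em1": ("192.168.3.1/29", "192.168.3.4"),
    "em2": ("192.168.3.13/30", "192.168.3.14"),
}
FW_B = {
    "em0": ("192.168.3.10/30", "192.168.3.9"),
    "em1": ("192.168.3.4/29", "192.168.3.1"),
    "em2": ("192.168.3.14/30", "192.168.3.13"),
}


def peer_on_link(local_cidr, peer):
    """True if the carppeer address lies inside the local interface's subnet."""
    network = ipaddress.ip_interface(local_cidr).network
    return ipaddress.ip_address(peer) in network


def peers_match(a, b):
    """True if each side's carppeer is the other side's interface address."""
    return all(
        a[ifname][1] == b[ifname][0].split("/")[0]
        and b[ifname][1] == a[ifname][0].split("/")[0]
        for ifname in a
    )


if __name__ == "__main__":
    for name, fw in (("A", FW_A), ("B", FW_B)):
        for ifname, (local, peer) in fw.items():
            print(name, ifname, peer, "on link:", peer_on_link(local, peer))
    print("peers are mutual:", peers_match(FW_A, FW_B))
```

A failed check here would explain missing advertisements without involving the switches at all.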


This message is already long, but if any other information would be helpful, I 
would be glad to provide it.
Any help or suggestion is appreciated.
Thank you!

Sebastien



Re: carp failover problem

2015-01-27 Thread Christopher Barry
On Tue, 27 Jan 2015 12:01:37 -0500
Leclerc, Sebastien sebastien.lecl...@saint-georges.ca wrote:

Hi,

I have two firewalls in a carp failover setup, but the failover does
not work as expected... The problem happens when I reboot the backup
firewall (while in backup state). Just after the reboot, I have these
entries in dmesg :

carp0: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
carp1: state transition: MASTER -> BACKUP

Why would there be no mention of carp2?
And no corresponding entries on the master?

States are consistent (all backup on backup, and all master on
master), but forwarded connections hang, until I force back the master
with this :
 sudo ifconfig -g carp carpdemote 128
 sudo ifconfig -g carp -carpdemote 128
Between these two commands, on the backup firewall, I see traffic
coming from WAN and DMZ, but almost nothing from LAN, so it may be
related to the LAN switch. I cannot see what the problem is though...

Here is the setup :

On both firewalls :
 - em0 is connected to WAN
 - em1 is connected to LAN
 - em2 is connected to DMZ
 - em3 is interconnected with a crossover cable, used for pfsync and
 rdist

WAN and DMZ connections are on the same switch, but on different
untagged VLANs (Procurve 2524) LAN is on a separate layer 3 switch
(Procurve 5300xl)

Another strange behavior :
With tcpdump, on the backup, I can see this traffic :
 - on em1 and em2, I see only carp advertisements to the configured
 unicast IP address and physical MAC address
 - on em3, I see only pfsync packets
 - but on em0, I see carp advertisements, but also a lot of traffic
 from the ISP router's MAC, to the virtual MAC (00:00:5e:00:01:01)
Which situation is normal? (em0 with lots of packets, or em1/em2 with
only carp advertisements) The only difference I see :
 - on em0, both firewalls and the ISP router are connected to the
 switch
 - on em1, both firewalls are connected to the L3 switch, which is
 also the router
 - on em2, there is no router, the firewalls communicate directly with
 hosts connected on the switch


Common configuration (public addresses anonymized, but the network
sizes are correct) :

/etc/mygate
192.0.2.1

/etc/sysctl.conf
net.inet.carp.preempt=1
net.inet.ip.forwarding=1

/etc/pf.conf (excerpt only)
ext_if  = em0
ext_if_carp = carp0
int_if  = em1
int_if_carp = carp1
dmz_if  = em2
dmz_if_carp = carp2
sync_if = em3
set skip on lo
set skip on $sync_if
pass quick on { $int_if, $ext_if, $dmz_if } inet proto carp keep state
(no-sync)


Firewall A (expected to be always master) :
OpenBSD 5.5 (GENERIC.MP) #315: Wed Mar  5 09:37:46 MST 2014
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

/etc/hostname.em0
inet 192.168.3.9/30

/etc/hostname.em1
inet 192.168.3.1/29
!route add 192.168.0.0/16 192.168.3.5
!route add 172.16.0.0/12 192.168.3.5

/etc/hostname.em2
inet 192.168.3.13/30

/etc/hostname.em3
inet 192.168.3.17 255.255.255.252

/etc/hostname.carp0
advskew 0 carpdev em0 carppeer 192.168.3.10 pass secret1 state master
vhid 1 inet 192.0.2.2/28
alias 192.0.2.3/32
alias 192.0.2.4/32
alias 192.0.2.5/32

/etc/hostname.carp1
advskew 0 carpdev em1 carppeer 192.168.3.4 pass secret2 state master
vhid 2 inet 192.168.3.6/32

/etc/hostname.carp2
advskew 0 carpdev em2 carppeer 192.168.3.14 pass secret3 state master
vhid 3 inet 192.0.2.17/28
alias 192.0.2.29/32

/etc/hostname.pfsync0
up
syncdev em3
syncpeer 192.168.3.18


Firewall B (expected to be always backup) :
OpenBSD 5.6 (GENERIC.MP) #5: Thu Dec 11 09:51:08 CET 2014

 r...@stable-56-amd64.mtier.org:/binpatchng/work-binpatch56-amd64/src/sys/arch/amd64/compile/GENERIC.MP

/etc/hostname.em0
inet 192.168.3.10/30

/etc/hostname.em1
inet 192.168.3.4/29
!route add 192.168.0.0/16 192.168.3.5
!route add 172.16.0.0/12 192.168.3.5

/etc/hostname.em2
inet 192.168.3.14/30

/etc/hostname.em3
inet 192.168.3.18/30

/etc/hostname.carp0
advskew 200 carpdev em0 carppeer 192.168.3.9 pass secret1 state backup
vhid 1 inet 192.0.2.2/28
alias 192.0.2.3/32
alias 192.0.2.4/32
alias 192.0.2.5/32

/etc/hostname.carp1
advskew 200 carpdev em1 carppeer 192.168.3.1 pass secret2 state backup
vhid 2 inet 192.168.3.6/32

/etc/hostname.carp2
advskew 200 carpdev em2 carppeer 192.168.3.13 pass secret3 state
backup vhid 3 inet 192.0.2.17/28
alias 192.0.2.29/32

/etc/hostname.pfsync0
up
syncdev em3
syncpeer 192.168.3.17


This message is already long, but if any other information would be
helpful, I would be glad to provide it. Any help or suggestion is
appreciated. Thank you!

Sebastien


Sebastien,

Well, it's been many years since I ran carp, so I cannot actually help
with the carp config, but I can absolutely say that I have experienced a
lot of unexplainable weirdness with ProCurve switches, so I can
appreciate your suspicions there. I'll never buy another.