Carp issues

2013-02-28 Thread Carlos Flor
I have two firewalls running OpenBSD 5.1 with a 5.2 kernel (amd64).  I am
running the 5.2 kernel because of another, unrelated bug.  I have 4
ethernet interfaces (em0-3).  em0 and em1 are in a failover trunk on
trunk0, while em2 and em3 are members of trunk1, also in failover mode.  On
trunk0 I have 3 VLANs (2, 3, 4) and on trunk1 I have 2 VLANs (10, 11).  I am
running carp on each of these vlan interfaces.  I am also running pfsync,
and I have an ipsec vpn configured which uses sasyncd between the two
firewalls.

We had fw1 kernel panic and die yesterday.  Everything seemed to switch
over as expected to fw2.  When we restarted fw1, all carp interfaces
switched back to master on fw1 and *most* switched to backup on fw2.
However, carp2 (carp for vlan2) stayed master on fw2.  This was obviously
an issue because it was also master on fw1, and it caused lots of dropped
packets since two machines were claiming the same IP address.  I took the
carp2 interface down (ifconfig carp2 down) and traffic passed as it should
again.  However, as soon as I brought it back up (ifconfig carp2 up), carp2
on fw2 went to master again, and carp2 on fw1 stayed master as well.  I
have all carp interfaces on fw2 configured with an advskew of 128 and I
have preempt enabled.

I had to reboot fw2 for things to go back to normal with all interfaces on
fw2 in backup mode while all on fw1 were in master mode.  Below are my
hostname.* config files as well as the carp sysctl values.

Please let me know if anyone needs more information or if you have any
suggestions on how to avoid this in the future.


=== FW1 ===
** hostname.em0 **
up
** hostname.em1 **
up
** hostname.em2 **
up
** hostname.em3 **
up
** hostname.trunk0 **
up
trunkproto failover trunkport em0 trunkport em1
** hostname.trunk1 **
up
trunkproto failover trunkport em2 trunkport em3
** hostname.vlan10 **
up
inet x.x.x.27 255.255.255.248 NONE vlan 10 vlandev trunk1
** hostname.vlan11 **
up
inet x.x.x.131 255.255.255.248 NONE vlan 11 vlandev trunk1
** hostname.vlan2 **
up
inet 172.16.20.2 255.255.255.0 NONE vlan 2 vlandev trunk0
** hostname.vlan3 **
up
inet x.x.x.210 255.255.255.240 NONE vlan 3 vlandev trunk0
** hostname.vlan4 **
up
inet x.x.x.98 255.255.255.224 NONE vlan 4 vlandev trunk0
** hostname.carp10 **
up
inet x.x.x.26 255.255.255.248 x.x.x.31 vhid 10 pass xxx carpdev vlan10
** hostname.carp11 **
up
inet x.x.x.130 255.255.255.248 x.x.x.135 vhid 11 pass xx carpdev vlan11
** hostname.carp2 **
up
inet 172.16.20.1 255.255.255.0 172.16.20.255 vhid 2 pass x carpdev vlan2
** hostname.carp3 **
up
inet x.x.x.209 255.255.255.240 x.x.x.223 vhid 3 pass x carpdev vlan3
** hostname.carp4 **
up
inet x.x.x.97 255.255.255.224 x.x.x.127 vhid 4 pass x carpdev vlan4
** hostname.pfsync0 **
up syncdev vlan2 syncpeer 172.16.20.3


=== FW2 ===
** hostname.em0 **
up
** hostname.em1 **
up
** hostname.em2 **
up
** hostname.em3 **
up
** hostname.trunk0 **
up
trunkproto failover trunkport em0 trunkport em1
** hostname.trunk1 **
up
trunkproto failover trunkport em2 trunkport em3
** hostname.vlan10 **
up
inet x.x.x.28 255.255.255.248 NONE vlan 10 vlandev trunk1
** hostname.vlan11 **
up
inet x.x.x.132 255.255.255.248 NONE vlan 11 vlandev trunk1
** hostname.vlan2 **
up
inet 172.16.20.3 255.255.255.0 NONE vlan 2 vlandev trunk0
** hostname.vlan3 **
up
inet x.x.x.213 255.255.255.240 NONE vlan 3 vlandev trunk0
** hostname.vlan4 **
up
inet x.x.x.99 255.255.255.224 NONE vlan 4 vlandev trunk0
** hostname.carp10 **
up
inet x.x.x.26 255.255.255.248 x.x.x.31 vhid 10 pass  carpdev vlan10
advskew 128
** hostname.carp11 **
up
inet x.x.x.130 255.255.255.248 x.x.x.135 vhid 11 pass  carpdev vlan11
advskew 128
** hostname.carp2 **
up
inet 172.16.20.1 255.255.255.0 172.16.20.255 vhid 2 pass  carpdev vlan2
advskew 128
** hostname.carp3 **
up
inet x.x.x.209 255.255.255.240 x.x.x.223 vhid 3 carpdev vlan3 pass 
advskew 128
** hostname.carp4 **
up
inet x.x.x.97 255.255.255.224 x.x.x.127 vhid 4 pass  carpdev vlan4
advskew 128
** hostname.pfsync0 **
up syncdev vlan2 syncpeer 172.16.20.2



Re: carp issues

2011-08-10 Thread Michael Lechtermann

Hi,

just wanted to let you know that the problematic IP is working 
now and no problems have been seen in the last 16-18 hours.


Problem vanished while trying to figure out the root cause.


-Michael


On Tue, 09 Aug 2011 17:39:54 +0200, Michael Lechtermann wrote:

Hi,


 # ifconfig carp0
 carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
 lladdr 00:00:5e:00:01:0a
 priority: 0
 carp: carpdev em0 advbase 1 balancing ip-stealth carppeer 10.0.1.11
 state MASTER vhid 10 advskew 0
 state BACKUP vhid 11 advskew 100


Hmmm, why do you have different vhid ?


That is the way it is suggested by the manpage:

   LOAD BALANCING
 In order to set up a load balanced virtual host, it is necessary to
 configure one carpnodes entry for each physical host.  In the following
 example, two physical hosts are configured to provide balancing and
 failover for the IP address 192.168.1.10.

 First the carp interface on Host A is configured.  The advskew of 100 on
 the second carpnode entry means that its advertisements will be sent out
 slightly less frequently and will therefore become the designated backup.

   # ifconfig carp0 192.168.1.10 carpnodes 1:0,2:100 balancing ip

 The configuration for host B is identical, except the skew is on the
 carpnode entry with virtual host 1 rather than virtual host 2.

   # ifconfig carp0 192.168.1.10 carpnodes 1:100,2:0 balancing ip

 If ARP balancing or a different mode of IP balancing is desired the
 balancing mode can be adjusted accordingly.


-Michael




Re: carp issues

2011-08-09 Thread Michael Lechtermann

Hi,


 # ifconfig carp0
 carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
 lladdr 00:00:5e:00:01:0a
 priority: 0
 carp: carpdev em0 advbase 1 balancing ip-stealth carppeer 10.0.1.11
 state MASTER vhid 10 advskew 0
 state BACKUP vhid 11 advskew 100


Hmmm, why do you have different vhid ?


That is the way it is suggested by the manpage:

   LOAD BALANCING
 In order to set up a load balanced virtual host, it is necessary to
 configure one carpnodes entry for each physical host.  In the following
 example, two physical hosts are configured to provide balancing and
 failover for the IP address 192.168.1.10.

 First the carp interface on Host A is configured.  The advskew of 100 on
 the second carpnode entry means that its advertisements will be sent out
 slightly less frequently and will therefore become the designated backup.

   # ifconfig carp0 192.168.1.10 carpnodes 1:0,2:100 balancing ip

 The configuration for host B is identical, except the skew is on the
 carpnode entry with virtual host 1 rather than virtual host 2.

   # ifconfig carp0 192.168.1.10 carpnodes 1:100,2:0 balancing ip

 If ARP balancing or a different mode of IP balancing is desired the
 balancing mode can be adjusted accordingly.
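
The pattern in the man page excerpt (each host gives its own vhid a skew of 0 and the other vhid a skew of 100) can be written down as a tiny shell helper. This is only a sketch: `carpnodes_for` is a made-up name, not an ifconfig feature, and using one flat skew for every other vhid is an assumption that is only known to match the two-host example.

```sh
#!/bin/sh
# carpnodes_for OWN_VHID SKEW VHID...
# Hypothetical helper: build a carpnodes argument in which this host's
# own vhid gets advskew 0 and every other vhid gets SKEW.
carpnodes_for() {
    own=$1; skew=$2; shift 2
    out=
    for vhid in "$@"; do
        if [ "$vhid" = "$own" ]; then s=0; else s=$skew; fi
        out="${out:+$out,}$vhid:$s"
    done
    echo "$out"
}

# Print (not run) the two commands from the man page example.
echo "host A: ifconfig carp0 192.168.1.10 carpnodes $(carpnodes_for 1 100 1 2) balancing ip"
echo "host B: ifconfig carp0 192.168.1.10 carpnodes $(carpnodes_for 2 100 1 2) balancing ip"
```

The two echo lines only print the commands, so this can be tried on any box without touching the interfaces.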


-Michael



Re: carp issues

2011-08-09 Thread Patrick Lamaiziere
On Tue, 09 Aug 2011 15:29:17 +0200,
Michael Lechtermann wrote:

>  Hi all,

hello,

>  # ifconfig carp0
>  carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>  lladdr 00:00:5e:00:01:0a
>  priority: 0
>  carp: carpdev em0 advbase 1 balancing ip-stealth carppeer 10.0.1.11
>  state MASTER vhid 10 advskew 0
>  state BACKUP vhid 11 advskew 100

Hmmm, why do you have different vhid ?



carp issues

2011-08-09 Thread Michael Lechtermann

Hi all,

we are having some issues with CARP. One IP of three configured is 
causing trouble. The systems are running OpenBSD 4.9-release.


Description:
IP 10.0.1.9 and 10.0.1.13 are working just fine, however, sometimes it 
isn't possible to connect using IP 10.0.1.12.


Destroying the interface and bringing it back up doesn't help; waiting 
does. At some point it just starts to work again.


Since the switch is a bit problematic, we tried both ip-unicast and 
ip-stealth which didn't help. Destroying carp0 on one of the hosts 
doesn't help either.


Again, IP 10.0.1.9 and 10.0.1.13 are working just fine, no problems 
there. The routing entry also looks the same for .12 and .13.


For configuration of node 1, see end of mail.

I'd be really happy if someone had an idea what is going on.

Thanks in advance!

Best regards,
Michael




# cat /etc/hostname.em0
inet 10.0.1.10 255.255.254.0

# cat /etc/hostname.carp0
inet 10.0.1.9 255.255.254.0
carpnodes 10:0,11:100 carpdev em0
carppeer 10.0.1.11 balancing ip-stealth
pass secret
inet alias 10.0.1.12 255.255.254.0
inet alias 10.0.1.13 255.255.254.0

# ifconfig em0
em0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
lladdr 96:f7:71:ac:8f:c1
priority: 0
groups: egress
media: Ethernet autoselect (1000baseT full-duplex)
status: active
inet 10.0.1.10 netmask 0xfffffe00 broadcast 10.0.1.255
inet6 fe80::94f7:71ff:feac:8fc1%em0 prefixlen 64 scopeid 0x1

# ifconfig carp0
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:0a
priority: 0
carp: carpdev em0 advbase 1 balancing ip-stealth carppeer 10.0.1.11
state MASTER vhid 10 advskew 0
state BACKUP vhid 11 advskew 100
groups: carp
status: master
inet 10.0.1.9 netmask 0xfffffe00 broadcast 10.0.1.255
inet6 fe80::200:5eff:fe00:10a%carp0 prefixlen 64 scopeid 0x8
inet 10.0.1.12 netmask 0xfffffe00 broadcast 10.0.1.255
inet 10.0.1.13 netmask 0xfffffe00 broadcast 10.0.1.255
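
When comparing nodes, the per-vhid state lines in output like the above can be pulled out with a short filter. A minimal sketch, assuming the balancing-mode "state X vhid N advskew M" layout shown here; `carp_states` and its output format are made up for illustration.

```sh
#!/bin/sh
# carp_states: read "ifconfig carpN" output on stdin and print one
# "vhid state advskew" line per carpnode. It only understands the
# "state X vhid N advskew M" lines of a balancing setup.
carp_states() {
    awk '$1 == "state" && $3 == "vhid" && $5 == "advskew" { print $4, $2, $6 }'
}

# Example with the two state lines from the output above.
carp_states <<'EOF'
state MASTER vhid 10 advskew 0
state BACKUP vhid 11 advskew 100
EOF
```

On a live system this would presumably be fed with `ifconfig carp0 | carp_states` on each node, so the two results can be compared side by side.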

# route -n show -inet | grep ^10.0.1.1[23]
10.0.1.12          127.0.0.1          UGHS       0        0 33200     8 lo0
10.0.1.13          127.0.0.1          UGHS       0        0 33200     8 lo0


# cat /var/run/dmesg.boot
OpenBSD 4.9 (GENERIC.MP) #794: Wed Mar  2 07:19:02 MST 2011
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Xeon(R) CPU E5504 @ 2.00GHz ("GenuineIntel" 686-class) 2 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,SSSE3,CX16,SSE4.1,SSE4.2

real mem  = 2134429696 (2035MB)
avail mem = 2089349120 (1992MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 06/23/99, BIOS32 rev. 0 @ 0xfa900, SMBIOS rev. 2.4 @ 0xe901f (14 entries)
bios0: vendor Xen version "3.3.1" date 10/13/2009
bios0: Xen HVM domU
acpi0 at bios0: rev 2, ACPI control unavailable
mpbios0 at bios0: Intel MP Specification 1.4
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: unknown i686 model 0x0, can't get bus clock (0x0)
cpu0: apic clock running at 100MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5504 @ 2.00GHz ("GenuineIntel" 686-class)
cpu1: FPU,APIC,SSE3,SSSE3,CX16,SSE4.1,SSE4.2
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU E5504 @ 2.00GHz ("GenuineIntel" 686-class)
cpu2: FPU,APIC,SSE3,SSSE3,CX16,SSE4.1,SSE4.2
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU E5504 @ 2.00GHz ("GenuineIntel" 686-class)
cpu3: FPU,APIC,SSE3,SSSE3,CX16,SSE4.1,SSE4.2
mpbios0: bus 0 is type ISA
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 48 pins
ioapic0: misconfigured as apic 0, remapped to apid 1
bios0: ROM list: 0xc/0x8c00
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 1: 
wd0: 16-sector PIO, LBA48, 16384MB, 33554432 sectors
wd0(pciide0:0:1): using PIO mode 0, DMA mode 2
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0:  ATAPI 5/cdrom removable
cd0(pciide0:1:1): using PIO mode 0
"Intel 82371SB USB" rev 0x01 at pci0 dev 1 function 2 not configured
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x01: SMBus disabled
vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
"XenSource Platform Device" rev 0x01 at pci0 dev 3 function 0 not configured
em0 at pci0 dev 4 function 0 "Intel PRO/1000MT (82540EM)" rev 0x03: apic 1 int 5 (irq 5), address 96:f7:71:ac:8f:c1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: probed fifo depth: 0 bytes
pckbc0 at isa0 

Re: CARP issues 4.3

2009-01-07 Thread numb3rs1x
> a shot in the dark: Are you sure that CARP traffic flows freely between
> the two firewalls, and that they both have the same password? That the
> IP setup is generally consistent?

All I can say about that is that when I set this up and tested it,
everything seemed to be working fine. I was able to tcpdump and see pfsync
traffic across the interfaces on both firewalls. I manually failed the
primary over to the secondary at that time and it worked. All of this seemed
to start happening when I added and then removed the alias from the WAN
interface. I've double and triple checked the config on that interface and I
can't see that anything is amiss.

> (Eg. I have trouble with what you call a "WAN" interface - those
> interfaces that I am aware of, should not be able to support CARP
> operation because they are point-to-point interfaces.)

There is a switch between the firewall and the ISP's router.

> I've seen this, too, and tracked it down to be either a
> misconfiguration (eg. a typo), or overlapping networks.

I use class C networks, and they don't overlap the way you described.

> Try "sh netstart " to see proper error messages.

I tried this and got denied permission. I don't see anything useful in the
man page on this. Is there something I'm missing?

Thanks a lot for taking the time to reply.


Jon



-- 
View this message in context: 
http://www.nabble.com/CARP-issues-4.3-tp21322265p21336067.html
Sent from the openbsd user - misc mailing list archive at Nabble.com.



Re: CARP issues 4.3

2009-01-07 Thread Toni Mueller
Hi,

On Tue, 06.01.2009 at 17:11:45 -0600, Jon Slusher wrote:
> and for some reason it tried to take over as the MASTER, while its CARP 

a shot in the dark: Are you sure that CARP traffic flows freely between
the two firewalls, and that they both have the same password? That the
IP setup is generally consistent?

(Eg. I have trouble with what you call a "WAN" interface - those
interfaces that I am aware of, should not be able to support CARP
operation because they are point-to-point interfaces.)

> LAN interface would also not go beyond the INIT state. I had to shut it 

I've seen this, too, and tracked it down to be either a
misconfiguration (eg. a typo), or overlapping networks.

Eg. I have something like this on a pair of firewalls:

interface1: 10.10.0.0/16
interface2: 10.10.10.0/24

Doing this manually works like a charm, but CARP can't handle it (at
least not in 4.4).

Try "sh netstart " to see proper error messages.


Kind regards,
--Toni++



CARP issues 4.3

2009-01-06 Thread Jon Slusher
Yesterday, while troubleshooting an rdr rule on the pair of OpenBSD 4.3 
firewalls we use here, I discovered there was a rule that required a 
particular IP to be listed as an alias on the WAN interface. I used 
ifconfig to add the alias to the interface and this brought our network 
down. I didn't realize that the IP I added as the alias was already 
being used as the IP of the physical WAN interface of the BACKUP 
firewall. 

Here is where things started to get wonky: I then removed the alias from 
the firewall. The box failed over to the secondary at this point, and 
when that happened, about 10% of our packets started dropping. I tried 
to bring the primary back as the main firewall, but it didn't seem to 
want to respond. I rebooted out of desperation, and when the main box 
came back, the CARP LAN interface remained in an INIT state, which meant 
the secondary, which drops 10% of its packets, was still acting as the 
gateway. I was able to get it to accept the CARP IP, and after taking 
down the secondary, things went back to stable. I booted the secondary, 
and for some reason it tried to take over as the MASTER, while its CARP 
LAN interface would also not go beyond the INIT state. I had to shut it 
down and give the main fw back its priority.

Anyway, the state of things now is that when I bring either machine up, 
the CARP LAN interface will not move from its INIT state. The secondary 
firewall dropping packets might be unrelated. I guess I'm looking for a 
direction toward which to start troubleshooting. I was going to try to 
upgrade to 4.4, but I wanted to get some advice first. I'll include a 
dmesg and the carp interface configs.

Main FW dmesg:

OpenBSD 4.3 (GENERIC) #1368: Wed Mar 12 11:05:31 MDT 2008
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 468250624 (446MB)
avail mem = 442597376 (422MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf (67 entries)
bios0: vendor Phoenix Technologies, LTD version "3.09" date 06/14/2006
bios0: Compaq Presario 061 EX310AA-ABA SR1910NX NA630
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP SSDT MCFG APIC
acpi0: wakeup devices HUB0(S5) XVRA(S5) XVRB(S5) XVRC(S5) USB0(S3) USB2(S3) AZAD(S5) MMAC(S5) MMCI(S5) UAR1(S5) PS2M(S4) PS2K(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 3 (HUB0)
acpicpu0 at acpi0: PSS
acpitz0 at acpi0: critical temperature 75 degC
acpibtn0 at acpi0: PWRB
cpu0 at mainbus0: (uniprocessor)
cpu0: AMD Sempron(tm) Processor 3200+, 1804.01 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 256KB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: AMD erratum 89 present, BIOS upgrade may be required
cpu0: Cool'n'Quiet K8 1804 MHz: speeds: 1800 1000 MHz
pci0 at mainbus0 bus 0: configuration mode 1
"NVIDIA C51 Host" rev 0xa2 at pci0 dev 0 function 0 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 1 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 2 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 3 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 4 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 5 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 6 not configured
"NVIDIA C51 Memory" rev 0xa2 at pci0 dev 0 function 7 not configured
ppb0 at pci0 dev 2 function 0 "NVIDIA C51 PCIE" rev 0xa1
pci1 at ppb0 bus 1
ppb1 at pci0 dev 4 function 0 "NVIDIA C51 PCIE" rev 0xa1
pci2 at ppb1 bus 2
vga1 at pci0 dev 5 function 0 "NVIDIA GeForce 6150 LE" rev 0xa2
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
"NVIDIA MCP51 Host" rev 0xa2 at pci0 dev 9 function 0 not configured
pcib0 at pci0 dev 10 function 0 "NVIDIA MCP51 ISA" rev 0xa3
nviic0 at pci0 dev 10 function 1 "NVIDIA MCP51 SMBus" rev 0xa3
iic0 at nviic0
adt0 at iic0 addr 0x2e: sch5017 rev 0x8a
spdmem0 at iic0 addr 0x50: 256MB DDR SDRAM non-parity PC3200CL3.0
spdmem1 at iic0 addr 0x51: 256MB DDR SDRAM non-parity PC3200CL3.0
iic1 at nviic0
"NVIDIA MCP51 Memory" rev 0xa3 at pci0 dev 10 function 2 not configured
ohci0 at pci0 dev 11 function 0 "NVIDIA MCP51 USB" rev 0xa3: couldn't map interrupt
ehci0 at pci0 dev 11 function 1 "NVIDIA MCP51 USB" rev 0xa3: couldn't map interrupt
pciide0 at pci0 dev 13 function 0 "NVIDIA MCP51 IDE" rev 0xa1: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0:  SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
pciide1 at pci0

Re: OpenOSPF routing and CARP issues (?)

2008-06-27 Thread Claer
On Fri, Jun 20 2008 at 48:12, Chris Naselli wrote:
> Hi all!
Hi,

[...]
> OpenOSPFD have the following configuration:
> 
> area 0.0.0.0 {
>    interface em0  # carped with carp0
>    interface em1  # carped with carp1
>    interface carp2
> }
> 
> In this topology I found a problem: the OpenOSPF daemon is configured with
> "interface carpX" for every interface except em0/em1, so that the connected
> network is announced only while the box is master. However, all the routes
> learned from the other Cisco routers behind it are still announced, which
> causes (unwanted) traffic to also reach the router in backup carp state.
> 
> How can I make OpenBSD redistribute OSPF-learned routes only if the carp
> state is master, even if ospfd.conf has "interface em0" (and not
> "interface carp0") configured? Is my topology just broken?
If you wish to run commands (for example starting or stopping ospfd)
depending on carp state, I recommend you check ifstated(8) and
ifstated.conf(5).

> Sorry for the long email and thanks in advance.
Sorry I shortened it :)

Claer



OpenOSPF routing and CARP issues (?)

2008-06-20 Thread Chris Naselli
Hi all!
We are trying to replace our two "edge" routers (aging Cisco 7500 with ATM)
with two pairs of "carped" multi-function (firewalling/routing) OpenBSD
boxes, both for redundancy and for very advanced shaping/firewalling/BGP
routing, and also because of a future network upgrade to a native
Metro Ethernet solution.

A sample schematic of the desired network follows.

Location A
                O O O
                | | |
              ---------
           em0|       |em0
      em2---[A1]    [A2]---em2
           em1|       |em1
              ---------
                  |
                  | ISP Ethernet over MPLS service
                  |
Location B    ---------
           em1|       |em1
      em2---[B1]    [B2]---em2
           em0|       |em0
              ---------
                | | |
                O O O

Where:
- O are some small Cisco routers at fiber-connected sites near our main
offices (A/B locations), speaking OSPF
- A1/A2 are OpenBSD routers in location A with all interfaces in carp mode.
- B1/B2 are OpenBSD routers in location B with all interfaces in carp mode.

I'm trying this configuration in the lab, in order to check that everything
works and to prepare the changeover. I'm not an OpenBSD sysadmin guru, so
I'm also trying to familiarize myself with it.

OpenOSPFD has the following configuration:

area 0.0.0.0 {
   interface em0  # carped with carp0
   interface em1  # carped with carp1
   interface carp2
}

In this topology I found a problem: the OpenOSPF daemon is configured with
"interface carpX" for every interface except em0/em1, so that the connected
network is announced only while the box is master. However, all the routes
learned from the other Cisco routers behind it are still announced, which
causes (unwanted) traffic to also reach the router in backup carp state.

How can I make OpenBSD redistribute OSPF-learned routes only if the carp
state is master, even if ospfd.conf has "interface em0" (and not
"interface carp0") configured? Is my topology just broken?

Sorry for the long email and thanks in advance.

Best wishes,
Chris




Re: Strange carp issues

2006-06-03 Thread Henning Brauer
* Steven S <[EMAIL PROTECTED]> [2006-06-03 02:01]:
> The self inflicted issue came when I added an alias IP to FW1:carp0 but not
> yet to FW2:carp0.  Both FW1 and FW2 became master for the interface, until I
> added the alias to FW2.

that can lead to master-master situations unfortunately. not too much 
we can do about it :(

-- 
BS Web Services, http://www.bsws.de/
OpenBSD-based Webhosting, Mail Services, Managed Servers, ...
Unix is very simple, but it takes a genius to understand the simplicity.
(Dennis Ritchie)



Re: Strange carp issues

2006-06-02 Thread Steven S
Steven S wrote:
> It would appear my issues are related to timekeeping on these boxes
> (Compaq DL360 G1).
> 
> If I bump advbase to '3' on each box everything is more stable.
> Given this, I now have a roughly 10 second fail-over time, but that
> is still acceptable.
> 
> Since these are production boxes I'll probably wait until my 3.9
> arrives to see if any of the kern_time/kern_clock changes help.  I'll
> let everyone know more when I do.

For the archives...

I upgraded the backup firewall to 3.9-stable but it still appeared to have
the MASTER-MASTER issue (with primary at 3.8).  Based on some other posts in
misc I tried using aliases on a single carp interface instead of multiple
carp interfaces on the same physical interface.  I upgraded the primary to
3.9-stable and things seem to be operating as expected.  I have not had any
MASTER<-->MASTER issues that weren't self inflicted.  I guess I'm not 100%
sure if the cure was upgrading or migrating to aliases, but it's working.  

The self inflicted issue came when I added an alias IP to FW1:carp0 but not
yet to FW2:carp0.  Both FW1 and FW2 became master for the interface, until I
added the alias to FW2.
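
Since a mismatch in the alias list is what set this off, a quick pre-flight comparison of the two boxes may help. This is a rough sketch, assuming the `ifconfig carp0` output of each firewall has been saved to a file; `carp_addrs` and `same_carp_addrs` are made-up helper names and the addresses are placeholders.

```sh
#!/bin/sh
# carp_addrs FILE: print the sorted IPv4 addresses found on the "inet"
# lines of a saved "ifconfig carpN" dump.
carp_addrs() {
    awk '$1 == "inet" { print $2 }' "$1" | sort
}

# same_carp_addrs FILE1 FILE2
# Hypothetical check: succeeds only if both dumps list the same set of
# IPv4 addresses, i.e. no alias exists on one firewall but not the other.
same_carp_addrs() {
    [ "$(carp_addrs "$1")" = "$(carp_addrs "$2")" ]
}

# Example: fw1 already has the new alias, fw2 does not.
tmp=$(mktemp -d)
printf 'inet 192.0.2.1 netmask 0xffffff00\ninet 192.0.2.5 netmask 0xffffff00\n' > "$tmp/fw1"
printf 'inet 192.0.2.1 netmask 0xffffff00\n' > "$tmp/fw2"
if same_carp_addrs "$tmp/fw1" "$tmp/fw2"; then
    echo "address lists match"
else
    echo "address lists differ"
fi
rm -r "$tmp"
```

The order of the aliases in the dumps does not matter here, because both lists are sorted before they are compared.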

Thanks again for the pointers and the great OS!

-Steve S.



Re: Strange carp issues

2006-03-20 Thread Steven S
It would appear my issues are related to timekeeping on these boxes (Compaq
DL360 G1).  

If I bump advbase to '3' on each box everything is more stable.  Given this,
I now have a roughly 10 second fail-over time, but that is still acceptable.

Since these are production boxes I'll probably wait until my 3.9 arrives to
see if any of the kern_time/kern_clock changes help.  I'll let everyone know
more when I do.  

Thanks for all the pointers and assistance!

Steve's corollary to Henning's carp theorem ("carp works."):  Unless the
system clock is broken:-)

-Steve S.



Re: Strange carp issues

2006-03-18 Thread Joachim Schipper
On Sat, Mar 18, 2006 at 02:28:24PM -0500, Steven S wrote:
> Joachim Schipper wrote:
> >> Using NTPDATE in cron (30 minutes),  I was able to handle this weird
> >> behavior. 
> >> 
> >> Take a look in your date/time, maybe it's the reason of your strange
> >> carp issues.
> > 
> > As to problems with adjtime(2) and SMP machines, there is a small
> > diff from tedu@ on tech@ at
> > http://marc.theaimsgroup.com/?l=openbsd-tech&m=113592306900483&w=2,
> > which stemmed from the discussion on misc@ around the same time,
> > involving another SMP machine with severely screwed timekeeping - in
> > fact, it was so bad that NTPd couldn't keep up. Ted's diff allows
> > NTP to keep up with time slew even on very imprecise hosts.
> > 
> > It's a workaround, but might work for you.
> 
> I tried the patch, but it didn't apply cleanly against 3.8:-(  
> 
> I tried booting FW2 with the SP kernel, but the problem still persists.  It
> doesn't appear to be ntpd related since ntp updates didn't correlate with
> carp BACKUP -> MASTER transitions.  I'll keep plugging away at it...

Nonetheless, it does lead to the question if timekeeping, especially
without ntpd, is accurate. You seem to believe this is not the case;
fixing this might well fix the carp problems.

Joachim



Re: Strange carp issues

2006-03-18 Thread Steven S
Joachim Schipper wrote:
>> Using NTPDATE in cron (30 minutes),  I was able to handle this weird
>> behavior. 
>> 
>> Take a look in your date/time, maybe it's the reason of your strange
>> carp issues.
> 
> As to problems with adjtime(2) and SMP machines, there is a small
> diff from tedu@ on tech@ at
> http://marc.theaimsgroup.com/?l=openbsd-tech&m=113592306900483&w=2,
> which stemmed from the discussion on misc@ around the same time,
> involving another SMP machine with severely screwed timekeeping - in
> fact, it was so bad that NTPd couldn't keep up. Ted's diff allows
> NTP to keep up with time slew even on very imprecise hosts.
> 
> It's a workaround, but might work for you.

I tried the patch, but it didn't apply cleanly against 3.8:-(  

I tried booting FW2 with the SP kernel, but the problem still persists.  It
doesn't appear to be ntpd related since ntp updates didn't correlate with
carp BACKUP -> MASTER transitions.  I'll keep plugging away at it...

-Steve S. 



Re: Strange carp issues

2006-03-17 Thread Adam D. Morley
On Fri, Mar 17, 2006 at 03:41:01PM -0500, Steven S wrote:
> Adam D. Morley wrote:
> > On Fri, Mar 17, 2006 at 02:35:55PM -0500, Steven S wrote:
> >> Adam D. Morley wrote:
> ...
> >> Thanks, this is helpful.  The settings on the FW's are as above.  An
> >> incorrect setting (above) would seem to make it not work -- as
> >> opposed to 
> > 
> > Ok.  But mine works and yours doesn't?
> > 
> >> what I'm seeing.  Sometimes FW2 takes over as MASTER for some
> >> interfaces, but FW1 never moves to BACKUP.  I do have
> >> net.inet.carp.preempt=1 set on FW1, but not FW2.
> > 
> > You're supposed to set preempt on both, iirc.
> 
> With both firewalls set to preempt=1 I had a common DMZ switch get shut-off.
> Both FW's went to a carp skew of 240.  They had a MASTER fight.  By setting
> one with preempt=1 and the other with preempt=0, I avoid this.  

This is likely because one of the firewalls does not have a pass rule for
carp packets.  I have seen this happen before, especially when adding a
new carp interface and not updating ext_ints.

> 
> >> As another experiment I moved advbase on FW2 to '2' for all carps,
> >> but the 
> > 
> > base is how often.  skew is priority.
> 
> Sort of...  'man ifconfig' Says,

Sort of, yes, but for the purposes of pre-empted pf-walls, it's a very
convenient simplification.

> 
> "Taken together the advbase and advskew indicate how frequently, in seconds,
> the host will advertise the fact that it considers itself master of the
> virtual host.  The formula is advbase + (advskew / 256).  If the master does
> not advertise within three times this interval, this host will begin
> advertising as master."

Yes, I have read man ifconfig.  ;-)

> 
> So if I set FW1 with 1/0 and FW2 at 2/180, FW1 advertises every one second.
> If FW2 hasn't heard a carp advertisement in 2.7*3=8.1 seconds it will take
> over.  When FW1 returns, it will start advertising once/sec.  As noted in my
> OP, this doesn't seem to happen on my FW pair.

Well, FW1 expects FW2 to advertise as master within 3 seconds, and it's
advertising every 2.7 seconds.  This is pretty close, though it should be
more than enough time.  It would be "irrelevant" if one had default master
status (preempt on on both), but that's not the case in your setup.
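
To make the arithmetic in this exchange concrete, here is a rough sketch that applies the man page rule quoted above (a host starts advertising as master after three missed intervals). `carp_timeout` is a hypothetical helper name, and the two calls just reproduce the 1/0 and 2/180 settings under discussion.

```sh
#!/bin/sh
# carp_timeout ADVBASE ADVSKEW
# Hypothetical helper: seconds of silence after which a host with these
# settings would start advertising as master, per the quoted man page
# text: 3 * (advbase + advskew/256). Printed to one decimal.
carp_timeout() {
    awk -v base="$1" -v skew="$2" 'BEGIN { printf "%.1f\n", 3 * (base + skew / 256) }'
}

echo "FW1 (advbase 1, advskew 0):   $(carp_timeout 1 0) s"
echo "FW2 (advbase 2, advskew 180): $(carp_timeout 2 180) s"
```

This gives 3.0 s for FW1 and 8.1 s for FW2, matching the figures in the messages above.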

Out of curiosity, why are you twiddling advbase?  Do you have some 
high-latency serial lines in there or something?  Are you attempting to
get longer detection intervals in case fw1 is not actually down, but fw2
thinks it is?  If it's the latter, then that's not what it's for.

Personally, I would keep advbase the same on both hosts and twiddle advskew, 
set preempt (on both), and see what happens.  If something goes wrong, then 
it's likely a missing pass rule for carp packets.

-- 
adam



Re: Strange carp issues

2006-03-17 Thread Adam D. Morley
On Fri, Mar 17, 2006 at 12:48:49PM -0800, Jon Simola wrote:
> On 3/17/06, Adam D. Morley <[EMAIL PROTECTED]> wrote:
> 
> > > As another experiment I moved advbase on FW2 to '2' for all carps, but the
> >
> > base is how often.  skew is priority.
> 
> No, advbase is integer seconds between advertisements, advskew is
> fractional seconds. Taken together, advbase and advskew are an 8.8 bit
> fixed point number allowing you to specify advertisement intervals
> between 4ms and 255.996s (in theory anyways, setting advskew to 240 or
> above is used with preempting as a magic number).

Of course.  However, for many, the combination of both advbase and
advskew is confusing, and a simplification is "easier" for them to grasp.
Especially when trying to explain pre-empting.  Ie: the average user won't
need advbase, and so explaining advskew as "priority" (when in fact it
is not) makes it "easier" to understand.  For example, when a user is
following:

http://www.countersiege.com/doc/pfsync-carp/

explaining advskew as a "priority" often makes people grasp the concept
faster, and then once it "works," they can go, "hmm...oh yes, the man
page!"

-- 
adam



Re: Strange carp issues

2006-03-17 Thread Joachim Schipper
On Fri, Mar 17, 2006 at 10:34:29AM -0300, Anderson Nadal wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hello.
> 
> I have the same problem.
> 
> I have 2 Fw, the Master is a Dell 2850 (2 processors) and the slave is
> a Dell 2850, with 1 processor, both with 3 dual cards.
> 
> The NTPD daemon doesn't work in the Master. I mean, the date/time  is
> always  wrong (is it a bug in OpenBSD 3.7 with SMP ??).
> 
> When the difference between Master and Slave's date/time becomes  too
> large, the carp goes down!!
> The Master becomes Slave, and the Slave becomes Master.
> 
> Using NTPDATE in cron (30 minutes),  I was able to handle this weird
> behavior.
> 
> Take a look in your date/time, maybe it's the reason of your strange
> carp issues.

As to problems with adjtime(2) and SMP machines, there is a small diff
from tedu@ on tech@ at
http://marc.theaimsgroup.com/?l=openbsd-tech&m=113592306900483&w=2,
which stemmed from the discussion on misc@ around the same time,
involving another SMP machine with severely screwed timekeeping - in
fact, it was so bad that NTPd couldn't keep up. Ted's diff allows NTP to
keep up with time slew even on very imprecise hosts.

It's a workaround, but might work for you.

Joachim



Re: Strange carp issues

2006-03-17 Thread Jon Simola
On 3/17/06, Adam D. Morley <[EMAIL PROTECTED]> wrote:

> > As another experiment I moved advbase on FW2 to '2' for all carps, but the
>
> base is how often.  skew is priority.

No, advbase is integer seconds between advertisements, advskew is
fractional seconds. Taken together, advbase and advskew are an 8.8 bit
fixed point number allowing you to specify advertisement intervals
between 4ms and 255.996s (in theory anyways, setting advskew to 240 or
above is used with preempting as a magic number).

Around line 610 of ip_carp.c:
ch_tv.tv_sec = ch->carp_advbase;
ch_tv.tv_usec = ch->carp_advskew * 1000000 / 256;
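
As a rough illustration of the 8.8 fixed-point reading described above, here
is a minimal Python sketch (not kernel code).  It assumes tv_usec is meant to
hold advskew/256 of a second, i.e. a scale factor of 1000000 microseconds;
the function names are made up for this example.

```python
def advert_interval(advbase: int, advskew: int) -> float:
    """Advertisement interval in seconds: advbase + advskew/256."""
    if not (0 <= advbase <= 255 and 0 <= advskew <= 255):
        raise ValueError("advbase and advskew must each fit in 8 bits")
    return advbase + advskew / 256


def to_timeval(advbase: int, advskew: int) -> tuple:
    """Split the interval into (seconds, microseconds) with integer math.

    Assumption: one second is 1000000 microseconds, so the fractional
    part is advskew * 1000000 / 256, truncated to an integer.
    """
    return advbase, advskew * 1000000 // 256


if __name__ == "__main__":
    # Smallest non-zero and largest representable intervals.
    print(advert_interval(0, 1))      # roughly 4 ms
    print(advert_interval(255, 255))  # just under 256 s
    # The backup settings discussed in this thread: advbase 1, advskew 180.
    print(to_timeval(1, 180))
```

The two extremes correspond to the "4ms to 255.996s" range quoted above.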


--
Jon Simola
Systems Administrator
ABC Communications



Re: Strange carp issues

2006-03-17 Thread Steven S
Adam D. Morley wrote:
> On Fri, Mar 17, 2006 at 02:35:55PM -0500, Steven S wrote:
>> Adam D. Morley wrote:
...
>> Thanks, this is helpful.  The settings on the FW's are as above.  An
>> incorrect setting (above) would seem to make it not work -- as
>> opposed to 
> 
> Ok.  But mine works and yours doesn't?
> 
>> what I'm seeing.  Sometimes FW2 takes over as MASTER for some
>> interfaces, but FW1 never moves to BACKUP.  I do have
>> net.inet.carp.preempt=1 set on FW1, but not FW2.
> 
> You're supposed to set preempt on both, iirc.

With both firewalls set to preempt=1 I had a common DMZ switch get shut off.
Both FWs went to a carp skew of 240.  They had a MASTER fight.  By setting
one with preempt=1 and the other with preempt=0, I avoid this.

>> As another experiment I moved advbase on FW2 to '2' for all carps,
>> but the 
> 
> base is how often.  skew is priority.

Sort of...  'man ifconfig' says:

"Taken together the advbase and advskew indicate how frequently, in seconds,
the host will advertise the fact that it considers itself master of the
virtual host.  The formula is advbase + (advskew / 256).  If the master does
not advertise within three times this interval, this host will begin
advertising as master."

So if I set FW1 with 1/0 and FW2 at 2/180, FW1 advertises every one second.
If FW2 hasn't heard a carp advertisement in 2.7*3=8.1 seconds it will take
over.  When FW1 returns, it will start advertising once/sec.  As noted in my
OP, this doesn't seem to happen on my FW pair.
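
The arithmetic above can be checked with a short Python sketch.  It follows
the wording of the man page excerpt literally (three times advbase +
advskew/256); the actual kernel timer may be computed differently, and the
function name is hypothetical.

```python
def takeover_timeout(advbase: int, advskew: int, missed: int = 3) -> float:
    """Seconds of silence before a backup starts advertising as master,
    read literally from the quoted man page text:
    missed * (advbase + advskew / 256)."""
    return missed * (advbase + advskew / 256)


if __name__ == "__main__":
    # FW2 in the experiment: advbase 2, advskew 180.
    print(takeover_timeout(2, 180))
    # FW2 in the original setup: advbase 1, advskew 180.
    print(takeover_timeout(1, 180))
```

With advbase 2 and advskew 180 this gives about 8.1 seconds, matching the
figure above; with the original advbase 1 it is about 5.1 seconds.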

-Steve S.



Re: Strange carp issues

2006-03-17 Thread Henning Brauer
* Steven S <[EMAIL PROTECTED]> [2006-03-17 20:23]:
> Henning Brauer wrote:
> > * Steven S <[EMAIL PROTECTED]> [2006-03-17 19:10]:
> >> beginning to think it might be a component of the number of carp
> >> interfaces 
> > 
> > unlikely.
> > <[EMAIL PROTECTED]>  $ ifconfig | grep '^carp' | wc -l
> >   15
> > and growing.
> > and yes, that is real-world production use.
> How do you monitor if a carp interface changes state?

I don't. carp works.
We do monitor the hosts in the cluster individually.

> And are these on any multi-port NICs?

in this case, no, but I have carps on multiport. which is entirely 
irrelevant.

-- 
BS Web Services, http://www.bsws.de/
OpenBSD-based Webhosting, Mail Services, Managed Servers, ...
Unix is very simple, but it takes a genius to understand the simplicity.
(Dennis Ritchie)



Re: Strange carp issues

2006-03-17 Thread Adam D. Morley
On Fri, Mar 17, 2006 at 02:35:55PM -0500, Steven S wrote:
> Adam D. Morley wrote:
> ...
> > Have you checked:
> > 
> > - carp settings in sysctl?
> > - carp pass rules (and ordering) in pf.conf (if you have default
> > deny)? 
> > - that you have advskew set "right" on the backup firewall?
> > 
> > # grep carp /etc/sysctl.conf
> > net.inet.carp.allow=1   # allow incoming CARP packets
> > net.inet.carp.preempt=1 # failover all CARP interfaces if one fails
> > 
> > # grep carp /etc/pf.conf
> > pass quick on $ext_ints proto carp keep state
> > pass on $int_phys proto carp keep state
> > pass on $int_vlan proto carp keep state
> > 
> > # cat /etc/hostname.carp1
> > vhid 1 advskew 100 pass 
> > inet XXX 0xff00
> 
> Thanks, this is helpful.  The settings on the FW's are as above.  An
> incorrect setting (above) would seem to make it not work -- as opposed to

Ok.  But mine works and yours doesn't?

> what I'm seeing.  Sometimes FW2 takes over as MASTER for some interfaces,
> but FW1 never moves to BACKUP.  I do have net.inet.carp.preempt=1 set on
> FW1, but not FW2.  

You're supposed to set preempt on both, iirc.

> 
> As another experiment I moved advbase on FW2 to '2' for all carps, but the

base is how often.  skew is priority.

-- 
adam



Re: Strange carp issues

2006-03-17 Thread Steven S
Adam D. Morley wrote:
...
> Have you checked:
> 
> - carp settings in sysctl?
> - carp pass rules (and ordering) in pf.conf (if you have default
> deny)? 
> - that you have advskew set "right" on the backup firewall?
> 
> # grep carp /etc/sysctl.conf
> net.inet.carp.allow=1   # allow incoming CARP packets
> net.inet.carp.preempt=1 # failover all CARP interfaces if one fails
> 
> # grep carp /etc/pf.conf
> pass quick on $ext_ints proto carp keep state
> pass on $int_phys proto carp keep state
> pass on $int_vlan proto carp keep state
> 
> # cat /etc/hostname.carp1
> vhid 1 advskew 100 pass 
> inet XXX 0xff00

Thanks, this is helpful.  The settings on the FW's are as above.  An
incorrect setting (above) would seem to make it not work -- as opposed to
what I'm seeing.  Sometimes FW2 takes over as MASTER for some interfaces,
but FW1 never moves to BACKUP.  I do have net.inet.carp.preempt=1 set on
FW1, but not FW2.  

As another experiment I moved advbase on FW2 to '2' for all carps, but the
mysterious BACKUP-->MASTER transition still occurred on FW2 (in thinking
back, I did this with a reboot of FW2, which re-started ntpd.)  Perhaps I'll
try again without starting ntpd.

-Steve S.  



Re: Strange carp issues

2006-03-17 Thread Adam D. Morley
On Fri, Mar 17, 2006 at 07:59:35PM +0100, Henning Brauer wrote:
> * Steven S <[EMAIL PROTECTED]> [2006-03-17 19:10]:
> > beginning to think it might be a component of the number of carp interfaces
> 
> unlikely.
> <[EMAIL PROTECTED]>  $ ifconfig | grep '^carp' | wc -l 
>   15 
> and growing.
> and yes, that is real-world production use.

I would agree that number of carp interfaces doesn't matter:

# ifconfig |grep ^carp |wc -l
  23

This is real-world also, with many of the carp interfaces layered on top
of VLANs.  Intel dual GE cards.

Have you checked:

- carp settings in sysctl?
- carp pass rules (and ordering) in pf.conf (if you have default deny)?
- that you have advskew set "right" on the backup firewall?

# grep carp /etc/sysctl.conf
net.inet.carp.allow=1   # allow incoming CARP packets
net.inet.carp.preempt=1 # failover all CARP interfaces if one fails

# grep carp /etc/pf.conf
pass quick on $ext_ints proto carp keep state
pass on $int_phys proto carp keep state
pass on $int_vlan proto carp keep state

# cat /etc/hostname.carp1
vhid 1 advskew 100 pass 
inet XXX 0xff00

-- 
adam



Re: Strange carp issues

2006-03-17 Thread Steven S
Henning Brauer wrote:
> * Steven S <[EMAIL PROTECTED]> [2006-03-17 19:10]:
>> beginning to think it might be a component of the number of carp
>> interfaces 
> 
> unlikely.
> <[EMAIL PROTECTED]>  $ ifconfig | grep '^carp' | wc -l
>   15
> and growing.
> and yes, that is real-world production use.

How do you monitor if a carp interface changes state?  And are these on any
multi-port NICs?

Thanks!

-Steve S.



Re: Strange carp issues

2006-03-17 Thread Henning Brauer
* Steven S <[EMAIL PROTECTED]> [2006-03-17 19:10]:
> beginning to think it might be a component of the number of carp interfaces

unlikely.
<[EMAIL PROTECTED]>  $ ifconfig | grep '^carp' | wc -l 
  15 
and growing.
and yes, that is real-world production use.

-- 
BS Web Services, http://www.bsws.de/
OpenBSD-based Webhosting, Mail Services, Managed Servers, ...
Unix is very simple, but it takes a genius to understand the simplicity.
(Dennis Ritchie)



Re: Strange carp issues

2006-03-17 Thread Bryan Irvine
On 3/17/06, Steven S <[EMAIL PROTECTED]> wrote:
> Bryan Irvine wrote:
> > I tried before with 2 quad cards to no avail.  That was under 3.6
> > though IIRC.  1 or 2 if's would fail over within a couple of hours,
> > but if left to its own devices, eventually they all would.
> >
> > If you do figure something out lemme know, I'd love to go back to the
> > quad cards.
> >
> > ifstated didn't work for me but give it a go.  I also had a script on
> > each machine that would ping the other every 5 seconds forever.
> > The interfaces seemed to last longer but eventually failed that way
> > too.
> >
> > --Bryan
>
> The interfaces that I'm having the most problem with are the built-in
> interfaces on a Compaq DL360 (I misstated earlier that it was a dual
> interface nic) although I have two other dual port nics in the machine.  I
> wonder if the built-in looks like a dual port nic?

I never had a machine with built-in NICs *and* multi-port cards.

> I find it odd that the problem might be related to the multi-port nic.  Did
> you try the same configuration with single port nics and it worked?  I'm
> beginning to think it might be a component of the number of carp interfaces
> (you would likely have more carp interfaces on a machine with multiport
> nics.)

Yup!  The quad cards were all intel-based, replaced the cards with
other intel cards, and everything worked.  Same pf.conf, same
hostname.if, everything.


--Bryan



Re: Strange carp issues

2006-03-17 Thread Steven S
Bryan Irvine wrote:
> I tried before with 2 quad cards to no avail.  That was under 3.6
> though IIRC.  1 or 2 if's would fail over within a couple of hours,
> but if left to its own devices, eventually they all would.
> 
> If you do figure something out lemme know, I'd love to go back to the
> quad cards. 
> 
> ifstated didn't work for me but give it a go.  I also had a script on
> each machine that would ping the other every 5 seconds forever.
> The interfaces seemed to last longer but eventually failed that way
> too. 
> 
> --Bryan

The interfaces that I'm having the most problem with are the built-in
interfaces on a Compaq DL360 (I misstated earlier that it was a dual
interface nic) although I have two other dual port nics in the machine.  I
wonder if the built-in looks like a dual port nic?

I find it odd that the problem might be related to the multi-port nic.  Did
you try the same configuration with single port nics and it worked?  I'm
beginning to think it might be a component of the number of carp interfaces
(you would likely have more carp interfaces on a machine with multiport
nics.)

Ifstated was broken for me on 3.8-stable too.  I notice some changes in the
3.9 version so I compiled the 3.9 source under 3.8 and ifstated works *much*
better.  Perhaps it should be posted for 3.8 as a "reliability patch."

-Steve S.



Re: Strange carp issues

2006-03-17 Thread Bryan Irvine
On 3/17/06, Steven S <[EMAIL PROTECTED]> wrote:
> Anderson Nadal wrote:
> > Hello.
> >
> > I have the same problem.
> >
> ...
> >
> > Take a look in your date/time, maybe it's the reason of your strange
> > carp issues.
> ...
>
> I thought of that too.  If time changed by a couple seconds on the backup
> server then the backup might think it hadn't heard from FW1 in the carp
> time-out, so I stopped ntpd on both servers.  I still experienced the
> problem.  Oddly, it's not all carp interfaces on my fxp0.  It only seems to
> affect carp16 - carp20, but inconsistently so.
>
> I'm going to try an experiment with a couple lab boxes, multiple carp
> interfaces, and ifstated (for monitoring).  The plan is to see if it is
> related to the number of carp interfaces.  Unless someone has tried this
> already (hint, hint;-)

I tried before with 2 quad cards to no avail.  That was under 3.6
though IIRC.  1 or 2 if's would fail over within a couple of hours,
but if left to its own devices, eventually they all would.

If you do figure something out lemme know, I'd love to go back to the
quad cards.

ifstated didn't work for me but give it a go.  I also had a script on
each machine that would ping the other every 5 seconds forever.
The interfaces seemed to last longer but eventually failed that way
too.

--Bryan



Re: Strange carp issues

2006-03-17 Thread Steven S
Anderson Nadal wrote:
> Hello.
> 
> I have the same problem.
> 
...
> 
> Take a look in your date/time, maybe it's the reason of your strange
> carp issues. 
...

I thought of that too.  If time changed by a couple seconds on the backup
server then the backup might think it hadn't heard from FW1 in the carp
time-out, so I stopped ntpd on both servers.  I still experienced the
problem.  Oddly, it's not all carp interfaces on my fxp0.  It only seems to
affect carp16 - carp20, but inconsistently so.  

I'm going to try an experiment with a couple lab boxes, multiple carp
interfaces, and ifstated (for monitoring).  The plan is to see if it is
related to the number of carp interfaces.  Unless someone has tried this
already (hint, hint;-)

-Steve S.



Re: Strange carp issues

2006-03-17 Thread Anderson Nadal
Hello.

I have the same problem.

I have 2 firewalls: the master is a Dell 2850 (2 processors) and the slave
is a Dell 2850 with 1 processor, both with 3 dual cards.

The NTPD daemon doesn't work on the master.  I mean, the date/time is
always wrong (is it a bug in OpenBSD 3.7 with SMP?).

When the difference between the master's and the slave's date/time becomes
too large, carp goes down!
The master becomes slave, and the slave becomes master.

By running ntpdate from cron (every 30 minutes), I was able to work around
this weird behavior.

Take a look at your date/time; maybe it's the reason for your strange
carp issues.

[]'s
Nadal





Bryan Irvine wrote:

> Thought so. Had the same problem. Never got them working with
> CARP.
>
> There's some threads in the archives, but they probably won't help
> since there is apparently no solution.
>
> --Bryan
>
> On 3/15/06, Steven S <[EMAIL PROTECTED]> wrote:
>
>> Bryan Irvine wrote:
>>
>>> I don't suppose you are using a quad card of some kind are you?
>>>
>>>
>>>
>> ...
>> Three dual cards, dmesg (extracted from /var/log/messages) below:
>>
>> OpenBSD 3.8-stable (GENERIC.MP) #0: Thu Jan  5 03:55:53 EST 2006
>> [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC.MP
>> cpu0: Intel Pentium III ("GenuineIntel" 686-class) 798 MHz
>> cpu0:
>> FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,
>> FXSR,SSE
>> real mem  = 536436736 (523864K)
>> avail mem = 482525184 (471216K)
>> using 4278 buffers containing 26923008 bytes (26292K) of memory
>> mainbus0 (root)
>> bios0 at mainbus0: AT/286+(00) BIOS, date 12/31/99, BIOS32 rev. 0 @ 0xf
>> pcibios0 at bios0: rev 2.1 @ 0xf/0x2000
>> pcibios0: PCI BIOS has 6 Interrupt Routing table entries
>> pcibios0: PCI Interrupt Router at 000:15:0 ("ServerWorks ROSB4 SouthBridge"
>> rev 0x00)
>> pcibios0: PCI bus #1 is the last bus
>> bios0: ROM list: 0xc/0x8000 0xc8000/0x4000! 0xe8000/0x6000
>> 0xee000/0x2000!
>> mainbus0: Intel MP Specification (Version 1.4) (COMPAQ   PROLIANT)
>> cpu0 at mainbus0: apid 3 (boot processor)
>> cpu0: apic clock running at 132 MHz
>> cpu1 at mainbus0: apid 0 (application processor)
>> cpu1: Intel Pentium III ("GenuineIntel" 686-class) 797 MHz
>> cpu1:
>> FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,
>> FXSR,SSE
>> mainbus0: bus 0 is type PCI
>> mainbus0: bus 3 is type PCI
>> mainbus0: bus 9 is type ISA
>> ioapic0 at mainbus0: apid 8 pa 0xfec0, version 11, 35 pins
>> ioapic0: misconfigured as apic 0, remapped to apic 8
>> pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
>> pchb0 at pci0 dev 0 function 0 "ServerWorks CNB20LE Host" rev 0x05
>> pchb1 at pci0 dev 0 function 1 "ServerWorks CNB20LE Host" rev 0x05
>> pci1 at pchb1 bus 3
>> fxp0 at pci1 dev 4 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 10
>> (irq 10), address 00:50:8b:e2:6e:fb
>> inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
>> fxp1 at pci1 dev 5 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 11
>> (irq 11), address 00:50:8b:e2:6e:fa
>> inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
>> ppb0 at pci1 dev 6 function 0 "DEC 21154 PCI-PCI" rev 0x05
>> pci2 at ppb0 bus 4
>> fxp2 at pci2 dev 4 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 11
>> (irq 11), address 00:02:a5:60:58:50
>> inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
>> fxp3 at pci2 dev 5 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 10
>> (irq 10), address 00:02:a5:60:58:51
>> inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 4
>> cac0 at pci0 dev 1 function 0 "Symbios Logic 53c1510" rev 0x02: apic 8 int 3
>> (irq 3) Compaq Integrated Array
>> scsibus0 at cac0: 1 targets
>> sd0 at scsibus0 targ 0 lun 0:  SCSI2 0/direct
>> fixed
>> sd0: 17359MB, 4357 cyl, 255 head, 32 sec, 512 bytes/sec, 35553120 sec total
>> vga1 at pci0 dev 3 function 0 "ATI Mach64 GV" rev 0x7a
>> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
>> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
>> "Compaq Netelligent ASMC" rev 0x00 at pci0 dev 4 function 0 not configured
>> ppb1 at pci0 dev 5 function 0 "IBM 82351 PCI-PCI" rev 0x01
>> pci3 at ppb1 bus 1
>> tl0 at pci3 dev 0 function 0 "Compaq DP Netelligent 10/100TX" rev 0x10: apic
>> 8 int 5 (irq 5) address 00:08:c7:a4:84:6d
>> nsphy0 at tl0 phy 1: DP83840 10/100 PHY, rev. 1
>> ukphy0 at tl0 phy 31: Generic IEEE 802.3u media interface
>> ukphy0: OUI 0x100014, model 0x0001, rev. 5

Re: Strange carp issues

2006-03-15 Thread Bryan Irvine
Thought so.  Had the same problem.  Never got them working with CARP.

There's some threads in the archives, but they probably won't help
since there is apparently no solution.

--Bryan

On 3/15/06, Steven S <[EMAIL PROTECTED]> wrote:
> Bryan Irvine wrote:
> > I don't suppose you are using a quad card of some kind are you?
> >
> >
> ...
> Three dual cards, dmesg (extracted from /var/log/messages) below:
>
> OpenBSD 3.8-stable (GENERIC.MP) #0: Thu Jan  5 03:55:53 EST 2006
> [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC.MP
> cpu0: Intel Pentium III ("GenuineIntel" 686-class) 798 MHz
> cpu0:
> FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,
> FXSR,SSE
> real mem  = 536436736 (523864K)
> avail mem = 482525184 (471216K)
> using 4278 buffers containing 26923008 bytes (26292K) of memory
> mainbus0 (root)
> bios0 at mainbus0: AT/286+(00) BIOS, date 12/31/99, BIOS32 rev. 0 @ 0xf
> pcibios0 at bios0: rev 2.1 @ 0xf/0x2000
> pcibios0: PCI BIOS has 6 Interrupt Routing table entries
> pcibios0: PCI Interrupt Router at 000:15:0 ("ServerWorks ROSB4 SouthBridge"
> rev 0x00)
> pcibios0: PCI bus #1 is the last bus
> bios0: ROM list: 0xc/0x8000 0xc8000/0x4000! 0xe8000/0x6000
> 0xee000/0x2000!
> mainbus0: Intel MP Specification (Version 1.4) (COMPAQ   PROLIANT)
> cpu0 at mainbus0: apid 3 (boot processor)
> cpu0: apic clock running at 132 MHz
> cpu1 at mainbus0: apid 0 (application processor)
> cpu1: Intel Pentium III ("GenuineIntel" 686-class) 797 MHz
> cpu1:
> FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,
> FXSR,SSE
> mainbus0: bus 0 is type PCI
> mainbus0: bus 3 is type PCI
> mainbus0: bus 9 is type ISA
> ioapic0 at mainbus0: apid 8 pa 0xfec0, version 11, 35 pins
> ioapic0: misconfigured as apic 0, remapped to apic 8
> pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
> pchb0 at pci0 dev 0 function 0 "ServerWorks CNB20LE Host" rev 0x05
> pchb1 at pci0 dev 0 function 1 "ServerWorks CNB20LE Host" rev 0x05
> pci1 at pchb1 bus 3
> fxp0 at pci1 dev 4 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 10
> (irq 10), address 00:50:8b:e2:6e:fb
> inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
> fxp1 at pci1 dev 5 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 11
> (irq 11), address 00:50:8b:e2:6e:fa
> inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
> ppb0 at pci1 dev 6 function 0 "DEC 21154 PCI-PCI" rev 0x05
> pci2 at ppb0 bus 4
> fxp2 at pci2 dev 4 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 11
> (irq 11), address 00:02:a5:60:58:50
> inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
> fxp3 at pci2 dev 5 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 10
> (irq 10), address 00:02:a5:60:58:51
> inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 4
> cac0 at pci0 dev 1 function 0 "Symbios Logic 53c1510" rev 0x02: apic 8 int 3
> (irq 3) Compaq Integrated Array
> scsibus0 at cac0: 1 targets
> sd0 at scsibus0 targ 0 lun 0:  SCSI2 0/direct
> fixed
> sd0: 17359MB, 4357 cyl, 255 head, 32 sec, 512 bytes/sec, 35553120 sec total
> vga1 at pci0 dev 3 function 0 "ATI Mach64 GV" rev 0x7a
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> "Compaq Netelligent ASMC" rev 0x00 at pci0 dev 4 function 0 not configured
> ppb1 at pci0 dev 5 function 0 "IBM 82351 PCI-PCI" rev 0x01
> pci3 at ppb1 bus 1
> tl0 at pci3 dev 0 function 0 "Compaq DP Netelligent 10/100TX" rev 0x10: apic
> 8 int 5 (irq 5) address 00:08:c7:a4:84:6d
> nsphy0 at tl0 phy 1: DP83840 10/100 PHY, rev. 1
> ukphy0 at tl0 phy 31: Generic IEEE 802.3u media interface
> ukphy0: OUI 0x100014, model 0x0001, rev. 5
> tl1 at pci3 dev 1 function 0 "Compaq DP Netelligent 10/100TX" rev 0x10: apic
> 8 int 7 (irq 7) address 00:08:c7:a4:84:ed
> nsphy1 at tl1 phy 1: DP83840 10/100 PHY, rev. 1
> ukphy1 at tl1 phy 31: Generic IEEE 802.3u media interface
> ukphy1: OUI 0x100014, model 0x0001, rev. 5
> pcib0 at pci0 dev 15 function 0 "ServerWorks ROSB4 SouthBridge" rev 0x4f
> pciide0 at pci0 dev 15 function 1 "ServerWorks OSB4 IDE" rev 0x00: DMA
> atapiscsi0 at pciide0 channel 1 drive 0
> scsibus1 at atapiscsi0: 2 targets
> cd0 at scsibus1 targ 0 lun 0:  SCSI0 5/cdrom
> removable
> cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
> isa0 at pcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5
> pckbd0 at pckbc0 (kbd slot)
> pckbc0: using irq 1 for kbd slot
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pmsi0 at pckbc0 (aux slot)
> pckbc0: using irq 12 for aux slot
> wsmouse0 at pmsi0 mux 0
> pcppi0 at isa0 port 0x61
> midi0 at pcppi0: 
> spkr0 at pcppi0
> sysbeep0 at pcppi0
> npx0 at isa0 port 0xf0/16: using exception 16
> pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
> fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
> biomask 0 netmask 0 ttymask 0
> pctr: 686-class user-level performance counters enabled
> mtrr: Pentium Pro MTRR support
> dkcsum: sd0 matches BIOS dr

Re: Strange carp issues

2006-03-15 Thread Steven S
Bryan Irvine wrote:
> I don't suppose you are using a quad card of some kind are you?
> 
> 
...
Three dual cards, dmesg (extracted from /var/log/messages) below:

OpenBSD 3.8-stable (GENERIC.MP) #0: Thu Jan  5 03:55:53 EST 2006
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel Pentium III ("GenuineIntel" 686-class) 798 MHz
cpu0:
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,
FXSR,SSE
real mem  = 536436736 (523864K)
avail mem = 482525184 (471216K)
using 4278 buffers containing 26923008 bytes (26292K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 12/31/99, BIOS32 rev. 0 @ 0xf
pcibios0 at bios0: rev 2.1 @ 0xf/0x2000
pcibios0: PCI BIOS has 6 Interrupt Routing table entries
pcibios0: PCI Interrupt Router at 000:15:0 ("ServerWorks ROSB4 SouthBridge"
rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc/0x8000 0xc8000/0x4000! 0xe8000/0x6000
0xee000/0x2000!
mainbus0: Intel MP Specification (Version 1.4) (COMPAQ   PROLIANT)
cpu0 at mainbus0: apid 3 (boot processor)
cpu0: apic clock running at 132 MHz
cpu1 at mainbus0: apid 0 (application processor)
cpu1: Intel Pentium III ("GenuineIntel" 686-class) 797 MHz
cpu1:
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,
FXSR,SSE
mainbus0: bus 0 is type PCI
mainbus0: bus 3 is type PCI
mainbus0: bus 9 is type ISA
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 11, 35 pins
ioapic0: misconfigured as apic 0, remapped to apic 8
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "ServerWorks CNB20LE Host" rev 0x05
pchb1 at pci0 dev 0 function 1 "ServerWorks CNB20LE Host" rev 0x05
pci1 at pchb1 bus 3
fxp0 at pci1 dev 4 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 10
(irq 10), address 00:50:8b:e2:6e:fb
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
fxp1 at pci1 dev 5 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 11
(irq 11), address 00:50:8b:e2:6e:fa
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
ppb0 at pci1 dev 6 function 0 "DEC 21154 PCI-PCI" rev 0x05
pci2 at ppb0 bus 4
fxp2 at pci2 dev 4 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 11
(irq 11), address 00:02:a5:60:58:50
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
fxp3 at pci2 dev 5 function 0 "Intel 82557" rev 0x08, i82559: apic 8 int 10
(irq 10), address 00:02:a5:60:58:51
inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 4
cac0 at pci0 dev 1 function 0 "Symbios Logic 53c1510" rev 0x02: apic 8 int 3
(irq 3) Compaq Integrated Array
scsibus0 at cac0: 1 targets
sd0 at scsibus0 targ 0 lun 0:  SCSI2 0/direct
fixed
sd0: 17359MB, 4357 cyl, 255 head, 32 sec, 512 bytes/sec, 35553120 sec total
vga1 at pci0 dev 3 function 0 "ATI Mach64 GV" rev 0x7a
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
"Compaq Netelligent ASMC" rev 0x00 at pci0 dev 4 function 0 not configured
ppb1 at pci0 dev 5 function 0 "IBM 82351 PCI-PCI" rev 0x01
pci3 at ppb1 bus 1
tl0 at pci3 dev 0 function 0 "Compaq DP Netelligent 10/100TX" rev 0x10: apic
8 int 5 (irq 5) address 00:08:c7:a4:84:6d
nsphy0 at tl0 phy 1: DP83840 10/100 PHY, rev. 1
ukphy0 at tl0 phy 31: Generic IEEE 802.3u media interface
ukphy0: OUI 0x100014, model 0x0001, rev. 5
tl1 at pci3 dev 1 function 0 "Compaq DP Netelligent 10/100TX" rev 0x10: apic
8 int 7 (irq 7) address 00:08:c7:a4:84:ed
nsphy1 at tl1 phy 1: DP83840 10/100 PHY, rev. 1
ukphy1 at tl1 phy 31: Generic IEEE 802.3u media interface
ukphy1: OUI 0x100014, model 0x0001, rev. 5
pcib0 at pci0 dev 15 function 0 "ServerWorks ROSB4 SouthBridge" rev 0x4f
pciide0 at pci0 dev 15 function 1 "ServerWorks OSB4 IDE" rev 0x00: DMA
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0:  SCSI0 5/cdrom
removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: 
spkr0 at pcppi0
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 0 netmask 0 ttymask 0
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
dkcsum: sd0 matches BIOS drive 0x80
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02



Re: Strange carp issues

2006-03-15 Thread Bryan Irvine
I don't suppose you are using a quad card of some kind are you?



On 3/15/06, Steven S <[EMAIL PROTECTED]> wrote:
> I have two firewalls (FW1 & FW2) with multiple carp interfaces on an
> external interface (carp1, carp12, carp14, carp15, carp16, carp17, carp18,
> carp19, carp20).  FW1 has all carp interfaces set with advbase 1 advskew 0
> and FW2 has all carp interfaces with advbase 1 advskew 180.  Frequently FW2
> thinks it is the master for some of the carp interfaces.  Here is a tcpdump
> (-ni fxp0 proto carp) from FW2.  As you can see, even though FW2 sees the
> advertisement for carp16, carp17, carp18, carp19 and carp20 from FW1 it
> sometimes takes over as master for those interfaces and advertises.  To find
> these events look for advskew=180 in the tcpdump below.
>
> The event at 19:19:05.023848 seemed to be from lost packets.  The event at
> 19:19:10.013844 is very odd since FW2 saw the carp20 advertisement from FW1
> at 19:19:09.07.  This should be enough time for a failover, should it?
>
> Any pointers would be appreciated (relevant pf rules below.)
>
> -Steve S.
>
> 19:19:02.290779 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290807 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290828 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290849 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290869 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290887 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290914 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290936 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.290957 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890823 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890849 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890871 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890892 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890912 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890933 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890962 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.890986 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:02.891010 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880791 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880818 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880839 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880860 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880881 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880901 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880932 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880955 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:03.880979 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.023848 CARPv2-advertise 36: vhid=17 advbase=1 advskew=180 (DF) [tos
> 0x10]
> 19:19:05.024936 CARPv2-advertise 36: vhid=18 advbase=1 advskew=180 (DF) [tos
> 0x10]
> 19:19:05.026003 CARPv2-advertise 36: vhid=19 advbase=1 advskew=180 (DF) [tos
> 0x10]
> 19:19:05.027069 CARPv2-advertise 36: vhid=20 advbase=1 advskew=180 (DF) [tos
> 0x10]
> 19:19:05.341023 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341047 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341068 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341088 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341109 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341129 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341154 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341176 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:05.341199 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:06.295736 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:06.295760 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:06.295782 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:06.295802 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:06.295822 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
> 0x10]
> 19:19:06.297299 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [

Strange carp issues

2006-03-15 Thread Steven S
I have two firewalls (FW1 & FW2) with multiple carp interfaces on an
external interface (carp1, carp12, carp14, carp15, carp16, carp17, carp18,
carp19, carp20).  FW1 has all carp interfaces set with advbase 1 advskew 0
and FW2 has all carp interfaces with advbase 1 advskew 180.  Frequently FW2
thinks it is the master for some of the carp interfaces.  Here is a tcpdump
(-ni fxp0 proto carp) from FW2.  As you can see, even though FW2 sees the
advertisements for carp16, carp17, carp18, carp19 and carp20 from FW1, it
sometimes takes over as master for those interfaces and advertises.  To find
these events, look for advskew=180 in the tcpdump below.
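
For anyone who wants to pick the takeover events out of a long capture without reading it line by line, here is a minimal sketch in Python. It assumes the tcpdump line format shown in this post (timestamp, CARPv2-advertise, vhid, advbase, advskew) and that FW2 is the only host advertising with advskew=180. The names parse_adverts and takeovers are made up for illustration, and SAMPLE is just a few lines copied from the dump in this message.

```python
import re

# Matches lines such as:
#   19:19:05.023848 CARPv2-advertise 36: vhid=17 advbase=1 advskew=180 (DF) [tos
# An optional leading "> " (mail quoting) is tolerated. The wrapped "0x10]"
# continuation lines do not match and are simply skipped.
ADVERT = re.compile(
    r"^(?:>\s*)?(\d{2}:\d{2}:\d{2}\.\d+) CARPv2-advertise \d+: "
    r"vhid=(\d+) advbase=(\d+) advskew=(\d+)"
)


def parse_adverts(text):
    """Return (timestamp, vhid, advbase, advskew) for each advertisement line."""
    adverts = []
    for line in text.splitlines():
        m = ADVERT.match(line.strip())
        if m:
            ts, vhid, advbase, advskew = m.groups()
            adverts.append((ts, int(vhid), int(advbase), int(advskew)))
    return adverts


def takeovers(adverts, backup_skew=180):
    """Advertisements carrying the backup's advskew, i.e. the backup acting as master."""
    return [(ts, vhid) for ts, vhid, _base, skew in adverts if skew == backup_skew]


# A few lines copied from the capture in this post.
SAMPLE = """\
19:19:03.880979 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.023848 CARPv2-advertise 36: vhid=17 advbase=1 advskew=180 (DF) [tos
0x10]
19:19:05.024936 CARPv2-advertise 36: vhid=18 advbase=1 advskew=180 (DF) [tos
0x10]
19:19:05.341023 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x10]
"""

if __name__ == "__main__":
    for ts, vhid in takeovers(parse_adverts(SAMPLE)):
        print(ts, "FW2 advertised as master for vhid", vhid)
```

Feeding the whole capture to parse_adverts instead of SAMPLE should list every moment FW2 advertised, which makes it easier to line each one up against the last advertisement seen from FW1.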

The event at 19:19:05.023848 seemed to be caused by lost packets.  The event
at 19:19:10.013844 is very odd, since FW2 saw the carp20 advertisement from
FW1 at 19:19:09.07.  That should not be enough time for a failover, should it?
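
To put numbers on the timing question, here is a rough sketch of the arithmetic in Python. It rests on an assumption that should be checked against the carp source for the release in use: that a master advertises every advbase + advskew/256 seconds, and that a backup waits about 3 * advbase + advskew/256 seconds without hearing the master before taking over. The function names are hypothetical. The two timestamps are taken from the dump in this post: the last vhid=17 advertisement from FW1 before the first event, and FW2's own vhid=17 advertisement.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"


def seconds_between(earlier, later):
    """Elapsed seconds between two tcpdump timestamps on the same day."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds()


def advert_interval(advbase, advskew):
    """Interval between advertisements; advskew is in 1/256ths of a second."""
    return advbase + advskew / 256


def master_down_timeout(advbase, advskew):
    # ASSUMPTION: the backup declares the master dead after roughly
    # 3 * advbase + advskew/256 seconds of silence. Verify this against
    # the carp implementation you are actually running.
    return 3 * advbase + advskew / 256


# Last vhid=17 advertisement seen from FW1, then FW2's own advertisement.
gap = seconds_between("19:19:03.880901", "19:19:05.023848")
timeout = master_down_timeout(advbase=1, advskew=180)

if __name__ == "__main__":
    print("gap since last FW1 advert: %.6f s" % gap)
    print("assumed FW2 master-down timeout: %.6f s" % timeout)
    print("gap shorter than timeout:", gap < timeout)
```

If that timeout assumption holds, FW2 (advbase 1, advskew 180) should stay quiet for well over three seconds after the last advertisement it accepted from FW1, so a gap of about one second would not explain a takeover by itself.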

Any pointers would be appreciated (relevant pf rules below.)

-Steve S.

19:19:02.290779 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290807 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290828 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290849 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290869 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290887 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290914 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290936 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.290957 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890823 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890849 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890871 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890892 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890912 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890933 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890962 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.890986 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:02.891010 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880791 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880818 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880839 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880860 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880881 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880901 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880932 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880955 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:03.880979 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.023848 CARPv2-advertise 36: vhid=17 advbase=1 advskew=180 (DF) [tos
0x10]
19:19:05.024936 CARPv2-advertise 36: vhid=18 advbase=1 advskew=180 (DF) [tos
0x10]
19:19:05.026003 CARPv2-advertise 36: vhid=19 advbase=1 advskew=180 (DF) [tos
0x10]
19:19:05.027069 CARPv2-advertise 36: vhid=20 advbase=1 advskew=180 (DF) [tos
0x10]
19:19:05.341023 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341047 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341068 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341088 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341109 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341129 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341154 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341176 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:05.341199 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.295736 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.295760 CARPv2-advertise 36: vhid=12 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.295782 CARPv2-advertise 36: vhid=14 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.295802 CARPv2-advertise 36: vhid=15 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.295822 CARPv2-advertise 36: vhid=16 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.297299 CARPv2-advertise 36: vhid=17 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.297318 CARPv2-advertise 36: vhid=18 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.297335 CARPv2-advertise 36: vhid=19 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.297352 CARPv2-advertise 36: vhid=20 advbase=1 advskew=0 (DF) [tos
0x10]
19:19:06.900831 CARPv2-advertise 36: vhid=1 advbase=1 advskew=0 (DF) [tos
0x