Carp issues
I have two firewalls running OpenBSD 5.1 amd64 with a 5.2 kernel. I am running the 5.2 kernel because of another, unrelated bug. Each firewall has 4 Ethernet interfaces (em0-em3). em0 and em1 are members of trunk0 in failover mode, and em2 and em3 are members of trunk1 in failover mode. On trunk0 I have 3 VLANs (2, 3, 4) and on trunk1 I have 2 VLANs (10, 11). I am running carp on each of these vlan interfaces. I am also running pfsync, and I have an IPsec VPN configured that uses sasyncd between the two firewalls.

Yesterday fw1 had a kernel panic and died. Everything seemed to switch over to fw2 as expected. When we restarted fw1, all carp interfaces switched back to master on fw1 and *most* switched to backup on fw2. However, carp2 (the carp interface for vlan2) stayed master on fw2. Since carp2 was also master on fw1, two machines were claiming the same IP address, which caused a lot of dropped packets.

I brought the carp2 interface down (ifconfig carp2 down) and traffic passed as it should again. However, as soon as I brought it back up (ifconfig carp2 up), carp2 on fw2 went to master again, and carp2 on fw1 stayed master as well. I have all carp interfaces on fw2 configured with an advskew of 128, and I have preempt enabled. I had to reboot fw2 to get back to normal, with all interfaces in backup mode on fw2 and in master mode on fw1.

Below are my hostname.* config files as well as the carp sysctl values. Please let me know if anyone needs more information or has any suggestions on how to avoid this in the future.
=== FW1 ===

** hostname.em0 **
up

** hostname.em1 **
up

** hostname.em2 **
up

** hostname.em3 **
up

** hostname.trunk0 **
up trunkproto failover trunkport em0 trunkport em1

** hostname.trunk1 **
up trunkproto failover trunkport em2 trunkport em3

** hostname.vlan10 **
up inet x.x.x.27 255.255.255.248 NONE vlan 10 vlandev trunk1

** hostname.vlan11 **
up inet x.x.x.131 255.255.255.248 NONE vlan 11 vlandev trunk1

** hostname.vlan2 **
up inet 172.16.20.2 255.255.255.0 NONE vlan 2 vlandev trunk0

** hostname.vlan3 **
up inet x.x.x.210 255.255.255.240 NONE vlan 3 vlandev trunk0

** hostname.vlan4 **
up inet x.x.x.98 255.255.255.224 NONE vlan 4 vlandev trunk0

** hostname.carp10 **
up inet x.x.x.26 255.255.255.248 x.x.x.31 vhid 10 pass xxx carpdev vlan10

** hostname.carp11 **
up inet x.x.x.130 255.255.255.248 x.x.x.135 vhid 11 pass xx carpdev vlan11

** hostname.carp2 **
up inet 172.16.20.1 255.255.255.0 172.16.20.255 vhid 2 pass x carpdev vlan2

** hostname.carp3 **
up inet x.x.x.209 255.255.255.240 x.x.x.223 vhid 3 pass x carpdev vlan3

** hostname.carp4 **
up inet x.x.x.97 255.255.255.224 x.x.x.127 vhid 4 pass x carpdev vlan4

** hostname.pfsync0 **
up syncdev vlan2 syncpeer 172.16.20.3

=== FW2 ===

** hostname.em0 **
up

** hostname.em1 **
up

** hostname.em2 **
up

** hostname.em3 **
up

** hostname.trunk0 **
up trunkproto failover trunkport em0 trunkport em1

** hostname.trunk1 **
up trunkproto failover trunkport em2 trunkport em3

** hostname.vlan10 **
up inet x.x.x.28 255.255.255.248 NONE vlan 10 vlandev trunk1

** hostname.vlan11 **
up inet x.x.x.132 255.255.255.248 NONE vlan 11 vlandev trunk1

** hostname.vlan2 **
up inet 172.16.20.3 255.255.255.0 NONE vlan 2 vlandev trunk0

** hostname.vlan3 **
up inet x.x.x.213 255.255.255.240 NONE vlan 3 vlandev trunk0

** hostname.vlan4 **
up inet x.x.x.99 255.255.255.224 NONE vlan 4 vlandev trunk0

** hostname.carp10 **
up inet x.x.x.26 255.255.255.248 x.x.x.31 vhid 10 pass carpdev vlan10 advskew 128

** hostname.carp11 **
up inet x.x.x.130 255.255.255.248 x.x.x.135 vhid 11 pass carpdev vlan11 advskew 128

** hostname.carp2 **
up inet 172.16.20.1 255.255.255.0 172.16.20.255 vhid 2 pass carpdev vlan2 advskew 128

** hostname.carp3 **
up inet x.x.x.209 255.255.255.240 x.x.x.223 vhid 3 carpdev vlan3 pass advskew 128

** hostname.carp4 **
up inet x.x.x.97 255.255.255.224 x.x.x.127 vhid 4 pass carpdev vlan4 advskew 128

** hostname.pfsync0 **
up syncdev vlan2 syncpeer 172.16.20.2
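
For anyone wanting to catch the dual-master condition described above before it shows up as dropped packets, the following is a minimal sketch in Python. It assumes the output of `ifconfig carp` has been collected from both firewalls as text, and that each carp interface reports a status line of the form "carp: MASTER carpdev vlan2 vhid 2 advbase 1 advskew 0". The sample text, the function names, and the exact line layout are assumptions for illustration, not taken from the systems in this thread.

```python
import re

# Matches the "carp:" status line printed for a carp interface, e.g.
#   carp: MASTER carpdev vlan2 vhid 2 advbase 1 advskew 0
# The exact layout is an assumption; adjust the pattern to your release.
CARP_LINE = re.compile(r"carp:\s+(MASTER|BACKUP|INIT)\b.*?\bvhid\s+(\d+)")


def carp_states(ifconfig_output):
    """Return {vhid: state} parsed from `ifconfig carp` output text."""
    return {int(m.group(2)): m.group(1)
            for m in CARP_LINE.finditer(ifconfig_output)}


def dual_masters(states_a, states_b):
    """Return the sorted vhids that are MASTER on both hosts at once."""
    return sorted(vhid for vhid, state in states_a.items()
                  if state == "MASTER" and states_b.get(vhid) == "MASTER")


# Hypothetical sample output, shaped like the failure described above:
# vhid 2 is MASTER on both hosts, vhid 3 behaves normally.
FW1 = """\
carp2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev vlan2 vhid 2 advbase 1 advskew 0
carp3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev vlan3 vhid 3 advbase 1 advskew 0
"""
FW2 = """\
carp2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev vlan2 vhid 2 advbase 1 advskew 128
carp3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: BACKUP carpdev vlan3 vhid 3 advbase 1 advskew 128
"""

print(dual_masters(carp_states(FW1), carp_states(FW2)))  # prints [2]
```

Comparing by vhid rather than by interface name keeps the check valid even if the two boxes number their carp interfaces differently.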
Odd carp behavior
I have 2 firewalls set up running OpenBSD 5.1 amd64. Each box has 4 NICs, paired off into failover trunks. I then have 4 VLANs configured on each box: 3 VLANs go over trunk0 and one goes over trunk1. I have carp set up on each box as well, with one carp interface per VLAN. On FW2, I have an advskew of 128 configured so that this box acts as the backup for all carp devices. I also have pfsync and sasyncd running.

When I first set this up, I saw some odd behavior when I booted both machines. Sometimes FW1 would come up as master for everything, sometimes both FW1 and FW2 would come up as master for everything, and sometimes it would be a mix. I noticed that the carp demote counter on each box was a different number each time I rebooted, anywhere from 0 to 126. I looked at the /etc/rc script to see where the demote counter is raised to 128 while the various network interfaces are being started. I put a 'sleep 20' right after '. /etc/netstart' in that file, and that seemed to let carpdemote consistently come down to 0 by the time the machine finished booting. This seemed to fix my problems, or so I thought.

Today I noticed that FW1 had crashed for some reason (still investigating). FW2 became master for all carp devices, as it should. I rebooted FW1 and it came up. I checked the ifconfig status of the carp devices on FW1: one carp device was backup, while the other 3 were master. On FW2, all 4 carp devices were master. This doesn't seem correct, since two machines are now advertising master for 3 of the 4 carp devices. I also thought I had it set up so that one box would be carp master for all or none of the carp groups, not a mix. I have net.inet.carp.preempt=1 set in sysctl.

Basically, I need to figure out why carp is not behaving correctly, or at least not according to my understanding of correct.
I'm happy to post any configs required; however, I am currently not at a machine that can access the systems in question, which is why they aren't included in this email.
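
As a worked illustration of why an advskew of 128 is expected to make FW2 the backup: advskew is measured in 1/256 of a second and is added to the base advertisement interval (advbase, in seconds), and with preempt enabled the host that advertises more frequently is the one that should become master. The sketch below is in Python and assumes the default advbase of 1 on both boxes and an advskew of 0 on FW1, neither of which is stated in the message above.

```python
def adv_interval(advbase=1, advskew=0):
    """Seconds between CARP advertisements: advbase + advskew/256."""
    return advbase + advskew / 256


# Assumed values: default advbase of 1 on both hosts, advskew 0 on FW1.
fw1 = adv_interval(advskew=0)
fw2 = adv_interval(advskew=128)

print(fw1, fw2)   # prints 1.0 1.5
print(fw1 < fw2)  # prints True: FW1 advertises more often than FW2
```

On these numbers FW1 should always win the election, so the arithmetic alone does not account for both hosts being master at once; that symptom more likely points to a host not seeing or not accepting its peer's advertisements, or to the demote counter differing from what is expected.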
Re: Odd PMTU issue on ipsec tunnel
Matthias,

I'm not sure if you got an answer to your question, but I have found a workaround. I set up an IPIP tunnel using a gif interface on both sides of the IPsec VPN. The IPsec VPN now runs between the two public IPs of the boxes, with only a point-to-point flow defined. The gif tunnel is then set up to run over the IPsec tunnel. Appropriate routes are configured for the private subnets behind each side of the VPN.

While adding another layer of encapsulation is not ideal, you can set the MTU on gif interfaces. This allowed me to set an appropriate MTU, and PMTU discovery now works for all of my machines behind the VPN. The correct value is reported in the next-hop field.

I hope this makes sense. I think the original issue is likely a bug, and this is the only way I've seen to get around it, at least temporarily.

On Sun, May 13, 2012 at 3:48 AM, Matthias Vey matthias@hm.edu wrote:

Hi, does nobody have an idea? I have the same problem. Currently I set the MTU of the internal networks to 1200. It's a workaround, but it wastes a lot of bandwidth. Without it, the MTU of the VPN traffic falls to something around 550, and that's really bad :-(

Thanks
Matthias Vey

On 11.05.2012 at 23:06, Carlos Flor jac...@cybershroud.net wrote:

I have an OpenBSD 5.1-release box configured with an IPsec VPN to another identical OpenBSD machine. I am trying to test PMTU discovery by sending packets, both TCP and UDP, with the DF bit set. I get an ICMP Unreachable - Fragmentation needed packet as expected; however, the Next-Hop MTU field is set to 0. The RFC says this should never be below 68. I am wondering if the issue is related to the fact that you can no longer set an MTU on enc0 (the IPsec tunnel interface).

My first question is: why am I getting 0 as the next-hop MTU? Secondly, why can I no longer set an MTU for my enc0 interface? (When I try with ifconfig, I get: SIOCSIFMTU: Inappropriate ioctl for device.)

Thanks.
Odd PMTU issue on ipsec tunnel
I have an OpenBSD 5.1-release box configured with an IPsec VPN to another identical OpenBSD machine. I am trying to test PMTU discovery by sending packets, both TCP and UDP, with the DF bit set. I get an ICMP Unreachable - Fragmentation needed packet as expected; however, the Next-Hop MTU field is set to 0. The RFC says this should never be below 68. I am wondering if the issue is related to the fact that you can no longer set an MTU on enc0 (the IPsec tunnel interface).

My first question is: why am I getting 0 as the next-hop MTU? Secondly, why can I no longer set an MTU for my enc0 interface? (When I try with ifconfig, I get: SIOCSIFMTU: Inappropriate ioctl for device.)

Thanks.
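
For readers who want to check the same field in a packet capture, here is a minimal Python sketch of reading the Next-Hop MTU out of an ICMP "fragmentation needed" message (type 3, code 4). In the RFC 1191 layout the 8-byte ICMP header is type, code, checksum, an unused 16-bit field, and then the 16-bit Next-Hop MTU. The function names and the sample value of 1400 are hypothetical, and the hand-built headers leave the checksum at zero, so they are for parsing only.

```python
import struct

MIN_IPV4_MTU = 68  # the floor cited in the message above


def next_hop_mtu(icmp_bytes):
    """Return the Next-Hop MTU from an ICMP type 3, code 4 message.

    Raises ValueError for a short buffer or any other ICMP type/code.
    """
    if len(icmp_bytes) < 8:
        raise ValueError("ICMP header is 8 bytes; got %d" % len(icmp_bytes))
    # Network byte order: type, code, checksum, unused, next-hop MTU.
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_bytes[:8])
    if (icmp_type, code) != (3, 4):
        raise ValueError("not a 'fragmentation needed' message")
    return mtu


def is_usable(mtu):
    """True if the reported MTU is at or above the 68-byte floor."""
    return mtu >= MIN_IPV4_MTU


# Hand-built headers (checksum zeroed, so not valid on the wire).
reported_zero = struct.pack("!BBHHH", 3, 4, 0, 0, 0)     # the symptom above
reported_ok = struct.pack("!BBHHH", 3, 4, 0, 0, 1400)    # hypothetical value

print(next_hop_mtu(reported_zero), is_usable(next_hop_mtu(reported_zero)))  # prints 0 False
print(next_hop_mtu(reported_ok), is_usable(next_hop_mtu(reported_ok)))      # prints 1400 True
```

A value of 0 in that field is also what a sender following the older RFC 792 message format produces, since those two bytes were unused before RFC 1191, so a host receiving it has to estimate the path MTU itself.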