On Sun, May 31, 2009 at 2:16 PM, (private) HKS <[email protected]> wrote:
> On Sun, May 31, 2009 at 1:58 PM, (private) HKS <[email protected]> wrote:
>> I have two networks: an office and a datacenter. The office has a
>> single router (dmesg below) that I upgraded to 4.5 today. The
>> datacenter has two routers running 4.4. The datacenter routers share a
>> CARP address. The locations communicate over a gif tunnel protected by
>> IPsec.
>>
>> After upgrading to 4.5 today, connections made across this tunnel are
>> dropped after about 30 seconds.
>>
>> For instance, I ssh into my datacenter backup server from my
>> workstation. A state is created, traffic passes normally - until about
>> 30 seconds later when the state is terminated. This does not happen
>> for traffic passed out to the net outside this tunnel.
>>
>> The only weirdness I've been able to quantify is the state that is created:
>>
>> # pfctl -vvs state | grep -A 2 <workstation> | grep -A 2 <server>
>> all tcp <server>:22 <- <workstation>:2733 ESTABLISHED:ESTABLISHED
>> [1948621377 + 65119] [2814490494 + 17520]
>> age 00:00:27, expires in 23:59:43, 76:93 pkts, 5756:11189 bytes, rule 25
>> all tcp <workstation>:2733 -> <server>:22 SYN_SENT:CLOSED
>> [2814490494 + 4294964697] [0 + 65535]
>> age 00:00:27, expires in 00:00:03, 76:0 pkts, 5756:0 bytes, rule 203
>>
>> Once that SYN_SENT:CLOSED state's expiration counter reaches zero, my
>> newly upgraded firewall starts blocking traffic from my workstation to
>> the server.
>>
>> When pf debugging is set to misc, I get the following sort of message
>> in my syslog (these were pulled from two different examples - the
>> ports do match when it happens):
>>
>> May 31 12:05:47 <router> /bsd: pf: loose state match: TCP out wire:
>> <server>:22 <workstation>:2105 stack: - [lo=1243591892 high=1243591894
>> win=65535 modulator=0] [lo=0 high=65535 win=1 modulator=0] 2:0 PA
>> seq=1243591893 (1243591893) ack=0 len=28 ackskew=0 pkts=2:0
>> dir=out,fwd
>>
>> I'm at a loss. My pf.conf is pretty huge, so I inserted a "pass quick
>> from <workstation> to <server>" at the top above my "block log"
>> policy. Same thing.
>>
>> I'm not sure what else is even needed to troubleshoot this. Can anyone
>> give me some ideas?
>>
>> -HKS
>>
>>
>> OpenBSD 4.5 (GENERIC) #1749: Sat Feb 28 14:51:18 MST 2009
>> [email protected]:/usr/src/sys/arch/i386/compile/GENERIC
>> cpu0: Intel(R) Xeon(TM) CPU 2.80GHz ("GenuineIntel" 686-class) 2.80 GHz
>> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,CNXT-ID,CX16,xTPR
>> real mem = 2146795520 (2047MB)
>> avail mem = 2067582976 (1971MB)
>> mainbus0 at root
>> bios0 at mainbus0: AT/286+ BIOS, date 04/25/08, BIOS32 rev. 0 @
>> 0xffe90, SMBIOS rev. 2.3 @ 0xf9920 (87 entries)
>> bios0: vendor Dell Computer Corporation version "A07" date 04/25/2008
>> bios0: Dell Computer Corporation PowerEdge 2850
>> acpi0 at bios0: rev 0
>> acpi0: tables DSDT FACP APIC SPCR HPET MCFG
>> acpi0: wakeup devices PCI0(S5) PALO(S5) PBLO(S5) VPR0(S5) PBHI(S5)
>> VPR1(S5) PICH(S5)
>> acpitimer0 at acpi0: 3579545 Hz, 24 bits
>> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
>> cpu0 at mainbus0: apid 0 (boot processor)
>> cpu0: apic clock running at 199MHz
>> cpu at mainbus0: not configured
>> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
>> ioapic0: misconfigured as apic 0, remapped to apid 2
>> ioapic1 at mainbus0: apid 3 pa 0xfec80000, version 20, 24 pins
>> ioapic1: misconfigured as apic 0, remapped to apid 3
>> ioapic2 at mainbus0: apid 4 pa 0xfec83000, version 20, 24 pins
>> ioapic2: misconfigured as apic 0, remapped to apid 4
>> ioapic3 at mainbus0: apid 5 pa 0xfec84000, version 20, 24 pins
>> ioapic3: misconfigured as apic 0, remapped to apid 5
>> acpihpet0 at acpi0: 14318179 Hz
>> acpiprt0 at acpi0: bus 0 (PCI0)
>> acpiprt1 at acpi0: bus 1 (PALO)
>> acpiprt2 at acpi0: bus 2 (DOBA)
>> acpiprt3 at acpi0: bus 3 (DOBB)
>> acpiprt4 at acpi0: bus 4 (PBLO)
>> acpiprt5 at acpi0: bus 5 (PBHI)
>> acpiprt6 at acpi0: bus 6 (PXB1)
>> acpiprt7 at acpi0: bus 7 (PXB2)
>> acpiprt8 at acpi0: bus 8 (VPR1)
>> acpiprt9 at acpi0: bus 9 (PXC1)
>> acpiprt10 at acpi0: bus 10 (PXC2)
>> acpiprt11 at acpi0: bus 11 (PICH)
>> acpicpu0 at acpi0
>> bios0: ROM list: 0xc0000/0xb000! 0xcb000/0x1000 0xcc000/0x1000
>> 0xcd000/0x2200 0xec000/0x4000!
>> ipmi at mainbus0 not configured
>> pci0 at mainbus0 bus 0: configuration mode 1 (bios)
>> pchb0 at pci0 dev 0 function 0 "Intel E7520 Host" rev 0x09
>> ppb0 at pci0 dev 2 function 0 "Intel E7520 PCIE" rev 0x09
>> pci1 at ppb0 bus 1
>> ppb1 at pci1 dev 0 function 0 "Intel IOP332 PCIE-PCIX" rev 0x06
>> pci2 at ppb1 bus 2
>> ami0 at pci2 dev 14 function 0 "Dell PERC 4e/Di" rev 0x06: apic 3 int 14 (irq 7)
>> ami0: Dell 16d, 32b, FW 513O, BIOS vH418, 256MB RAM
>> ami0: 2 channels, 0 FC loops, 1 logical drives
>> scsibus0 at ami0: 40 targets
>> sd0 at scsibus0 targ 0 lun 0: <AMI, Host drive #00, > SCSI2 0/direct fixed
>> sd0: 139900MB, 512 bytes/sec, 286515200 sec total
>> scsibus1 at ami0: 16 targets
>> safte0 at scsibus1 targ 6 lun 0: <PE/PV, 1x6 SCSI BP, 1.0> SCSI2
>> 3/processor fixed
>> scsibus2 at ami0: 16 targets
>> ppb2 at pci1 dev 0 function 2 "Intel IOP332 PCIE-PCIX" rev 0x06
>> pci3 at ppb2 bus 3
>> ppb3 at pci0 dev 4 function 0 "Intel E7520 PCIE" rev 0x09
>> pci4 at ppb3 bus 4
>> ppb4 at pci0 dev 5 function 0 "Intel E7520 PCIE" rev 0x09
>> pci5 at ppb4 bus 5
>> ppb5 at pci5 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
>> pci6 at ppb5 bus 6
>> em0 at pci6 dev 7 function 0 "Intel PRO/1000MT (82541GI)" rev 0x05:
>> apic 4 int 0 (irq 11), address 00:11:43:d9:17:36
>> ppb6 at pci5 dev 0 function 2 "Intel PCIE-PCIE" rev 0x09
>> pci7 at ppb6 bus 7
>> em1 at pci7 dev 8 function 0 "Intel PRO/1000MT (82541GI)" rev 0x05:
>> apic 4 int 1 (irq 3), address 00:11:43:d9:17:37
>> ppb7 at pci0 dev 6 function 0 "Intel E7520 PCIE" rev 0x09
>> pci8 at ppb7 bus 8
>> ppb8 at pci8 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
>> pci9 at ppb8 bus 9
>> ppb9 at pci8 dev 0 function 2 "Intel PCIE-PCIE" rev 0x09
>> pci10 at ppb9 bus 10
>> em2 at pci10 dev 2 function 0 "Intel PRO/1000MT (82546GB)" rev 0x03:
>> apic 5 int 0 (irq 11), address 00:04:23:ad:04:04
>> em3 at pci10 dev 2 function 1 "Intel PRO/1000MT (82546GB)" rev 0x03:
>> apic 5 int 1 (irq 3), address 00:04:23:ad:04:05
>> uhci0 at pci0 dev 29 function 0 "Intel 82801EB/ER USB" rev 0x02: apic
>> 2 int 16 (irq 11)
>> uhci1 at pci0 dev 29 function 1 "Intel 82801EB/ER USB" rev 0x02: apic
>> 2 int 19 (irq 10)
>> uhci2 at pci0 dev 29 function 2 "Intel 82801EB/ER USB" rev 0x02: apic
>> 2 int 18 (irq 7)
>> ehci0 at pci0 dev 29 function 7 "Intel 82801EB/ER USB2" rev 0x02: apic
>> 2 int 23 (irq 5)
>> usb0 at ehci0: USB revision 2.0
>> uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
>> ppb10 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xc2
>> pci11 at ppb10 bus 11
>> vga1 at pci11 dev 13 function 0 "ATI Radeon VE" rev 0x00
>> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
>> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
>> radeondrm0 at vga1: apic 2 int 18 (irq 7)
>> drm0 at radeondrm0
>> ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
>> pciide0 at pci0 dev 31 function 1 "Intel 82801EB/ER IDE" rev 0x02:
>> DMA, channel 0 configured to compatibility, channel 1 configured to
>> compatibility
>> atapiscsi0 at pciide0 channel 0 drive 0
>> scsibus3 at atapiscsi0: 2 targets
>> cd0 at scsibus3 targ 0 lun 0: <HL-DT-ST, DVD-ROM GDR8082N, 0106> ATAPI
>> 5/cdrom removable
>> cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
>> pciide0: channel 1 disabled (no drives)
>> usb1 at uhci0: USB revision 1.0
>> uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>> usb2 at uhci1: USB revision 1.0
>> uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>> usb3 at uhci2: USB revision 1.0
>> uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>> isa0 at ichpcib0
>> isadma0 at isa0
>> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
>> pckbc0 at isa0 port 0x60/5
>> pckbd0 at pckbc0 (kbd slot)
>> pckbc0: using irq 1 for kbd slot
>> wskbd0 at pckbd0: console keyboard, using wsdisplay0
>> pcppi0 at isa0 port 0x61
>> midi0 at pcppi0: <PC speaker>
>> spkr0 at pcppi0
>> npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
>> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
>> mtrr: Pentium Pro MTRR support
>> uhub4 at uhub0 port 3 "Dell product 0xa001" rev 2.00/0.00 addr 2
>> uhidev0 at uhub4 port 1 configuration 1 interface 0 "Dell Dell USB
>> Keyboard" rev 1.10/3.01 addr 3
>> uhidev0: iclass 3/1
>> ukbd0 at uhidev0: 8 modifier keys, 6 key codes
>> wskbd1 at ukbd0 mux 1
>> wskbd1: connecting to wsdisplay0
>> softraid0 at root
>> root on sd0a swap on sd0b dump on sd0b
>>
>
>
> I've temporarily gotten around this by adding "keep state (sloppy)" to
> the end of the rule permitting the traffic. It still has me worried as
> hell, though.
>
> -HKS
>
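
For reference, the workaround quoted above would look roughly like the rule below. This is only a sketch: the actual rule is not shown in this thread, so the test rule quoted earlier is reused here, and <workstation> and <server> are placeholder tables standing in for the real addresses.

```
# Sketch only: <workstation> and <server> are placeholder tables.
# "sloppy" makes pf track this TCP state without checking sequence
# numbers, which is what lets the connection survive here.
pass quick from <workstation> to <server> keep state (sloppy)
```

Per pf.conf(5), the sloppy tracker is meant for cases where the firewall does not see every packet of a connection (asymmetric routing, for example), and it makes insertion attacks easier. It hides the symptom rather than explaining it.
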
Turns out this was not an IPsec issue (at least, I don't think so). It
seems to be an issue with the way pf handles state evaluation for
gre-encapsulated packets.

The gif interfaces were built between the datacenter's carp address and
the office router's external address. With this configuration, the gre
packets were flowing out from the office router to the carp address,
then back out from the datacenter router's physical interface's IP to
the office router. When I rebuilt the gif interfaces to point between
physical interfaces, the issue cleared up.

Is this expected behavior? The gre packets were never dropped; it was
only the state riding on top of them that failed after it was initially
built. To my uneducated eye, it seems like a bug.

-HKS
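
The rebuild described above can be sketched as follows for the office router. Every address here is a made-up documentation placeholder (the real ones are not given in this thread), and the inner 10.255.0.x addresses are likewise assumed for illustration.

```sh
# Sketch only; all addresses are hypothetical placeholders.
# 192.0.2.10   = office router external address
# 198.51.100.1 = datacenter carp address (the failing outer destination)
# 198.51.100.2 = datacenter router physical address (the working one)

ifconfig gif0 create

# Outer endpoints: physical address to physical address, not the carp address.
ifconfig gif0 tunnel 192.0.2.10 198.51.100.2

# Inner point-to-point addresses carried inside the tunnel.
ifconfig gif0 inet 10.255.0.1 10.255.0.2 netmask 255.255.255.255
```

The only difference from the failing setup is the outer destination. With the carp address as destination, replies left the datacenter router from its physical address, so the two directions of the encapsulated flow did not use the same address pair.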

