Re: pf/queue questions
From: Daniel Melameth dan...@melameth.com
Subject: Re: pf/queue questions

On Tue, Sep 23, 2014 at 9:39 AM, Dewey Hylton dewey.hyl...@gmail.com wrote:
> i have a site-to-site vpn setup across a 40Mbps wan link (average ~30ms latency). one of its uses is for san replication, but of course management traffic (ssh sessions, etc.) has to cross the link as well. without using queues, at times the replication traffic is such that management traffic suffers to the point of being unusable.
>
> so i set up queues, which fixed the management problem. but despite the management bandwidth requirements being minimal, the san replication traffic was then seen to plateau well below where i believe it should have been. one specific thing i'm seeing with this particular configuration is that san replication traffic tops out at 24Mbps, as seen on the wan circuit itself (outside of openbsd). removing the queues results in 100% wan utilization, even up to 100Mbps when the circuit is temporarily reconfigured to allow it.

It's not clear to me in which direction or on what interface the SAN traffic is, but your 20Mb queue on $INETIF might be limiting your maximum throughput. That said, you might also want to consider configuring qlimit; you can tune it based on QLEN in systat queues. Lastly, I recall henning@ saying queuing on VLANs is mostly useless, so you only want to apply altq to physical interfaces.

daniel, thanks for your input. after going back and reading henning's comments regarding queuing on vlans, i moved the queue definition to the physical interface and things are now working as expected.
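For anyone hitting the same plateau: the fix described above amounts to attaching the altq definition to the vlan's parent interface rather than to the vlan(4) pseudo-interface itself. A minimal sketch of what that looks like, assuming the vlans are configured on top of em1 (the actual parent interface is not named in the thread, and the 35Mb figure is taken from the poster's config):

```
# altq moved from the vlan(4) interfaces to the physical parent.
# em1 as the parent interface is an assumption for illustration only.
altq on em1 cbq bandwidth 35Mb queue { ssh, sansync, std }
queue sansync bandwidth 35% priority 1 cbq(borrow ecn)
queue std     bandwidth 50% priority 2 cbq(default borrow)
queue ssh     bandwidth 15% priority 3 cbq(borrow ecn)
```

The existing `queue ...` assignments in the filter rules keep working unchanged; only the `altq on` line needs to move, since ALTQ shapes packets at the driver's transmit queue and vlan(4) interfaces hand their frames to the parent anyway.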
pf/queue questions
i have a site-to-site vpn setup across a 40Mbps wan link (average ~30ms latency). one of its uses is for san replication, but of course management traffic (ssh sessions, etc.) has to cross the link as well. without using queues, at times the replication traffic is such that management traffic suffers to the point of being unusable.

so i set up queues, which fixed the management problem. but despite the management bandwidth requirements being minimal, the san replication traffic was then seen to plateau well below where i believe it should have been. one specific thing i'm seeing with this particular configuration is that san replication traffic tops out at 24Mbps, as seen on the wan circuit itself (outside of openbsd). removing the queues results in 100% wan utilization, even up to 100Mbps when the circuit is temporarily reconfigured to allow it.

i have to assume that i've misunderstood the documentation and am looking for some help. i'll paste the pf.conf below, followed by dmesg. we have a fairly complex network on each end of the vpn, and we do in fact need the nat that you will see, though not for reasons related to the san replication traffic. i have no doubt that i've done things incorrectly even in areas seemingly unrelated to san replication, so feel free to fire away ...
pf.conf:
========

##
# macros
LANIF = em0
WANIF = em1
PFSYNC = em2
INETIF = vlan2
TWP2PIF = vlan3

table <vpnendpoints> persist { $PUBLIC1 $PUBLIC2 $REMOTEPUB3 $REMOTEPUB4 172.30.255.240/28 }
table <carpmembers> persist { $PUBLIC1 $PUBLIC2 172.28.0.251 172.28.0.252 }
table <trustednets> persist { 10.200.0.0/16 192.168.0.0/16 172.16.0.0/12 }
table <recoverpoint> persist { 10.200.80.0/24 172.28.0.247 172.28.0.248 10.200.72.0/24 172.28.0.10 172.28.0.11 172.28.0.12 172.28.2.0/24 }

##
# queues
altq on $INETIF cbq bandwidth 20Mb queue { ssh, sansync, std }
altq on $TWP2PIF cbq bandwidth 35Mb queue { ssh, sansync, std }
queue sansync bandwidth 35% priority 1 cbq(borrow ecn)
queue std bandwidth 50% priority 2 cbq(default borrow)
queue ssh bandwidth 15% priority 3 cbq(borrow ecn)

##
# options
set skip on lo
set skip on enc0
set skip on gif
set skip on $PFSYNC
set block-policy return
set loginterface $WANIF

##
# ftp proxy
anchor "ftp-proxy/*"
pass in quick on $LANIF inet proto tcp to any port ftp \
        divert-to 127.0.0.1 port 8021

##
# match rules
match in from <trustednets> scrub (no-df random-id max-mss 1200)

##
# filter rules
block in log
pass out
pass out proto tcp all modulate state

# site-to-site vpn
pass in quick log proto esp from <vpnendpoints>
pass in quick log proto udp from <vpnendpoints> port isakmp

antispoof quick for { lo $LANIF }
pass in quick proto carp from any to any
pass in quick inet proto icmp from any to any icmp-type { echoreq echorep timex unreach }
pass in quick on $LANIF to <recoverpoint> queue sansync label sansync
pass in quick on $LANIF proto tcp to port { ssh } queue ssh label ssh
pass in quick on $LANIF proto tcp to port { 3389 } queue ssh label rdp
pass in log on $LANIF queue std label std

dmesg:
======
OpenBSD 5.4 (GENERIC.MP) #41: Tue Jul 30 15:30:02 MDT 2013
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8403169280 (8013MB)
avail mem = 8171749376 (7793MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xb7fcb000 (82 entries)
bios0: vendor HP version P80 date 11/08/2013
bios0: HP ProLiant DL320e Gen8 v2
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP SPCR MCFG HPET SPMI ERST APIC BERT HEST DMAR SSDT SSDT SSDT SSDT SSDT
acpi0: wakeup devices PCI0(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimcfg0 at acpi0 addr 0xb800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz, 3492.44 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
cpu0: apic clock running at 99MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz, 3491.92 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1
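The qlimit suggestion raised in the reply would attach to the per-queue definitions above. A sketch, where the value 500 is purely illustrative (not a recommendation from the thread); the idea is to raise the per-queue packet limit above the ALTQ default of 50 and then tune it while watching the QLEN column in systat queues:

```
# qlimit raises the maximum number of packets held in the queue
# (ALTQ default is 50). 500 is an illustrative starting point only;
# adjust based on QLEN as shown in systat queues.
queue sansync bandwidth 35% priority 1 qlimit 500 cbq(borrow ecn)
```

A too-small qlimit drops packets under burst, which stalls TCP and can produce exactly the kind of throughput plateau described in the original post, so it is worth checking alongside the vlan/physical-interface issue.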