Re: pf/queue questions

2014-09-24 Thread Dewey Hylton
 From: Daniel Melameth dan...@melameth.com
 Subject: Re: pf/queue questions
 
 On Tue, Sep 23, 2014 at 9:39 AM, Dewey Hylton dewey.hyl...@gmail.com wrote:
  i have a site-to-site vpn setup across a 40Mbps wan link (average ~30ms
  latency). one of its uses is for san replication, but of course management
  traffic (ssh sessions, etc.) has to cross the link as well. without using
  queues, at times the replication traffic is such that management traffic
  suffers to the point of being unusable.
 
  so i set up queues, which fixed the management problem. but despite the
  management bandwidth requirements being minimal, the san replication
  traffic was then seen to plateau well below where i believe it should have
  been.
 
  one specific thing i'm seeing with this particular configuration is that
  san replication traffic tops out at 24Mbps, as seen on the wan circuit
  itself (outside of openbsd). removing the queues results in 100% wan
  utilization, even up to 100Mbps when the circuit is temporarily
  reconfigured to allow it.
 
 It's not clear to me in which direction or on what interface the SAN
 traffic flows, but your 20Mb queue on $INETIF might be limiting your
 maximum throughput.  That said, you might also want to consider
 configuring qlimit; you can tune it based on the QLEN column in systat
 queues.  Lastly, I recall henning@ saying queuing on VLANs is mostly
 useless, so you only want to apply altq to physical interfaces.

daniel, thanks for your input. after going back and reading henning's comments 
regarding queuing on vlans, i moved the queue definition to the physical 
interface and things are now working as expected.
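
for anyone finding this thread later, a minimal sketch of that change,
assuming the vlans ride on the physical $WANIF (em1) as in the posted
macros; the interface and the 35Mb cap are assumptions to adjust for your
own circuit:

```pf
# before: altq attached to the vlan interfaces (largely ineffective)
#altq on $INETIF  cbq bandwidth 20Mb queue { ssh, sansync, std }
#altq on $TWP2PIF cbq bandwidth 35Mb queue { ssh, sansync, std }

# after: altq attached to the physical parent interface instead
altq on $WANIF cbq bandwidth 35Mb queue { ssh, sansync, std }

queue sansync bandwidth 35% priority 1 cbq(borrow ecn)
queue std     bandwidth 50% priority 2 cbq(default borrow)
queue ssh     bandwidth 15% priority 3 cbq(borrow ecn)
```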



pf/queue questions

2014-09-23 Thread Dewey Hylton
i have a site-to-site vpn setup across a 40Mbps wan link (average ~30ms
latency). one of its uses is for san replication, but of course management
traffic (ssh sessions, etc.) has to cross the link as well. without using
queues, at times the replication traffic is such that management traffic
suffers to the point of being unusable.

so i set up queues, which fixed the management problem. but despite the
management bandwidth requirements being minimal, the san replication
traffic was then seen to plateau well below where i believe it should have
been.

one specific thing i'm seeing with this particular configuration is that
san replication traffic tops out at 24Mbps, as seen on the wan circuit
itself (outside of openbsd). removing the queues results in 100% wan
utilization, even up to 100Mbps when the circuit is temporarily
reconfigured to allow it.

i have to assume that i've misunderstood the documentation and am looking
for some help. i'll paste the pf.conf below, followed by dmesg. we have a
fairly complex network on each end of the vpn, and we do in fact need the
nat that you will see, though not for reasons related to the san replication
traffic. i have no doubt that i've done things incorrectly even in areas
seemingly unrelated to san replication, so feel free to fire away ...


pf.conf:
===
##
# macros

LANIF   = em0
WANIF   = em1
PFSYNC  = em2
INETIF  = vlan2
TWP2PIF = vlan3

table <vpnendpoints> persist { $PUBLIC1 $PUBLIC2 $REMOTEPUB3 $REMOTEPUB4 172.30.255.240/28 }
table <carpmembers> persist { $PUBLIC1 $PUBLIC2 172.28.0.251 172.28.0.252 }
table <trustednets> persist { 10.200.0.0/16 192.168.0.0/16 172.16.0.0/12 }
table <recoverpoint> persist { 10.200.80.0/24 172.28.0.247 172.28.0.248 10.200.72.0/24 172.28.0.10 172.28.0.11 172.28.0.12 172.28.2.0/24 }

##
# queues

altq on $INETIF  cbq bandwidth 20Mb queue { ssh, sansync, std }
altq on $TWP2PIF cbq bandwidth 35Mb queue { ssh, sansync, std }

queue sansync   bandwidth 35%   priority 1  cbq(borrow ecn)
queue std   bandwidth 50%   priority 2  cbq(default borrow)
queue ssh   bandwidth 15%   priority 3  cbq(borrow ecn)

##
# options

set skip on lo
set skip on enc0
set skip on gif
set skip on $PFSYNC
set block-policy return
set loginterface $WANIF


##
# ftp proxy

anchor ftp-proxy/*
pass in quick on $LANIF inet proto tcp to any port ftp \
divert-to 127.0.0.1 port 8021


##
# match rules

match in from <trustednets> scrub (no-df random-id max-mss 1200)


##
# filter rules

block in log
pass out
pass out proto tcp all modulate state

# site-to-site vpn
pass in quick log proto esp from <vpnendpoints>
pass in quick log proto udp from <vpnendpoints> port isakmp

antispoof quick for { lo $LANIF }

pass in quick proto carp from any to any
pass in quick inet proto icmp from any to any icmp-type { echoreq echorep timex unreach }

pass in quick on $LANIF to <recoverpoint> queue sansync label sansync
pass in quick on $LANIF proto tcp to port { ssh } queue ssh label ssh

pass in quick on $LANIF proto tcp to port { 3389 } queue ssh label rdp

pass in log on $LANIF queue std label std
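
for reference, the queues defined above can be watched while replication
runs using the stock tooling; this is a usage sketch only, nothing specific
to this config:

```
# per-queue counters: pkts/bytes sent, borrows, delays and drops
pfctl -vvsq

# live per-queue view; the QLEN column shows the current backlog
systat queues
```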




dmesg:
=
OpenBSD 5.4 (GENERIC.MP) #41: Tue Jul 30 15:30:02 MDT 2013
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8403169280 (8013MB)
avail mem = 8171749376 (7793MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xb7fcb000 (82 entries)
bios0: vendor HP version P80 date 11/08/2013
bios0: HP ProLiant DL320e Gen8 v2
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP SPCR MCFG HPET SPMI ERST APIC BERT HEST DMAR SSDT SSDT SSDT SSDT SSDT
acpi0: wakeup devices PCI0(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimcfg0 at acpi0 addr 0xb800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz, 3492.44 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
cpu0: apic clock running at 99MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz, 3491.92 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1

Re: pf/queue questions

2014-09-23 Thread Daniel Melameth
On Tue, Sep 23, 2014 at 9:39 AM, Dewey Hylton dewey.hyl...@gmail.com wrote:
 i have a site-to-site vpn setup across a 40Mbps wan link (average ~30ms
 latency). one of its uses is for san replication, but of course management
 traffic (ssh sessions, etc.) has to cross the link as well. without using
 queues, at times the replication traffic is such that management traffic
 suffers to the point of being unusable.

 so i set up queues, which fixed the management problem. but despite the
 management bandwidth requirements being minimal, the san replication
 traffic was then seen to plateau well below where i believe it should have
 been.

 one specific thing i'm seeing with this particular configuration is that
 san replication traffic tops out at 24Mbps, as seen on the wan circuit
 itself (outside of openbsd). removing the queues results in 100% wan
 utilization, even up to 100Mbps when the circuit is temporarily
 reconfigured to allow it.

It's not clear to me in which direction or on what interface the SAN
traffic flows, but your 20Mb queue on $INETIF might be limiting your
maximum throughput.  That said, you might also want to consider
configuring qlimit; you can tune it based on the QLEN column in systat
queues.  Lastly, I recall henning@ saying queuing on VLANs is mostly
useless, so you only want to apply altq to physical interfaces.
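
a sketch of the qlimit suggestion, applied to the child queues from the
posted config; the qlimit numbers are placeholders to illustrate the knob,
not tested recommendations (the default is 50 packets):

```pf
# same child queues, with explicit per-queue limits; a larger qlimit on
# sansync gives the bulk replication flow more buffer before tail drops
queue sansync bandwidth 35% priority 1 qlimit 200 cbq(borrow ecn)
queue std     bandwidth 50% priority 2 qlimit 50  cbq(default borrow)
queue ssh     bandwidth 15% priority 3 qlimit 50  cbq(borrow ecn)
```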
