> > We are not tuning for fragmentation, nor are we setting mtu on
> > the endpoint.
>
> Doing that might be worth a try, i.e. try to avoid sending UDP packets
> that require extra kernel work (fragmentation), seeing as openvpn can
> handle that itself.
We messed around with the MTU inside OpenVPN and it didn't make a difference.
I will have to look at it again.
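If/when we revisit it, I assume the knobs involved are roughly these in the
tunnel configs on both ends (placeholder values, not something we have tested):

  # let OpenVPN fragment internally instead of handing oversized
  # UDP datagrams to the kernel (must match on both ends)
  tun-mtu 1500
  fragment 1300
  mssfix 1300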
> I was really after absolute numbers from the counters if any are
> non-zero, not rate.
> cat pfctl_si.20140721
Status: Enabled for 4 days 16:08:32             Debug: err
State Table              Total      Rate
  current entries         4022
  searches          1562687138  3870.8/s
  inserts              6895279    17.1/s
  removals             6892292    17.1/s
Counters
  match                 7908562    19.6/s
  bad-offset                  0     0.0/s
  fragment                  477     0.0/s
  short                      28     0.0/s
  normalize                 588     0.0/s
  memory                      0     0.0/s
  bad-timestamp               0     0.0/s
  congestion                  0     0.0/s
  ip-option                6616     0.0/s
  proto-cksum                14     0.0/s
  state-mismatch           2685     0.0/s
  state-insert                0     0.0/s
  state-limit                 0     0.0/s
  src-limit                   0     0.0/s
  synproxy                    0     0.0/s
  translate                   0     0.0/s
Twenty hours later...
> sudo pfctl -si
Status: Enabled for 5 days 14:02:31             Debug: err
State Table              Total      Rate
  current entries         4532
  searches          1893576764  3924.1/s
  inserts              8626160    17.9/s
  removals             8622663    17.9/s
Counters
  match                 9881415    20.5/s
  bad-offset                  0     0.0/s
  fragment                  655     0.0/s
  short                      28     0.0/s
  normalize                 763     0.0/s
  memory                      0     0.0/s
  bad-timestamp               0     0.0/s
  congestion                  0     0.0/s
  ip-option                9440     0.0/s
  proto-cksum                14     0.0/s
  state-mismatch           3455     0.0/s
  state-insert                0     0.0/s
  state-limit                 0     0.0/s
  src-limit                   0     0.0/s
  synproxy                    0     0.0/s
  translate                   0     0.0/s
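If it helps with absolute numbers, the deltas between the two snapshots
(about 22 hours apart, going by the uptime stamps) work out to:

  fragment         655 - 477  = 178
  normalize        763 - 588  = 175
  ip-option       9440 - 6616 = 2824
  state-mismatch  3455 - 2685 = 770
  short             28 - 28   = 0
  proto-cksum       14 - 14   = 0

Everything else in the Counters block stayed at zero.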
> > > This is already a fairly large buffer though (especially as I think you
> > > mentioned 100Mb). How did you choose 1536?
> >
> > google and trial and error.
>
> Is that "1536 is the lowest value that avoid an increase in ifq.drops"
> or something else?
The default is net.inet.ip.ifq.maxlen=256.
I am going to set it back to the default and continue to monitor drops;
temporary insanity might have set in.
After several hours there has been no growth of ifq.drops from zero.
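For completeness, the check amounts to no more than this (sysctl names as
on OpenBSD), with maxlen back at 256 and drops staying at zero:

> sysctl net.inet.ip.ifq.maxlen
> sysctl net.inet.ip.ifq.drops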
>
> > > > kern.bufcachepercent=90 # kernel buffer cache memory percentage
> > >
> > > This won't help OpenVPN. Is this box also doing other things?
> >
> > This box is running IPSEC
> >
> > It's got four openvpn tunnels terminated on it.
> >
> > We are running collectd, symon, dhcpd.
> >
> > The load lives between 2 - 4.
>
> Presumably a lot of disk i/o from rrd writes then. Hmm..
> Pity symon doesn't do rrdcache yet. Are you at least using rrdcache
> for collectd?
collectd is writing to the network only.
pfstat runs out of cron.
symon/symux was installed recently to get more data on this problem.
We often have iftop running too, but it only writes to stdout.
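For what it's worth, the collectd side is roughly just the stock network
plugin pointed at a remote collector (address below is a placeholder),
with no rrdtool plugin loaded locally:

  LoadPlugin network
  <Plugin network>
          Server "192.0.2.10" "25826"
  </Plugin>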
We have softraid in use (via bio). We can't be sure without disabling
it on our node b, but that might be the cause of the high system usage
on the one core.
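One way to narrow that down without breaking the mirror might be to watch
the disks while the system time is high, e.g. (all base commands):

> bioctl softraid0
> iostat -w 1
> systat vmstat 1

If the disks sit mostly idle while one core is pegged in system time,
softraid probably isn't it.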
Here's a run of openssl engine and openssl speed to verify that AES-NI is
usable from these CPUs.
> openssl engine -c -tt
(rsax) RSAX engine support
[RSA]
[ available ]
(rdrand) Intel RDRAND engine
[RAND]
[ available ]
(dynamic) Dynamic engine loading support
[ unavailable ]
> openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 119249716 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 64 size blocks: 32027308 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 256 size blocks: 8157622 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 2048021 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 256374 aes-128-cbc's in 3.01s
OpenSSL 1.0.1c 10 May 2012
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     633885.53k   680979.31k   693804.40k   696735.38k   697746.12k
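As far as I understand it, OpenSSL 1.0.1 wires AES-NI into the EVP ciphers
directly rather than exposing it as a separate engine, which is why it does
not appear in the engine list above. A rough cross-check is to run the same
benchmark without -evp (which uses the plain C implementation) and compare:

> openssl speed aes-128-cbc
> openssl speed -elapsed -evp aes-128-cbc

If the -evp numbers come out several times higher, AES-NI is actually being
used.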
Thanks,
-dkw