> > We are not tuning for fragmentation, nor are we setting mtu on
> > the endpoint.
> 
> Doing that might be worth a try. i.e. try to avoid sending UDP packets
> that require extra kernel work (i.e. fragmentation) seeing as openvpn can
> handle that itself.

We messed around with the MTU inside OpenVPN and it didn't make a difference.
I will have to look at it again.
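
If we do look at it again, the idea would be something along these lines
in the tunnel configs -- just a sketch, and the 1400 values are guesses
to be tuned rather than measured numbers:

   # let OpenVPN fragment/reassemble internally so the kernel never has to
   tun-mtu 1500
   fragment 1400
   # clamp TCP MSS inside the tunnel to the same size
   mssfix 1400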

> I was really after absolute numbers from the counters if any are
> non-zero, not rate.        

> cat pfctl_si.20140721
Status: Enabled for 4 days 16:08:32              Debug: err

State Table                          Total             Rate
  current entries                     4022               
  searches                      1562687138         3870.8/s
  inserts                          6895279           17.1/s
  removals                         6892292           17.1/s
Counters
  match                            7908562           19.6/s
  bad-offset                             0            0.0/s
  fragment                             477            0.0/s
  short                                 28            0.0/s
  normalize                            588            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           6616            0.0/s
  proto-cksum                           14            0.0/s
  state-mismatch                      2685            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  translate                              0            0.0/s

twenty hours later...

> sudo pfctl -si       
Status: Enabled for 5 days 14:02:31              Debug: err

State Table                          Total             Rate
  current entries                     4532               
  searches                      1893576764         3924.1/s
  inserts                          8626160           17.9/s
  removals                         8622663           17.9/s
Counters
  match                            9881415           20.5/s
  bad-offset                             0            0.0/s
  fragment                             655            0.0/s
  short                                 28            0.0/s
  normalize                            763            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           9440            0.0/s
  proto-cksum                           14            0.0/s
  state-mismatch                      3455            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  translate                              0            0.0/s
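
So between the two snapshots (~22 hours apart): fragment +178,
normalize +175, state-mismatch +770, ip-option +2824; short and
proto-cksum didn't move, and the other error counters are still zero.
For what it's worth, pulling the deltas out of two saved snapshots is
just awk (second filename is made up -- save the later run the same
way as the first):

   awk 'NR==FNR { tot[$1]=$2; next }
        ($1 in tot) && $2 ~ /^[0-9]+$/ { printf "%-16s %d\n", $1, $2-tot[$1] }' \
       pfctl_si.20140721 pfctl_si.20140722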

> > > This is already a fairly large buffer though (especially as I think you
> > > mentioned 100Mb). How did you choose 1536?
> > 
> > google and trial and error.
> 
> Is that "1536 is the lowest value that avoids an increase in ifq.drops"
> or something else?

The default is net.inet.ip.ifq.maxlen=256.

I am going to set it back to the default and continue to monitor drops.
Temporary insanity might have set in.
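
For reference, the reset and the counter I'm watching are just:

   # back to the default input queue length
   sysctl net.inet.ip.ifq.maxlen=256
   # drops should stay flat if the queue is deep enough
   sysctl net.inet.ip.ifq.drops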

After several hours, ifq.drops has not grown from zero.

> 
> > > > kern.bufcachepercent=90         # kernel buffer cache memory percentage
> > > 
> > > This won't help OpenVPN. Is this box also doing other things?
> > 
> > This box is running IPSEC
> > 
> > It's got four openvpn tunnels terminated on it.
> > 
> > We are running collectd, symon, dhcpd.  
> > 
> > The load lives between 2 - 4.
> 
> Presumably a lot of disk i/o from rrd writes then. Hmm..
> Pity symon doesn't do rrdcache yet. Are you at least using rrdcache
> for collectd?

collectd is writing to network only.

pfstat is running out of cron.

symon/symux was installed recently to get more data on this problem. 

We often have iftop running too, but only writing to STDOUT.

We have softraid running, managed via bio(4)/bioctl.  We can't be sure
without disabling it on our node B, but that might be the cause of the
high system usage on the one core.
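
If we get a window to test that, the rough plan is to check the volume
with bioctl and watch the kernel side with top, and see whether the busy
core lines up with softraid/crypto work (the device name below is a guess):

   # softraid volume status
   bioctl softraid0
   # include kernel/system processes so crypto/softraid threads show up
   top -S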

Here's a run of openssl engine and openssl speed to verify that AES-NI
is usable from the CPUs.

> openssl engine -c -tt
(rsax) RSAX engine support
 [RSA]
     [ available ]
(rdrand) Intel RDRAND engine
 [RAND]
     [ available ]
(dynamic) Dynamic engine loading support
     [ unavailable ]

> openssl speed -elapsed -evp aes-128-cbc 
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 119249716 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 64 size blocks: 32027308 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 256 size blocks: 8157622 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 1024 size blocks: 2048021 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 8192 size blocks: 256374 aes-128-cbc's in 3.01s
OpenSSL 1.0.1c 10 May 2012
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) 
blowfish(idx) 
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     633885.53k   680979.31k   693804.40k   696735.38k   697746.12k
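
For comparison, running the same test without -evp exercises the plain
software AES path, so a large gap between that and the -evp numbers
above is a reasonable sanity check that the EVP path really is hitting
AES-NI:

   # software-only baseline; compare against the -evp run above
   openssl speed -elapsed aes-128-cbc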

Thanks,
-dkw
