Re: mutt and gmail

2014-10-27 Thread Darryl Wisneski
 without full debugging it is not possible to know which
 component is failing in the chain.  it's just that
 there is a lot of ongoing work on SSL in openbsd,
 so i thought i might bring it up here as the error
 message specifically mentioned SSL.
 

This is not an openbsd issue at all. 

I suffer through using mutt against gmail directly; I guess there is a
load balancer up front and the backend servers are continually rebooted or
taken in and out of service, so the client has to reconnect.
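
If it is mostly idle connections getting cut, a couple of muttrc knobs are
worth playing with; these options exist in mutt, but the values here are
only illustrative, not what I actually run:

  set timeout = 15          # wake mutt up more often so a dead connection is noticed sooner
  set imap_keepalive = 300  # poll the IMAP server to keep the session alive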



Re: CARP cluster: howto keep pf.conf in sync?

2014-07-29 Thread Darryl Wisneski
On Mon, Jul 28, 2014 at 11:21:46PM -0400, sven falempin wrote:
 On Mon, Jul 28, 2014 at 11:19 PM, Leonardo Santagostini
 lsantagost...@gmail.com wrote:
  Maybe puppet?
 
 
 where are you storing the change history ?
 

My colleague and I (ab)use mercurial to this end, then blast the
configs out with ansible.  Ansible enables you to edit and blast files
from your own workstation; you commit/push with your own ssh key.
Puppet/cfengine/salt would have worked great too, but you tend to have
to do everything on a central machine, and it's harder to have separate
users commit.

We use templates and jinja conditionals to change up the dhcpd.conf
primary/secondary, for example.
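
Roughly like this, for example (an ISC-dhcpd-style sketch; the
dhcp_primary inventory variable is made up for illustration):

  {% if inventory_hostname == dhcp_primary %}
      primary;
  {% else %}
      secondary;
  {% endif %}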



Re: Dropping UDP Packets

2014-07-22 Thread Darryl Wisneski
  We are not tuning for fragmentation, nor are we setting mtu on
  the endpoint.
 
 Doing that might be worth a try. i.e. try to avoid sending UDP packets
 that require extra kernel work (i.e. fragmentation) seeing as openvpn can
 handle that itself.

We messed around with the MTU inside OpenVPN; it didn't make a difference.
I will have to look at it again.
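
If I revisit it, the usual OpenVPN knobs would be something like this
(values illustrative, not a recommendation):

  tun-mtu 1500
  fragment 1300   # let OpenVPN do the fragmentation internally
  mssfix 1300     # clamp TCP MSS inside the tunnel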

 I was really after absolute numbers from the counters if any are
 non-zero, not rate.

 cat pfctl_si.20140721
Status: Enabled for 4 days 16:08:32  Debug: err

State Table                          Total             Rate
  current entries                     4022
  searches                      1562687138         3870.8/s
  inserts                          6895279           17.1/s
  removals                         6892292           17.1/s
Counters
  match                            7908562           19.6/s
  bad-offset                             0            0.0/s
  fragment                             477            0.0/s
  short                                 28            0.0/s
  normalize                            588            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           6616            0.0/s
  proto-cksum                           14            0.0/s
  state-mismatch                      2685            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  translate                              0            0.0/s

twenty hours later...

 sudo pfctl -si   
Status: Enabled for 5 days 14:02:31  Debug: err

State Table                          Total             Rate
  current entries                     4532
  searches                      1893576764         3924.1/s
  inserts                          8626160           17.9/s
  removals                         8622663           17.9/s
Counters
  match                            9881415           20.5/s
  bad-offset                             0            0.0/s
  fragment                             655            0.0/s
  short                                 28            0.0/s
  normalize                            763            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           9440            0.0/s
  proto-cksum                           14            0.0/s
  state-mismatch                      3455            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  translate                              0            0.0/s
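
Rather than diffing snapshots by hand, something along these lines could
log the interesting counters on a schedule (just a sketch; run as root):

  while true; do
          date
          pfctl -si | egrep 'fragment|state-mismatch|ip-option|memory|congestion'
          sleep 3600
  done >> /var/log/pfctl_si.log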

   This is already a fairly large buffer though (especially as I think you
   mentioned 100Mb). How did you choose 1536?
  
  google and trial and error.
 
 Is it that 1536 is the lowest value that avoids an increase in ifq.drops,
 or something else?

The default is net.inet.ip.ifq.maxlen=256.

I am going to set it back to the default and continue to monitor drops.
Temporary insanity might have set in.

After several hours there is no growth of ifq.drops from zero.
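
A simple way to keep timestamps to line up against the SIP dropouts might
be something like this (sketch):

  while true; do
          date
          sysctl net.inet.ip.ifq.maxlen net.inet.ip.ifq.drops
          sleep 300
  done >> /var/log/ifq.log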

 
kern.bufcachepercent=90 # kernel buffer cache memory percentage
   
   This won't help OpenVPN. Is this box also doing other things?
  
  This box is running IPSEC
  
  It's got four openvpn tunnels terminated on it.
  
  We are running collectd, symon, dhcpd.  
  
  The load lives between 2 - 4.
 
 Presumably a lot of disk i/o from rrd writes then. Hmm..
 Pity symon doesn't do rrdcache yet. Are you at least using rrdcache
 for collectd?

collectd is writing to network only.

pfstat is running out of cron.

symon/symux was installed recently to get more data on this problem. 

We often have iftop running too, but only writing to STDOUT.

We have softraid running (via bio/bioctl).  We can't be sure without
disabling it on our node b, but it might be the cause of the high system
usage on the one core.

Here's a run of openssl engine and openssl speed to verify that
AES-NI is usable from the CPUs.

 openssl engine -c -tt
(rsax) RSAX engine support
 [RSA]
 [ available ]
(rdrand) Intel RDRAND engine
 [RAND]
 [ available ]
(dynamic) Dynamic engine loading support
 [ unavailable ]

 openssl speed -elapsed -evp aes-128-cbc 
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 119249716 aes-128-cbc's in 3.01s
Doing aes-128-cbc for 3s on 64 size blocks: 32027308 
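
Another quick sanity check, independent of the speed numbers, is that the
AES feature flag shows up on the cpu lines of the dmesg from my original
post:

  dmesg | grep AES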

Re: Dropping UDP Packets

2014-07-20 Thread Darryl Wisneski
On Fri, Jul 18, 2014 at 09:36:23AM +, Stuart Henderson wrote:
 On 2014-07-17, Darryl Wisneski s...@commwebworks.com wrote:
  netstat -s -p udp | grep "dropped due to full socket"
  345197 dropped due to full socket buffers
 
 We're assuming this relates to openvpn packets but you can check which
 sockets have queues: (the sendq/recvq counters).
 
 netstat -an | grep -v ' 0  0'

Sometimes now, sockets will display a queue and netstat -s will show an
increase in the "dropped due to full socket buffers" counter.  We usually
have a SIP dropout then.  The two are not always correlated, but mostly are.

udp       8846      0  xxx.100.173.xxx.1195   *.*
udp       1448      0  xxx.100.173.xxx.1195   *.*
udp        129      0  xxx.100.173.xxx.1194   *.*
udp        177      0  xxx.100.173.xxx.1195   *.*
udp        354      0  xxx.100.173.xxx.1195   *.*
udp      21115      0  xxx.100.173.xxx.1194   *.*
udp        241      0  10.0.0.254.1195        *.*
udp       2988      0  10.0.0.254.1195        *.*
udp        193      0  xxx.100.173.xxx.1195   *.*
udp      19591      0  xxx.100.173.xxx.1195   *.*
udp        241      0  10.0.0.254.1195        *.*
udp      20043      0  xxx.100.173.xxx.1195   *.*
udp      11878      0  xxx.100.173.xxx.1195   *.*
udp        177      0  xxx.100.173.xxx.1195   *.*
udp        193      0  xxx.100.173.xxx.1195   *.*
udp        129      0  xxx.100.173.xxx.1194   *.*
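
To catch both at the same moment, a rough loop like this might help
(sketch only):

  while true; do
          date
          netstat -an -f inet | awk '/^udp/ && ($2 > 0 || $3 > 0)'
          netstat -s -p udp | grep "dropped due to full socket"
          sleep 5
  done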

 
 So if things are building up here rather than on the interface queue,
 there ought to be a reason why it's slow to drain.
 
 Are you doing queueing?

We are now doing queueing as of very recently, but the same symptoms
were occurring before, when we had no queueing.  We were at 50Mbit until
very recently as well; the extra bandwidth did not help.

 
 How is fragmentation being handled? In OpenVPN or relying on the kernel
 to do it? Or are you using small mtu anyway to avoid frags?

We are not tuning for fragmentation, nor are we setting mtu on
the endpoint.

 
 How does pfctl -si look?

 sudo  pfctl -si
Status: Enabled for 1 days 15:58:58  Debug: err

State Table                          Total             Rate
  current entries                     2694
  searches                       636512596         4422.1/s
  inserts                          2978926           20.7/s
  removals                         2977267           20.7/s
Counters
  match                            3349507           23.3/s

[snip]

everything else 0.0/s

 
  I'm not sure how to proceed on tuning as I read tuning via sysctl is
  becoming pointless.
 
 It's preferred if things can auto-tune without touching sysctl, but not
 everything is done that way.
 
  net.inet.udp.sendspace=131028   # udp send buffer
 
 This may possibly need increasing, though it is already quite large. (While
 researching this mail it seems FreeBSD doesn't have this; does anyone here
 know what they do instead?)

We have toggled net.inet.udp.sendspace and net.inet.udp.recvspace between
131028 and 262144 with no improvements.  Anything higher and we get a
hosed system...

 ifconfig 
ifconfig: socket: No buffer space available
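
When it wedges like that, it could be worth looking at mbuf/cluster usage,
since "No buffer space available" generally points at mbuf exhaustion; for
example:

  netstat -m
  sysctl kern.maxclusters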

 
  net.inet.ip.ifq.maxlen=1536
 
 Monitor net.inet.ip.ifq.drops, is there an increase?

No increases in net.inet.ip.ifq.drops over time.

 This is already a fairly large buffer though (especially as I think you
 mentioned 100Mb). How did you choose 1536?

google and trial and error.

 
  kern.bufcachepercent=90 # kernel buffer cache memory percentage
 
 This won't help OpenVPN. Is this box also doing other things?

This box is running IPSEC

It's got four openvpn tunnels terminated on it.

We are running collectd, symon, dhcpd.  

The load lives between 2 and 4.

2 users    Load 3.09 3.07 2.91                    Fri Jul 18 12:34:19 2014

  PID USER     NAME                 CPU
 8941 root     acpi0              83.79
              idle                 66.65
   22 root     openvpn             2.49
23727 root     openvpn             1.37
 5473 root     openvpn             1.27


load averages:  3.82,  3.08,  2.77    fw0.xxx.xxx 12:00:21
86 processes: 85 idle, 1 on processor
CPU0 states:  0.0% user,  0.0% nice, 86.6% system,  8.8% interrupt,  4.6% idle
CPU1 states:  0.4% user,  0.0% nice,  7.8% system,  0.2% interrupt, 91.6% idle
CPU2 states:  0.0% user,  0.0% nice,  5.2% system,  0.2% interrupt, 94.6% idle
CPU3 states:  0.6% user,  0.0% nice,  5.0% system

Dropping UDP Packets

2014-07-17 Thread Darryl Wisneski
Howdy:

I have an OpenBSD 5.5-release box running a busy UDP OpenVPN endpoint
with a 100Mbit circuit.  We tunnel SIP traffic, among other things, through
the OpenVPN tunnels.  It's set up with PF, CARP, and IPsec.  We have done
little to tune the OS.

A couple of months ago, running 5.2 (we have since upgraded to 5.5 and
have new hardware), we started to notice dropped segments of SIP calls, and
we discovered we are simultaneously dropping UDP packets on the endpoint.
We can watch the netstat counter increase as the SIP dropout occurs.

netstat -s -p udp | grep "dropped due to full socket"
345197 dropped due to full socket buffers

I'm not sure how to proceed on tuning, as I read that tuning via sysctl is
becoming pointless.  We have pf states set to:

set limit states 10
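
A quick way to compare that limit against actual usage is something like:

pfctl -sm | grep states
pfctl -si | grep 'current entries'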

grep -v ^# /etc/sysctl.conf
net.inet.ip.forwarding=1# 1=Permit forwarding (routing) of IPv4 packets
net.inet.carp.preempt=1 # 1=Enable carp(4) preemption
net.inet.carp.log=7 # log level of carp(4) info, default 2
net.inet.udp.recvspace=131028   # udp receive space
net.inet.udp.sendspace=131028   # udp send buffer
kern.bufcachepercent=90 # kernel buffer cache memory percentage
net.inet.ip.ifq.maxlen=1536

Attached are the sysctl.conf and dmesg.

OpenBSD 5.5 (GENERIC.MP) #315: Wed Mar  5 09:37:46 MST 2014
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34301018112 (32712MB)
avail mem = 33379323904 (31833MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec0f0 (38 entries)
bios0: vendor American Megatrends Inc. version 2.00 date 04/24/2014
bios0: iXsystems iX1104-813M-350
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT SSDT SSDT SSDT MCFG PRAD HPET SSDT SSDT SPMI 
DMAR EINJ ERST HEST BERT
acpi0: wakeup devices PEGP(S4) PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) 
PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) 
PXSX(S4) RP05(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, 3300.63 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, 3300.00 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, 3300.00 MHz
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, 3300.00 MHz
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0 addr 0xf800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus 2 (RP01)
acpiprt5 at acpi0: bus 4 (RP03)
acpiprt6 at acpi0: bus 5 (RP04)
acpiec0 at acpi0: Failed to read resource settings
acpicpu0 at acpi0: PSS
acpicpu1 at acpi0: PSS
acpicpu2 at acpi0: PSS
acpicpu3 at acpi0: PSS
acpipwrres0 at acpi0: PG00, resource for PEG0
acpipwrres1 at acpi0: PG01, resource for PEG1
acpipwrres2 at acpi0: PG02, resource for PEG2
acpipwrres3 at acpi0: FN00, resource for FAN0