Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Steve Johnson
Hi,

I'm having some issues with network connectivity on a system. When doing
netstat -ns, I get a lot of errors with missed PCB cache, drops due to no
socket. I have already increased the TCP and UDP send and receive space up
to 262144, but the error counters are still increasing. One of the highest
impacts is a lot of false positive alerts in our monitoring system since
this generates a lot of errors with the SNMP monitoring, which has a lot of
high rate UDP bursts.

The systems are a pair of Dell 1950s with dual Xeon 5130 2GHz CPUs and the
network cards are Intel PRO/1000 PT (82571EB). The running version is 4.8
GENERIC.MP#335 amd64. All they are doing is routing and filtering with PF
and PFSync.

Any idea what else I could tweak or modify to rectify these errors? Let me
know if there is anything else that I should include to provide additional
information.

Thanks,
Steve Johnson



Re: Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Claudio Jeker
On Mon, Mar 07, 2011 at 10:38:45AM -0500, Steve Johnson wrote:
 Hi,
 
 I'm having some issues with network connectivity on a system. When doing
 netstat -ns, I get a lot of errors with missed PCB cache, drops due to no
 socket. I have already increased the TCP and UDP send and receive space up
 to 262144, but the error counters are still increasing. One of the highest
 impacts is a lot of false positive alerts in our monitoring system since
 this generates a lot of errors with the SNMP monitoring, which has a lot of
 high rate UDP bursts.
 

What makes you believe that the PCB cache and send and receive space have
anything in common?
TCP and UDP send and receive space have absolutely no influence on the
number of missed PCB.
The missed PCB cache is not an error. It is just a counter telling when no
PCB has been found for the incoming packet based on the src/dst IP and
port. In other words new connections will always increase this counter (as
will port scans).
For UDP it is probably even more common since in most cases recvfrom() and
sendto() are used and so the first lookup will always fail and a lookup in
the list of listening PCB needs to be done.
The dropped due to no socket counter will tell you how many packets are
dropped because nothing was listening on that port. The 1-mio pesos
question is why there is no socket listening on that port.
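
To make the two counters concrete, here is a minimal sketch of the lookup order described above. It is a toy model, not kernel code: the class `PcbTable`, its fields, and the addresses and ports used are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class PcbTable:
    """Toy model of the lookup order described in the message above."""
    # Fully specified PCBs: (local addr, local port, foreign addr, foreign port)
    connected: dict = field(default_factory=dict)
    # Listening PCBs, keyed by local port only
    listening: dict = field(default_factory=dict)
    pcb_cache_missed: int = 0
    no_socket_drops: int = 0

    def deliver(self, src, sport, dst, dport):
        """Return the owner of the matching PCB, or None if the packet is dropped."""
        owner = self.connected.get((dst, dport, src, sport))
        if owner is not None:
            return owner
        # No exact match: counted as a miss, which is not an error by itself.
        self.pcb_cache_missed += 1
        owner = self.listening.get(dport)
        if owner is None:
            # Nothing is listening on that port: the packet is dropped.
            self.no_socket_drops += 1
        return owner


table = PcbTable()
table.listening[161] = "snmpd"  # hypothetical local listener
first = table.deliver("10.0.0.5", 40000, "10.0.0.1", 161)
second = table.deliver("10.0.0.5", 40001, "10.0.0.1", 9999)
```

In this model both packets count as a miss, but only the second one, sent to a port with no listener, increments the drop counter. Forwarded traffic never reaches this lookup at all.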

 The systems are a pair of Dell 1950s with dual Xeon 5130 2GHz CPUs and the
 network cards are Intel PRO/1000 PT (82571EB). The running version is 4.8
 GENERIC.MP#335 amd64. All they are doing is routing and filtering with PF
 and PFSync.

I don't get it. Are you running the SNMP server on the firewall or are you
just forwarding traffic on the firewall and some of it gets dropped?
PCB lookups do not matter when doing just routing and filtering with PF.
PCB lookups are only used for local connections.

 Any idea what else I could tweak or modify to rectify these errors? Let me
 know if there is anything else that I should include to provide additional
 information.

I guess your problem is somewhere else. Is it possible that pf runs out of
states?

-- 
:wq Claudio



Re: Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Steve Johnson
On Mon, Mar 7, 2011 at 11:15 AM, Claudio Jeker cje...@diehard.n-r-g.com wrote:

 On Mon, Mar 07, 2011 at 10:38:45AM -0500, Steve Johnson wrote:
  Hi,
 
  I'm having some issues with network connectivity on a system. When doing
  netstat -ns, I get a lot of errors with missed PCB cache, drops due to no
  socket. I have already increased the TCP and UDP send and receive space up
  to 262144, but the error counters are still increasing. One of the highest
  impacts is a lot of false positive alerts in our monitoring system since
  this generates a lot of errors with the SNMP monitoring, which has a lot of
  high rate UDP bursts.
 

 What makes you believe that the PCB cache and send and receive space have
 anything in common?


I didn't know if they did, but these were the 2 counters that seemed
problematic.


 TCP and UDP send and receive space have absolutely no influence on the
 number of missed PCB.
 The missed PCB cache is not an error. It is just a counter telling when no
 PCB has been found for the incoming packet based on the src/dst IP and
 port. In other words new connections will always increase this counter (as
 will port scans).
 For UDP it is probably even more common since in most cases recvfrom() and
 sendto() are used and so the first lookup will always fail and a lookup in
 the list of listening PCB needs to be done.
 The dropped due to no socket counter will tell you how many packets are
 dropped because nothing was listening on that port.


Ok, thanks. Good to know all that information.


 The 1-mio pesos
 question is why there is no socket listening on that port.




  The systems are a pair of Dell 1950s with dual Xeon 5130 2GHz CPUs and the
  network cards are Intel PRO/1000 PT (82571EB). The running version is 4.8
  GENERIC.MP#335 amd64. All they are doing is routing and filtering with PF
  and PFSync.

 I don't get it. Are you running the SNMP server on the firewall or are you
 just forwarding traffic on the firewall and some of it gets dropped?
 PCB lookups do not matter when doing just routing and filtering with PF.
 PCB lookups are only used for local connections.


No, the SNMP traffic is just going through the system and some of it seems
to get dropped. We've only started to have this issue in the monitoring
system after replacing the old IPTables back-end systems with the new PF
ones.



  Any idea what else I could tweak or modify to rectify these errors? Let me
  know if there is anything else that I should include to provide additional
  information.

 I guess your problem is somewhere else. Is it possible that pf runs out of
 states?


The stats from pfctl seem to be fine, which is what led me to believe that
it would be more of a tuning parameter somewhere. State limit is at 300K and
we just had another important issue and I had only about 20K entries in the
stats. There are no errors on the interfaces either, nor on the switch
ports. PFSync stats do show a lot of failed lookup/inserts though. Here are
the counters for PF:

Counters
  match                      639843438          729.1/s
  bad-offset                         0            0.0/s
  fragment                          14            0.0/s
  short                             77            0.0/s
  normalize                        688            0.0/s
  memory                       1480933            1.7/s
  bad-timestamp                      0            0.0/s
  congestion                         0            0.0/s
  ip-option                        612            0.0/s
  proto-cksum                        0            0.0/s
  state-mismatch                164303            0.2/s
  state-insert                     288            0.0/s
  state-limit                       16            0.0/s
  src-limit                          0            0.0/s
  synproxy                           0            0.0/s
Limit Counters
  max states per rule               16            0.0/s
  max-src-states                     0            0.0/s
  max-src-nodes                      0            0.0/s
  max-src-conn                       0            0.0/s
  max-src-conn-rate                  0            0.0/s
  overload table insertion           0            0.0/s
  overload flush states              0            0.0/s


 --
 :wq Claudio



Thanks



Re: Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Stuart Henderson
On 2011-03-07, Steve Johnson maill...@sjohnson.info wrote:

 The stats from pfctl seem to be fine

   memory                       1480933            1.7/s

that's a problem ..

netstat -m
vmstat -m
dmesg
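
A non-zero memory counter is easy to overlook in the full counter list. The sketch below shows one way to pull it out of the counter section of the pf statistics. It assumes the counters are separated by whitespace as in a normal terminal; the sample text reuses a few values from this thread, and the names `SAMPLE`, `parse_counters` and `flagged` are hypothetical.

```python
import re

# A few counters from this thread, laid out with whitespace between columns.
SAMPLE = """\
Counters
  match                      639843438          729.1/s
  memory                       1480933            1.7/s
  state-mismatch                164303            0.2/s
  state-limit                       16            0.0/s
Limit Counters
  max states per rule               16            0.0/s
"""

# Indented line: counter name, total count, then a rate ending in "/s".
LINE = re.compile(r"^\s+(?P<name>\S.*?)\s+(?P<count>\d+)\s+(?P<rate>\d+\.\d+)/s\s*$")


def parse_counters(text):
    """Map each counter name to a (count, rate_per_second) tuple."""
    counters = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            counters[m.group("name")] = (int(m.group("count")), float(m.group("rate")))
    return counters


def flagged(counters, names=("memory",)):
    """Return the named counters that are present and non-zero."""
    return {n: counters[n] for n in names if n in counters and counters[n][0] > 0}


counters = parse_counters(SAMPLE)
problems = flagged(counters)
```

Here `problems` contains only the memory counter. Which other counters deserve the same treatment is a judgment call and is not settled in this thread.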



Re: Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Steve Johnson
Ok, thanks. Here's the output:


#netstat -m
338 mbufs in use:
306 mbufs allocated to data
8 mbufs allocated to packet headers
24 mbufs allocated to socket names and addresses
196/1634/128000 mbuf 2048 byte clusters in use (current/peak/max)
0/8/128000 mbuf 4096 byte clusters in use (current/peak/max)
0/8/128000 mbuf 8192 byte clusters in use (current/peak/max)
0/8/128000 mbuf 9216 byte clusters in use (current/peak/max)
0/8/128000 mbuf 12288 byte clusters in use (current/peak/max)
0/8/128000 mbuf 16384 byte clusters in use (current/peak/max)
0/8/128000 mbuf 65536 byte clusters in use (current/peak/max)
3900 Kbytes allocated to network (12% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

-

#vmstat -m
Memory statistics by bucket size
Size   In Use   Free   Requests  HighWater  Couldfree
  16 1265   462360889111280112
  32  581315   10404301 640  0
  64  934   20743032561 320  31375
 128 2019 93 676644 160 46
 256  4824623128304  80  10123
 512 1373 59 647300  40 53
1024  292 362434758  20 484389
2048   35 23   12844515  108170616
4096 2568 14 217060   5 197124
8192   21  7  68247   5  51974
   16384  300  0301   5  0
   327686  0  11603   5  0
   655362  0  2   5  0
  1310723  0  3   5  0
  2621441  0  1   5  0

Memory usage type by bucket size
Size  Type(s)
  16  devbuf, pcb, routetbl, sysctl, UFS mount, dirhash, ACPI, exec,
  xform_data, VM swap, UVM amap, UVM aobj, USB, USB device, temp
  32  devbuf, pcb, routetbl, ifaddr, vnodes, UFS mount, sem, dirhash,
ACPI,
  in_multi, exec, xform_data, UVM amap, USB, USB device, temp, DRM
  64  devbuf, routetbl, ifaddr, vnodes, dirhash, ACPI, proc, VFS
cluster,
  in_multi, ether_multi, VM swap, UVM amap, USB, USB device, NDP,
temp,
  DRM
 128  devbuf, pcb, routetbl, ifaddr, mount, sem, dirhash, ACPI, NFS
srvsock,
  ip_moptions, ttys, pfkey data, UVM amap, USB, USB device, NDP,
temp
 256  devbuf, routetbl, ifaddr, sysctl, ioctlops, vnodes, shm, VM map,
ACPI,
  exec, UVM amap, USB, USB device, temp, DRM
 512  devbuf, ifaddr, ioctlops, vnodes, dirhash, file desc, NFS daemon,
  ttys, newblk, UVM amap, USB, USB device, temp
1024  devbuf, pcb, sysctl, ioctlops, mount, UFS mount, shm, file desc,
proc,
  ttys, exec, UVM amap, crypto data, temp
2048  devbuf, ioctlops, UFS mount, ACPI, file desc, VM swap, UVM amap,
  UVM aobj, temp
4096  devbuf, ifaddr, ioctlops, proc, UVM amap, memdesc, temp, DRM
8192  devbuf, ttys, pagedep, UVM amap, USB, temp
   16384  devbuf, UFS mount, MSDOSFS mount, temp
   32768  devbuf, UFS quota, UFS mount, ISOFS mount, inodedep
   65536  devbuf
  131072  devbuf, VM swap
  262144  VM swap

Memory statistics by type   Type  Kern
  Type InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
devbuf  5433 16169K  16363K 78644K   6086960 0
 16,32,64,128,256,512,1024,2048,4096,8192,16384,32768,65536,131072
   pcb   11021K 23K 78644K   1073730 0
 16,32,128,1024
  routetbl   52064K239K 78644K   3090820 0
 16,32,64,128,256
ifaddr   21746K 46K 78644K  2490 0
 32,64,128,256,512,4096
sysctl 3 2K  2K 78644K30 0  16,256,1024
  ioctlops 0 0K  4K 78644K   2524900 0
 256,512,1024,2048,4096
 mount14 7K  7K 78644K   140 0  128,1024
vnodes5013K105K 78644K   1011510 0
 32,64,256,512
 UFS quota 132K 32K 78644K10 0  32768
 UFS mount2574K 74K 78644K   250 0
 16,32,1024,2048,16384,32768
   shm 2 2K  2K 78644K20 0  256,1024
VM map 2 1K  1K 78644K20 0  256
   sem 2 1K  1K 78644K20 0  32,128
   dirhash45 9K 20K 78644K 19920 0
 16,32,64,128,512
  ACPI   63981K123K 78644K 37730 0
 16,32,64,128,256,2048
 file desc 811K 27K 78644K136740 0
 512,1024,2048
  proc2111K 11K 78644K   210 0  64,1024,4096
   VFS cluster 0 0K  1K 78644K403010 0  64
   NFS srvsock 1 1K  1K 78644K10 

Re: Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Stuart Henderson
On 2011-03-07, Steve Johnson maill...@sjohnson.info wrote:
 Ok, thanks. Here's the output:


 #netstat -m
 338 mbufs in use:
 306 mbufs allocated to data
 8 mbufs allocated to packet headers
 24 mbufs allocated to socket names and addresses
 196/1634/128000 mbuf 2048 byte clusters in use (current/peak/max)

When you tweaked kern.maxclusters you haven't allowed yourself
enough kernel memory for PF states.
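
To put the limit in perspective, the sketch below compares the peak cluster usage reported by netstat -m with the configured maximum. It only does arithmetic on the lines quoted above; treating the maximum as a ceiling rather than a reservation is an assumption, and how that ceiling interacts with the memory left for PF states is not spelled out here. The names `NETSTAT_M` and `cluster_usage` are hypothetical.

```python
import re

# Two of the cluster lines from the netstat -m output in this thread.
NETSTAT_M = """\
196/1634/128000 mbuf 2048 byte clusters in use (current/peak/max)
0/8/128000 mbuf 4096 byte clusters in use (current/peak/max)
"""

PATTERN = re.compile(r"^(\d+)/(\d+)/(\d+) mbuf (\d+) byte clusters in use")


def cluster_usage(text):
    """Return one dict per cluster pool with byte totals for peak and limit."""
    rows = []
    for line in text.splitlines():
        m = PATTERN.match(line.strip())
        if not m:
            continue
        current, peak, limit, size = map(int, m.groups())
        rows.append({
            "size": size,
            "current": current,
            "peak": peak,
            "limit": limit,
            "peak_bytes": peak * size,    # most memory the pool has ever used
            "limit_bytes": limit * size,  # ceiling implied by the limit
        })
    return rows


rows = cluster_usage(NETSTAT_M)
for row in rows:
    share = 100.0 * row["peak"] / row["limit"]
    print(f'{row["size"]:>6} byte pool: peak {row["peak_bytes"]} B, '
          f'ceiling {row["limit_bytes"]} B ({share:.2f}% of limit reached)')
```

For the 2048 byte pool the peak of 1634 clusters is a little over 1% of the 128000 limit, so the observed load never came close to needing a limit that high.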




Re: Missed PCB cache and drops due to no socket errors

2011-03-07 Thread Steve Johnson
Ok, thanks. Does this mean that I have raised maxclusters too far for the
available memory, or that another setting needs to be increased in parallel
(which I imagine would be the case)?

On Mon, Mar 7, 2011 at 1:04 PM, Stuart Henderson s...@spacehopper.org wrote:

 On 2011-03-07, Steve Johnson maill...@sjohnson.info wrote:
  Ok, thanks. Here's the output:
 
 
  #netstat -m
  338 mbufs in use:
  306 mbufs allocated to data
  8 mbufs allocated to packet headers
  24 mbufs allocated to socket names and addresses
  196/1634/128000 mbuf 2048 byte clusters in use (current/peak/max)

 When you tweaked kern.maxclusters you haven't allowed yourself
 enough kernel memory for PF states.


