So, a different tack today, namely monitoring '/proc/net/softnet_stat' to
try to reduce potential errors on the interface.
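
For reference, each line of /proc/net/softnet_stat is one CPU and the
counters are hex. Assuming GNU awk (for strtonum), something like this prints
the two columns discussed below in decimal, one row per CPU:

awk '{ printf "cpu%-3d dropped=%d time_squeeze=%d\n", NR-1,
       strtonum("0x"$2), strtonum("0x"$3) }' /proc/net/softnet_stat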

End result: 517k qps.

Final changes for the day:
sysctl -w net.core.netdev_max_backlog=32768
sysctl -w net.core.netdev_budget=2700
/root/nic_balance.sh em1 0 2
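
To make the two sysctl changes persistent across reboots (assuming a
sysctl.d-aware distro; the filename here is just an example):

cat > /etc/sysctl.d/90-dns-tuning.conf <<'EOF'
net.core.netdev_max_backlog = 32768
net.core.netdev_budget = 2700
EOF
sysctl -p /etc/sysctl.d/90-dns-tuning.conf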

netdev_max_backlog:

A need to increase this value is indicated by a rising 2nd column in
/proc/net/softnet_stat (packets dropped because a per-CPU backlog filled up).
The default starts at a reasonable amount; however, even 500k qps pushes the
limits of this buffer when pinning IRQs to cores. Doubled it.

netdev_budget:

A need to increase this value is indicated by a rising 3rd column in
/proc/net/softnet_stat (time_squeeze: the softirq ran out of budget with
packets still pending). The default (300) is quite low and easily exhausted,
especially if all of the NIC IRQs are pinned to a single CPU core. Tried
various values until the increase was small (at 2700).

As the best numbers have been when using 2 cores, though, this value can
probably be lowered. It seems stable at 2700, so I didn't re-test at lower
values.
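
The tuning loop was essentially: set a value, re-run the load, see how much
the 3rd column grew. Roughly like the following (the intermediate budget
values are just examples, and GNU awk is assumed for strtonum):

for budget in 300 600 1200 2700; do
    sysctl -w net.core.netdev_budget=$budget
    before=$(awk '{ t += strtonum("0x"$3) } END { print t }' /proc/net/softnet_stat)
    # <run the load test here>
    after=$(awk '{ t += strtonum("0x"$3) } END { print t }' /proc/net/softnet_stat)
    echo "budget=$budget time_squeeze grew by $(( after - before ))"
done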

'/root/nic_balance.sh em1 0 2':
(Custom script based on RH 20150325_network_performance_tuning.pdf)

Pin all the IRQs for the 'em1' NIC to the first 2 CPU cores of the local
NUMA node.
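
The script itself isn't reproduced here, but a minimal sketch of the same
idea, assuming the arguments are <nic> <first-cpu> <cpu-count> and taking the
CPU numbers literally, would look like:

#!/bin/bash
# Sketch only: pin every IRQ belonging to a NIC to a small set of CPUs,
# round-robin. Usage: nic_balance.sh <nic> <first-cpu> <cpu-count>
nic=$1; first=$2; count=$3
i=0
for irq in $(awk -v nic="$nic" '$NF ~ nic { sub(":", "", $1); print $1 }' /proc/interrupts); do
    cpu=$(( first + i % count ))
    echo "$cpu" > /proc/irq/$irq/smp_affinity_list
    echo "IRQ $irq -> CPU $cpu"
    i=$(( i + 1 ))
done

The real script presumably maps those CPU numbers onto the NIC's local NUMA
node rather than taking them as absolute.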

This had the most noticeable effect. By default the NIC exposes multiple
rx/tx queues, each with its own IRQ, and the 'irqbalance' service spreads
those interrupts (and the softirq work they trigger) across all of the cores.
When they end up spread across multiple NUMA nodes, each ingress packet gets
delayed as it is shunted over to the NUMA node where the rest of the process
is living.
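
To check which node the NIC is local to, and which CPUs live there (assuming
a PCI NIC):

cat /sys/class/net/em1/device/numa_node       # NUMA node the NIC hangs off
cat /sys/class/net/em1/device/local_cpulist   # CPUs local to that node
lscpu | grep 'NUMA node'                      # node -> CPU mapping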

At low throughput this isn't a concern. At high throughput it becomes quite
noticeable: roughly a 100k qps difference.

I tried various levels of tuning (spread across 12 cores, across 8, across
4, and pinned to a single core), and found 2 cores the best on the
bare-metal node.
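
One caveat: if irqbalance is left running it can rewrite the affinities and
undo the pinning, so it needs to be stopped (or told to ignore those IRQs):

systemctl stop irqbalance
systemctl disable irqbalance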

...

Whilst 'softnet_stat' no longer shows any dropped packets (2nd column),
'netstat -s -u' still shows 'packet receive errors'. I'm still uncertain how
the two differ and how to fix the errors netstat is reporting.
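
One guess (not verified): softnet_stat's 2nd column only counts drops at the
kernel backlog, while netstat's 'packet receive errors' is the UDP InErrors
counter, which also includes per-socket receive-buffer overflows and checksum
failures. Something like this would show which sub-counter is climbing:

nstat -az UdpInErrors UdpRcvbufErrors

If it turns out to be UdpRcvbufErrors, the next knob is probably the socket
receive buffer (net.core.rmem_max / rmem_default) rather than anything at the
softnet layer.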

Stuart