Most Excellent!
-daw-
On 10/17/16 1:35 PM, Dave Barach (dbarach) wrote:
Folks,
I finally decided to quad-loop ip4-lookup, just to see what would
happen. Various folks have suggested giving it a try.
Note the 12 clocks/pkt improvement in ip4-lookup (45.4 down to 33.4
clocks) on a single core, with an offered load of roughly 15 Mpps. See
below for CPU details. That’s roughly a 26% reduction in that node's
per-packet cost. Quad-looping [only] ip4-lookup increases the ip4
racetrack forwarding NDR by something like 800 kpps.
It’s not worth retrofitting every node - even though the work is
mechanical - but the ip speed path nodes are obvious candidates.
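For anyone who hasn't seen the pattern, here is a minimal sketch of what quad-looping means: handle four packets per loop iteration so their independent work can overlap, then mop up the remainder one at a time. This is not the ip4-lookup code; `toy_lookup` and `lookup_quad` are hypothetical names, and the table lookup is a stand-in for a real FIB lookup. Real graph nodes typically also prefetch data for upcoming packets, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-packet lookup: maps a destination
 * address to a next-hop index. Not VPP's actual ip4 FIB lookup. */
static inline uint32_t toy_lookup(const uint32_t *table, size_t table_size,
                                  uint32_t dst)
{
    return table[dst % table_size];
}

/* Quad-loop: process four packets per iteration, then fall back to a
 * single-packet loop for the 0..3 packets left over. */
void lookup_quad(const uint32_t *table, size_t table_size,
                 const uint32_t *dst, uint32_t *next, size_t n)
{
    size_t i = 0;

    assert(table_size > 0);

    while (n - i >= 4) {
        /* Load all four inputs first so the lookups are independent. */
        uint32_t d0 = dst[i + 0];
        uint32_t d1 = dst[i + 1];
        uint32_t d2 = dst[i + 2];
        uint32_t d3 = dst[i + 3];

        uint32_t n0 = toy_lookup(table, table_size, d0);
        uint32_t n1 = toy_lookup(table, table_size, d1);
        uint32_t n2 = toy_lookup(table, table_size, d2);
        uint32_t n3 = toy_lookup(table, table_size, d3);

        next[i + 0] = n0;
        next[i + 1] = n1;
        next[i + 2] = n2;
        next[i + 3] = n3;

        i += 4;
    }

    /* Remainder loop: same work, one packet at a time. */
    while (i < n) {
        next[i] = toy_lookup(table, table_size, dst[i]);
        i++;
    }
}
```

The gain comes from giving the CPU four independent chains of work per iteration; the result is identical to a plain one-at-a-time loop, which is why the conversion is mechanical.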
In case enquiring minds want to know... D.
vpp# sh ru
Time 3.8, average vectors/node 251.55, last 128 main loops 12.00 per node 256.00
vector rates in 1.4877e7, out 1.4877e7, drop 0.0000e0, punt 0.0000e0
Name                              Clocks    Vectors/Call
FortyGigabitEthernet84/0/1-out    7.56e0    251.55
FortyGigabitEthernet84/0/1-tx     4.79e1    251.55
dpdk-input                        5.85e1    251.55
ip4-input-no-checksum             2.94e1    251.55
ip4-lookup                        4.54e1    251.55
ip4-rewrite-transit               2.42e1    251.55
vpp# sh ru
Time 3.2, average vectors/node 149.42, last 128 main loops 3.09 per node 66.00
vector rates in 1.5601e7, out 1.5601e7, drop 0.0000e0, punt 0.0000e0
Name                              Clocks    Vectors/Call
FortyGigabitEthernet84/0/1-out    7.68e0    149.42
FortyGigabitEthernet84/0/1-tx     4.74e1    149.42
dpdk-input                        5.96e1    149.42
ip4-input-no-checksum             3.04e1    149.42
ip4-lookup                        3.34e1    149.42
ip4-rewrite-transit               2.41e1    149.42
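The comparison between the two captures is simple arithmetic on the ip4-lookup rows and the vector rates. A small sketch, assuming the first capture is the baseline and the second is the quad-looped build; the helper names are made up for illustration:

```c
#include <assert.h>

/* Clocks saved per packet between a baseline and a modified run. */
double clocks_saved(double before, double after)
{
    return before - after;
}

/* Reduction in per-packet cost, as a percentage of the baseline. */
double percent_reduction(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* Difference between two packet rates (packets/sec), in kpps. */
double rate_delta_kpps(double before_pps, double after_pps)
{
    return (after_pps - before_pps) / 1000.0;
}
```

With the numbers above, `clocks_saved(45.4, 33.4)` gives 12 clocks, `percent_reduction(45.4, 33.4)` gives about 26.4%, and `rate_delta_kpps(1.4877e7, 1.5601e7)` gives about 724 kpps. The vector rates are what was forwarded during each capture, so they are not the same thing as the NDR figure quoted earlier.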
Thanks… Dave
$ cat /proc/cpuinfo
processor : 31
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
stepping : 2
microcode : 0x27
cpu MHz : 1201.000
cache size : 20480 KB
physical id : 1
siblings : 16
core id : 7
cpu cores : 8
apicid : 31
initial apicid : 31
fpu : yes
fpu_exception : yes
cpuid level : 15
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid
dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx
f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm
tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2
smep bmi2 erms invpcid
bogomips : 6401.16
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
_______________________________________________
vpp-dev mailing list
[email protected]
https://lists.fd.io/mailman/listinfo/vpp-dev