Hi,
I finally got around to testing the mbuf tag merging patch by
Henning that Theo suggested. For details of the test setup see my
original post [1]; the only difference now is that the interfaces
are all on different interrupts.
Only i386 results this time; I didn't have time to test amd64.
First, some reference results with each NIC on its own interrupt:
clients: 3.8-beta, i386, sp kernel
router: 3.8-beta, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth: 941 Mbits/sec
with TCP window size: 96-128KB
(larger window sizes caused a drop in speed,
probably due to the CPU being at 100% interrupt)
max UDP bandwidth: 905 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(you can't set higher UDP bandwidth with iperf)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0.00096%
99980 0.0022%
124950 0.0026%
142772 0.0085%
166501 0.039%
196351 0.22%
225851 1.4%
240826 4.2%
clients: 3.8-beta, i386, sp kernel
router: 3.8-beta, i386, mp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth: 941 Mbits/sec
with TCP window size: 96-128KB
(larger window sizes caused a drop in speed,
probably due to the CPU being at 100% interrupt)
max UDP bandwidth: 905 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(you can't set higher UDP bandwidth with iperf)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99983 0.0012%
124947 0.00096%
142775 1.4%
166493 0.62%
196226 14%
225451 32%
241131 39%
- Now some results with -current on the router:
clients: 3.8-beta, i386, sp kernel
router: 3.8-current, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth: 941 Mbits/sec
with TCP window size: 96-256KB
(no drop in speed with larger window size)
max UDP bandwidth: 905 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(you can't set higher UDP bandwidth with iperf)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99985 0.0008%
124948 0.006%
142764 0.0059%
166459 0.053%
196448 0.2%
222766 1.7%
231909 1.1%
clients: 3.8-beta, i386, sp kernel
router: 3.8-current, i386, sp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size: 750 Mbits/sec, 64KB
460 Mbits/sec, 96KB
751 Mbits/sec, 128KB
755 Mbits/sec, 192KB
760 Mbits/sec, 256KB
(strange drop at 96KB window, but no decrease at larger sizes)
max UDP bandwidth: 784 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(larger bandwidth tests failed)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99983 0%
124949 0.0008%
142755 0.0017%
166433 0.099%
196415 0.22%
220741 1.9%
229492 2.6%
clients: 3.8-beta, i386, sp kernel
router: 3.8-current, i386, mp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size: 770 Mbits/sec, 64KB
652 Mbits/sec, 96KB
783 Mbits/sec, 128KB
783 Mbits/sec, 192KB
786 Mbits/sec, 256KB
(strange drop at 96KB window, but no decrease at larger sizes)
max UDP bandwidth: 784 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(larger bandwidth tests failed)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99985 0%
124946 0.0004%
142758 0.00056%
166428 0.0061%
196229 15%
222272 31%
232224 37%
- And now some results with the mbuf tag merging patch:
clients: 3.8-beta, i386, sp kernel
router: 3.8-patched, i386, sp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size: 722 Mbits/sec, 64KB
727 Mbits/sec, 96KB
756 Mbits/sec, 128KB
757 Mbits/sec, 192KB
753 Mbits/sec, 256KB
(no drop at 96KB window and no decrease at larger sizes)
max UDP bandwidth: 784 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(larger bandwidth test failed)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99982 0.0007%
124944 0.0017%
142764 0.00035%
166432 0.019%
196373 0.099%
222594 0.31%
231337 0.68%
clients: 3.8-beta, i386, sp kernel
router: 3.8-patched, i386, mp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size: 738 Mbits/sec, 64KB
529 Mbits/sec, 96KB
750 Mbits/sec, 128KB
754 Mbits/sec, 192KB
742 Mbits/sec, 256KB
(strange drop at 96KB window, but no decrease at larger sizes)
max UDP bandwidth: 784 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(larger bandwidth test failed)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99982 0%
124948 0.0012%
142760 0.0017%
166454 0.46%
196614 7.1%
222543 1.5%
231195 0.76%
clients: 3.8-beta, i386, sp kernel
router: 3.8-patched, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth: 941 Mbits/sec
with TCP window size: 96-256KB
(no drop in speed with larger window size)
max UDP bandwidth: 905 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(you can't set higher UDP bandwidth with iperf)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99982 0%
124948 0.00072%
142762 0.28%
166438 0.42%
196330 1.7%
221982 0.74%
230845 1%
clients: 3.8-patched, i386, sp kernel
router: 3.8-patched, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth: 941 Mbits/sec
with TCP window size: 96-256KB
(no drop in speed with larger window size)
max UDP bandwidth: 905 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(you can't set higher UDP bandwidth with iperf)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99981 0.0006%
124948 0.0013%
142754 0.34%
166391 0.25%
195806 1.2%
220620 0.62%
225440 0.72%
clients: 3.8-patched, i386, sp kernel
router: 3.8-patched, i386, mp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth: 941 Mbits/sec
with TCP window size: 96-256KB
(no drop in speed with larger window size)
max UDP bandwidth: 905 Mbits/sec
UDP packet size: 1470
dropped packets: 0%
(you can't set higher UDP bandwidth with iperf)
UDP pps results with 128 byte packet size:
pps %dropped
19608 0%
40000 0%
83328 0%
99981 0.0003%
124946 0.009%
142759 1.9%
166398 1.6%
195397 11%
219772 25%
225575 28%
Some new conclusions:
- the integrated NIC isn't so crappy on i386, but it still can't
forward at wire speed
- 3.8-current is better at handling NICs on different
interrupts than 3.8
- 3.8-current helps TCP bandwidth, see the TCP window behaviour
- the mbuf tag merging patch helps in all cases, especially
with the integrated NIC and the i386 mp kernel -> almost the same
performance as the i386 sp kernel
- patching the clients didn't really help generate more pps
for testing -> I really need something other than iperf for testing
- pps-wise, the integrated NIC gave the best results
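
To make the mp kernel comparison on the integrated adapter a bit more
concrete, here is a minimal Python sketch that turns the highest-load
row of three of the tables above into delivered packets per second.
It assumes the pps column is the rate offered by the clients (the
tables don't say so explicitly); the names RESULTS, delivered_pps and
summarize are made up for illustration.

```python
# Highest-load rows copied from the tables above: (offered pps, % dropped).
# Assumption: the "pps" column is the rate offered by the clients,
# not the rate that arrived at the receiver.
RESULTS = {
    "current, mp, integrated": (232224, 37.0),
    "patched, mp, integrated": (231195, 0.76),
    "patched, sp, integrated": (231337, 0.68),
}


def delivered_pps(offered_pps, dropped_percent):
    """Packets per second that made it through the router."""
    return offered_pps * (1.0 - dropped_percent / 100.0)


def summarize(results):
    """Map each configuration to its delivered rate in whole packets."""
    return {
        name: round(delivered_pps(pps, drop))
        for name, (pps, drop) in results.items()
    }


if __name__ == "__main__":
    for name, rate in summarize(RESULTS).items():
        print(f"{name:26s} {rate:7d} pps delivered")
```

Under that assumption the -current mp kernel delivers about 146k pps
at the highest offered load, while the patched mp kernel delivers
roughly 229k pps, nearly the same as the patched sp kernel.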
One of the boxes is entering production any day now, so my test
system will be gone in a day or two.
Well, I did what I could; I hope you like this batch of tests too.
Daniel.
[1] http://marc.theaimsgroup.com/?l=openbsd-misc&m=112799668301101&w=2
PS.
As an extra, some openssl speed results:
(pay attention to how AES compares between 3.7 and 3.8 ;)
3.7_i386_sp:
~~~~~~~~~~~~
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 1548.57k 3313.72k 4635.10k 5135.58k 5321.86k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 12714.06k 45591.07k 132259.07k 270387.53k 388210.14k
md5 11666.25k 39277.10k 130997.55k 313924.23k 519818.53k
hmac(md5) 16551.11k 61901.17k 187540.01k 384052.60k 552534.09k
sha1 11424.05k 38394.59k 105503.58k 198112.81k 265333.19k
rmd160 11059.85k 36210.68k 90449.36k 150354.59k 186291.54k
rc4 98820.55k 115246.97k 120167.77k 121558.20k 121670.13k
des cbc 46027.11k 52908.69k 54425.36k 54627.82k 54896.35k
des ede3 20050.67k 21263.24k 21231.11k 21159.44k 21475.38k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 22596.92k 24772.70k 25229.48k 25181.44k 25274.77k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 73503.51k 86577.94k 89202.00k 88579.46k 88578.81k
cast cbc 63132.90k 76590.61k 79500.97k 78885.99k 79155.22k
aes-128 cbc 48604.32k 48520.34k 49077.19k 49214.63k 49128.23k
aes-192 cbc 43625.36k 43265.07k 43566.07k 43786.23k 43814.13k
aes-256 cbc 39313.08k 38787.85k 39195.71k 39295.39k 39210.22k
sign verify sign/s verify/s
rsa 512 bits 0.0009s 0.0001s 1164.7 9579.3
rsa 1024 bits 0.0041s 0.0003s 241.8 3781.1
rsa 2048 bits 0.0248s 0.0008s 40.3 1227.6
rsa 4096 bits 0.1641s 0.0028s 6.1 353.5
sign verify sign/s verify/s
dsa 512 bits 0.0007s 0.0008s 1480.2 1214.8
dsa 1024 bits 0.0020s 0.0025s 491.5 407.4
dsa 2048 bits 0.0068s 0.0083s 147.4 120.3
3.7_i386_mp:
~~~~~~~~~~~~
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 1553.11k 3328.70k 4649.76k 5163.21k 5334.33k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 12420.50k 45159.40k 132458.48k 273321.93k 393801.14k
md5 11920.80k 40367.63k 133683.54k 320229.97k 525112.64k
hmac(md5) 16461.23k 61837.10k 188761.39k 386985.25k 553558.75k
sha1 11503.76k 39367.06k 108206.35k 200407.69k 265937.90k
rmd160 11096.09k 36410.41k 91091.43k 150764.44k 186421.07k
rc4 99019.40k 115866.37k 120211.56k 121819.26k 122262.20k
des cbc 44064.39k 49438.09k 50203.56k 49350.34k 49513.97k
des ede3 20552.70k 21382.55k 21592.88k 21429.16k 21429.84k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 22542.44k 24817.61k 25289.31k 25292.12k 25327.16k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 73662.93k 86988.84k 89388.91k 88763.79k 88808.36k
cast cbc 63411.49k 76755.37k 79548.22k 79190.58k 79255.56k
aes-128 cbc 46710.67k 46782.66k 46033.82k 47684.04k 47358.47k
aes-192 cbc 42281.06k 42017.08k 42048.47k 42209.89k 41942.50k
aes-256 cbc 37951.67k 37936.52k 38096.12k 38230.92k 38271.06k
sign verify sign/s verify/s
rsa 512 bits 0.0009s 0.0001s 1167.8 9428.1
rsa 1024 bits 0.0041s 0.0003s 242.3 3822.4
rsa 2048 bits 0.0248s 0.0008s 40.4 1233.1
rsa 4096 bits 0.1635s 0.0028s 6.1 355.1
sign verify sign/s verify/s
dsa 512 bits 0.0007s 0.0008s 1467.5 1210.8
dsa 1024 bits 0.0020s 0.0024s 492.6 411.9
dsa 2048 bits 0.0068s 0.0083s 147.7 119.8
3.8-beta_i386_sp:
~~~~~~~~~~~~~~~~~
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 1535.24k 3285.73k 4619.95k 5137.34k 5327.53k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 12680.32k 46136.25k 132422.15k 272758.06k 391952.20k
md5 10971.94k 41069.23k 132725.51k 318642.89k 527921.16k
hmac(md5) 16619.68k 61174.97k 185029.80k 381320.85k 549603.23k
sha1 11442.32k 37328.82k 105131.30k 196801.21k 264075.06k
rmd160 10460.22k 34965.94k 88920.76k 149080.30k 185649.67k
rc4 99875.08k 115382.09k 118588.39k 119634.10k 120362.93k
des cbc 45570.15k 53500.94k 55750.59k 55792.06k 55713.44k
des ede3 20426.78k 21104.12k 21334.95k 20833.98k 20900.89k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 22262.25k 24523.35k 25151.19k 25236.97k 25263.88k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 73296.41k 86546.76k 88918.83k 88390.07k 88458.95k
cast cbc 63286.79k 76581.72k 78886.16k 79230.29k 79029.96k
aes-128 cbc 81707.88k 82470.32k 84064.61k 85016.72k 85062.65k
aes-192 cbc 71794.65k 71908.65k 73023.94k 72237.71k 72889.61k
aes-256 cbc 64194.17k 64428.52k 65797.45k 66163.52k 66204.65k
sign verify sign/s verify/s
rsa 512 bits 0.0009s 0.0001s 1151.2 9444.5
rsa 1024 bits 0.0041s 0.0003s 241.0 3744.1
rsa 2048 bits 0.0249s 0.0008s 40.2 1226.1
rsa 4096 bits 0.1642s 0.0028s 6.1 353.6
sign verify sign/s verify/s
dsa 512 bits 0.0007s 0.0008s 1462.3 1205.5
dsa 1024 bits 0.0020s 0.0025s 489.0 405.4
dsa 2048 bits 0.0068s 0.0083s 147.0 120.9
3.8-beta_i386_mp:
~~~~~~~~~~~~~~~~~
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 1531.90k 3298.49k 4627.48k 5146.54k 5315.27k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 12663.28k 45258.59k 131442.90k 273197.76k 401168.50k
md5 10760.89k 40506.17k 133189.66k 319263.81k 529973.41k
hmac(md5) 16830.35k 61450.42k 187658.46k 384014.29k 552004.72k
sha1 10997.20k 36561.80k 102719.70k 196863.83k 264332.16k
rmd160 10301.32k 35163.11k 89088.09k 149640.76k 186157.08k
rc4 101259.75k 115630.65k 119519.51k 120788.79k 121227.99k
des cbc 46791.65k 53578.91k 55387.09k 54989.14k 55234.76k
des ede3 20065.78k 20825.64k 21060.47k 20771.21k 21225.72k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 21741.81k 24360.72k 24198.04k 24996.15k 24448.09k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 73657.78k 86936.58k 89067.42k 88866.87k 88792.03k
cast cbc 63426.38k 76743.82k 79383.64k 79272.57k 79274.61k
aes-128 cbc 81564.36k 81619.67k 84539.36k 85188.30k 85245.79k
aes-192 cbc 73181.34k 72115.03k 72416.79k 72776.46k 72184.86k
aes-256 cbc 64034.54k 65671.04k 65370.07k 65656.09k 65982.34k
sign verify sign/s verify/s
rsa 512 bits 0.0009s 0.0001s 1154.6 9533.7
rsa 1024 bits 0.0041s 0.0003s 242.5 3763.5
rsa 2048 bits 0.0248s 0.0008s 40.4 1225.5
rsa 4096 bits 0.1637s 0.0028s 6.1 354.7
sign verify sign/s verify/s
dsa 512 bits 0.0007s 0.0008s 1462.9 1218.5
dsa 1024 bits 0.0020s 0.0025s 489.9 403.6
dsa 2048 bits 0.0068s 0.0083s 147.1 120.5
3.8-current_i386_sp:
~~~~~~~~~~~~~~~~~~~~
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
md2 1540.50k 3289.64k 4624.46k 5144.15k 5305.36k
mdc2 0.00 0.00 0.00 0.00 0.00
md4 12876.77k 46252.28k 132591.61k 270247.71k 399411.88k
md5 10877.97k 40438.73k 131453.15k 315996.84k 527098.82k
hmac(md5) 16472.63k 61075.20k 185845.32k 380995.06k 552236.37k
sha1 11190.41k 37445.62k 105509.31k 197343.33k 265009.24k
rmd160 10327.34k 34560.47k 90274.66k 150372.24k 185195.91k
rc4 105920.53k 116991.15k 119232.57k 120682.27k 120951.18k
des cbc 46479.53k 53203.54k 55340.46k 53897.84k 53792.49k
des ede3 20657.56k 20810.68k 21176.81k 21275.20k 21262.94k
idea cbc 0.00 0.00 0.00 0.00 0.00
rc2 cbc 22083.99k 24570.90k 25190.84k 25239.36k 25203.86k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
blowfish cbc 73472.56k 86799.89k 89215.25k 88565.54k 88581.51k
cast cbc 63286.19k 76377.62k 79386.21k 79090.22k 79144.32k
aes-128 cbc 81318.87k 82727.04k 83883.17k 84646.31k 85071.19k
aes-192 cbc 72475.49k 72493.32k 72843.45k 72434.94k 72174.70k
aes-256 cbc 62498.76k 64973.91k 64868.08k 67339.36k 66196.46k
sign verify sign/s verify/s
rsa 512 bits 0.0009s 0.0001s 1148.4 9464.7
rsa 1024 bits 0.0041s 0.0003s 241.4 3762.3
rsa 2048 bits 0.0248s 0.0008s 40.3 1222.4
rsa 4096 bits 0.1643s 0.0028s 6.1 354.5
sign verify sign/s verify/s
dsa 512 bits 0.0007s 0.0008s 1476.4 1216.6
dsa 1024 bits 0.0020s 0.0025s 490.1 403.5
dsa 2048 bits 0.0068s 0.0083s 147.3 120.3
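
For the AES comparison, a minimal Python sketch that computes the
3.8-beta to 3.7 throughput ratio from the 8192 bytes column of the two
sp tables above. The values are copied by hand from those tables
(openssl speed reports them in 1000s of bytes per second); the names
AES_CBC_8192, speedup and aes_speedups are only illustrative.

```python
# 8192-byte column, in 1000s of bytes per second, copied from the
# 3.7_i386_sp and 3.8-beta_i386_sp tables above as (3.7, 3.8-beta).
AES_CBC_8192 = {
    "aes-128 cbc": (49128.23, 85062.65),
    "aes-192 cbc": (43814.13, 72889.61),
    "aes-256 cbc": (39210.22, 66204.65),
}


def speedup(old, new):
    """Ratio of the new throughput to the old throughput."""
    return new / old


def aes_speedups(table):
    """Map each cipher to its 3.8-beta / 3.7 throughput ratio."""
    return {name: speedup(old, new) for name, (old, new) in table.items()}


if __name__ == "__main__":
    for name, ratio in aes_speedups(AES_CBC_8192).items():
        print(f"{name}: {ratio:.2f}x")
```

On these numbers the ratio comes out at roughly 1.7x for all three
key sizes.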