Hi,

I finally got around to testing the mbuf tag merging patch by Henning
that Theo suggested. For details on the test setup see my
original post [1]; the only difference now is that the interfaces are
all on different interrupts.

Only i386 results this time; I didn't have time to test amd64.

First, some reference results with each NIC on its own interrupt:

clients: 3.8-beta, i386, sp kernel
router: 3.8-beta, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth:              941 Mbits/sec
with TCP window size:           96-128KB
(larger window sizes caused a drop in speed,
probably due to the CPU being at 100% interrupt)

max UDP bandwidth:              905 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(you can't set higher UDP bandwidth with iperf)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0.00096%
    99980               0.0022%
   124950               0.0026%
   142772               0.0085%
   166501               0.039%
   196351               0.22%
   225851               1.4%
   240826               4.2%
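
To put the pps figures next to the Mbits/sec figures, here is a rough
conversion sketch. It assumes the 128 bytes are the UDP payload size as
given to iperf, so IP/UDP/Ethernet overhead on the wire is not counted.
The function name is only for illustration.

```python
def payload_mbits(pps: float, payload_bytes: int = 128) -> float:
    """Payload bandwidth in Mbits/sec for a given packet rate.

    Assumption: payload_bytes is the UDP payload only; header and
    framing overhead on the wire is ignored.
    """
    return pps * payload_bytes * 8 / 1_000_000


# Highest offered rate from the table above (240826 pps).
print(f"{payload_mbits(240826):.1f} Mbits/sec of payload")
```

Even the highest rate here is only about a quarter of the 905 Mbits/sec
reached with 1470 byte packets, so these runs stress per-packet cost
rather than raw bandwidth.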

clients: 3.8-beta, i386, sp kernel
router: 3.8-beta, i386, mp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth:              941 Mbits/sec
with TCP window size:           96-128KB
(larger window sizes caused a drop in speed,
probably due to the CPU being at 100% interrupt)

max UDP bandwidth:              905 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(you can't set higher UDP bandwidth with iperf)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99983               0.0012%
   124947               0.00096%
   142775               1.4%
   166493               0.62%
   196226               14%
   225451               32%
   241131               39%

- Now some -current results with the router:

clients: 3.8-beta, i386, sp kernel
router: 3.8-current, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth:              941 Mbits/sec
with TCP window size:           96-256KB
(no drop in speed with larger window size)

max UDP bandwidth:              905 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(you can't set higher UDP bandwidth with iperf)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99985               0.0008%
   124948               0.006%
   142764               0.0059%
   166459               0.053%
   196448               0.2%
   222766               1.7%
   231909               1.1%

clients: 3.8-beta, i386, sp kernel
router: 3.8-current, i386, sp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size:        750 Mbits/sec,  64KB
                                460 Mbits/sec,  96KB
                                751 Mbits/sec,  128KB
                                755 Mbits/sec,  192KB
                                760 Mbits/sec,  256KB
(strange drop at 96KB window, but no decrease at larger sizes)

max UDP bandwidth:              784 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(larger bandwidth tests failed)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99983               0%
   124949               0.0008%
   142755               0.0017%
   166433               0.099%
   196415               0.22%
   220741               1.9%
   229492               2.6%

clients: 3.8-beta, i386, sp kernel
router: 3.8-current, i386, mp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size:        770 Mbits/sec,  64KB
                                652 Mbits/sec,  96KB
                                783 Mbits/sec,  128KB
                                783 Mbits/sec,  192KB
                                786 Mbits/sec,  256KB
(strange drop at 96KB window, but no decrease at larger sizes)

max UDP bandwidth:              784 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(larger bandwidth tests failed)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99985               0%
   124946               0.0004%
   142758               0.00056%
   166428               0.0061%
   196229               15%
   222272               31%
   232224               37%

- And now some results with the mbuf tag merging patch

clients: 3.8-beta, i386, sp kernel
router: 3.8-patched, i386, sp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size:        722 Mbits/sec,  64KB
                                727 Mbits/sec,  96KB
                                756 Mbits/sec,  128KB
                                757 Mbits/sec,  192KB
                                753 Mbits/sec,  256KB
(no drop at 96KB window and no decrease at larger sizes)

max UDP bandwidth:              784 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(larger bandwidth test failed)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99982               0.0007%
   124944               0.0017%
   142764               0.00035%
   166432               0.019%
   196373               0.099%
   222594               0.31%
   231337               0.68%

clients: 3.8-beta, i386, sp kernel
router: 3.8-patched, i386, mp kernel, routing on integrated adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TCP bandwidth, win size:        738 Mbits/sec,  64KB
                                529 Mbits/sec,  96KB
                                750 Mbits/sec,  128KB
                                754 Mbits/sec,  192KB
                                742 Mbits/sec,  256KB
(strange drop at 96KB window, but no decrease at larger sizes)

max UDP bandwidth:              784 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(larger bandwidth test failed)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99982               0%
   124948               0.0012%
   142760               0.0017%
   166454               0.46%
   196614               7.1%
   222543               1.5%
   231195               0.76%

clients: 3.8-beta, i386, sp kernel
router: 3.8-patched, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth:              941 Mbits/sec
with TCP window size:           96-256KB
(no drop in speed with larger window size)

max UDP bandwidth:              905 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(you can't set higher UDP bandwidth with iperf)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99982               0%
   124948               0.00072%
   142762               0.28%
   166438               0.42%
   196330               1.7%
   221982               0.74%
   230845               1%

clients: 3.8-patched, i386, sp kernel
router: 3.8-patched, i386, sp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth:              941 Mbits/sec
with TCP window size:           96-256KB
(no drop in speed with larger window size)

max UDP bandwidth:              905 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(you can't set higher UDP bandwidth with iperf)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99981               0.0006%
   124948               0.0013%
   142754               0.34%
   166391               0.25%
   195806               1.2%
   220620               0.62%
   225440               0.72%

clients: 3.8-patched, i386, sp kernel
router: 3.8-patched, i386, mp kernel, routing on PCI-X adapter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
max TCP bandwidth:              941 Mbits/sec
with TCP window size:           96-256KB
(no drop in speed with larger window size)

max UDP bandwidth:              905 Mbits/sec
UDP packet size:                1470
dropped packets:                0%
(you can't set higher UDP bandwidth with iperf)

UDP pps results with 128 byte packet size:
        pps             %dropped
    19608               0%
    40000               0%
    83328               0%
    99981               0.0003%
   124946               0.009%
   142759               1.9%
   166398               1.6%
   195397               11%
   219772               25%
   225575               28%

Some new conclusions:
 - the integrated NIC isn't so crappy on i386, but it still can't
forward at wire speed
 - 3.8-current is better at handling NICs on different
interrupts than 3.8
 - 3.8-current helps TCP bandwidth, see the TCP window behaviour
 - the mbuf tag merging patch helps in all cases, especially
with the integrated NIC and the i386 mp kernel -> almost the same
performance as the i386 sp kernel
 - patching the clients didn't really help generate more pps
for testing -> I really need something other than iperf for testing
 - pps-wise, the integrated NIC gave the best results
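
To make the mp kernel comparison on the integrated adapter more
concrete, here is a small sketch that turns an offered rate and a drop
percentage into an approximate forwarded rate. The drop percentages are
the rounded values from the tables, so the results are only
approximations, and the function name is just for illustration.

```python
def forwarded_pps(offered_pps: float, dropped_percent: float) -> float:
    """Approximate packets per second that made it through the router."""
    return offered_pps * (1.0 - dropped_percent / 100.0)


# (offered pps, % dropped) at the highest rate tested,
# mp kernel on the router, routing on the integrated adapter.
current = (232224, 37.0)
patched = (231195, 0.76)

for name, (pps, drop) in (("3.8-current", current), ("3.8-patched", patched)):
    print(f"{name}: ~{forwarded_pps(pps, drop):.0f} pps forwarded")
```

Since the offered rate is nearly the same in both runs, the gap in
forwarded pps comes almost entirely from the drop percentage.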

One of the boxes is entering production any day now, so my test 
system will be gone in a day or two. 

Well, I did what I could; I hope you like this batch of tests too.

Daniel.

[1] http://marc.theaimsgroup.com/?l=openbsd-misc&m=112799668301101&w=2

PS.
As an extra, some openssl speed results:
(pay attention to how AES compares between 3.7 and 3.8 ;)

3.7_i386_sp:
~~~~~~~~~~~~
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1548.57k     3313.72k     4635.10k     5135.58k     5321.86k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4              12714.06k    45591.07k   132259.07k   270387.53k   388210.14k
md5              11666.25k    39277.10k   130997.55k   313924.23k   519818.53k
hmac(md5)        16551.11k    61901.17k   187540.01k   384052.60k   552534.09k
sha1             11424.05k    38394.59k   105503.58k   198112.81k   265333.19k
rmd160           11059.85k    36210.68k    90449.36k   150354.59k   186291.54k
rc4              98820.55k   115246.97k   120167.77k   121558.20k   121670.13k
des cbc          46027.11k    52908.69k    54425.36k    54627.82k    54896.35k
des ede3         20050.67k    21263.24k    21231.11k    21159.44k    21475.38k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc          22596.92k    24772.70k    25229.48k    25181.44k    25274.77k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc     73503.51k    86577.94k    89202.00k    88579.46k    88578.81k
cast cbc         63132.90k    76590.61k    79500.97k    78885.99k    79155.22k
aes-128 cbc      48604.32k    48520.34k    49077.19k    49214.63k    49128.23k
aes-192 cbc      43625.36k    43265.07k    43566.07k    43786.23k    43814.13k
aes-256 cbc      39313.08k    38787.85k    39195.71k    39295.39k    39210.22k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1164.7   9579.3
rsa 1024 bits   0.0041s   0.0003s    241.8   3781.1
rsa 2048 bits   0.0248s   0.0008s     40.3   1227.6
rsa 4096 bits   0.1641s   0.0028s      6.1    353.5
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0007s   0.0008s   1480.2   1214.8
dsa 1024 bits   0.0020s   0.0025s    491.5    407.4
dsa 2048 bits   0.0068s   0.0083s    147.4    120.3

3.7_i386_mp:
~~~~~~~~~~~~
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1553.11k     3328.70k     4649.76k     5163.21k     5334.33k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4              12420.50k    45159.40k   132458.48k   273321.93k   393801.14k
md5              11920.80k    40367.63k   133683.54k   320229.97k   525112.64k
hmac(md5)        16461.23k    61837.10k   188761.39k   386985.25k   553558.75k
sha1             11503.76k    39367.06k   108206.35k   200407.69k   265937.90k
rmd160           11096.09k    36410.41k    91091.43k   150764.44k   186421.07k
rc4              99019.40k   115866.37k   120211.56k   121819.26k   122262.20k
des cbc          44064.39k    49438.09k    50203.56k    49350.34k    49513.97k
des ede3         20552.70k    21382.55k    21592.88k    21429.16k    21429.84k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc          22542.44k    24817.61k    25289.31k    25292.12k    25327.16k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc     73662.93k    86988.84k    89388.91k    88763.79k    88808.36k
cast cbc         63411.49k    76755.37k    79548.22k    79190.58k    79255.56k
aes-128 cbc      46710.67k    46782.66k    46033.82k    47684.04k    47358.47k
aes-192 cbc      42281.06k    42017.08k    42048.47k    42209.89k    41942.50k
aes-256 cbc      37951.67k    37936.52k    38096.12k    38230.92k    38271.06k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1167.8   9428.1
rsa 1024 bits   0.0041s   0.0003s    242.3   3822.4
rsa 2048 bits   0.0248s   0.0008s     40.4   1233.1
rsa 4096 bits   0.1635s   0.0028s      6.1    355.1
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0007s   0.0008s   1467.5   1210.8
dsa 1024 bits   0.0020s   0.0024s    492.6    411.9
dsa 2048 bits   0.0068s   0.0083s    147.7    119.8

3.8-beta_i386_sp:
~~~~~~~~~~~~~~~~~
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1535.24k     3285.73k     4619.95k     5137.34k     5327.53k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4              12680.32k    46136.25k   132422.15k   272758.06k   391952.20k
md5              10971.94k    41069.23k   132725.51k   318642.89k   527921.16k
hmac(md5)        16619.68k    61174.97k   185029.80k   381320.85k   549603.23k
sha1             11442.32k    37328.82k   105131.30k   196801.21k   264075.06k
rmd160           10460.22k    34965.94k    88920.76k   149080.30k   185649.67k
rc4              99875.08k   115382.09k   118588.39k   119634.10k   120362.93k
des cbc          45570.15k    53500.94k    55750.59k    55792.06k    55713.44k
des ede3         20426.78k    21104.12k    21334.95k    20833.98k    20900.89k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc          22262.25k    24523.35k    25151.19k    25236.97k    25263.88k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc     73296.41k    86546.76k    88918.83k    88390.07k    88458.95k
cast cbc         63286.79k    76581.72k    78886.16k    79230.29k    79029.96k
aes-128 cbc      81707.88k    82470.32k    84064.61k    85016.72k    85062.65k
aes-192 cbc      71794.65k    71908.65k    73023.94k    72237.71k    72889.61k
aes-256 cbc      64194.17k    64428.52k    65797.45k    66163.52k    66204.65k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1151.2   9444.5
rsa 1024 bits   0.0041s   0.0003s    241.0   3744.1
rsa 2048 bits   0.0249s   0.0008s     40.2   1226.1
rsa 4096 bits   0.1642s   0.0028s      6.1    353.6
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0007s   0.0008s   1462.3   1205.5
dsa 1024 bits   0.0020s   0.0025s    489.0    405.4
dsa 2048 bits   0.0068s   0.0083s    147.0    120.9

3.8-beta_i386_mp:
~~~~~~~~~~~~~~~~~
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1531.90k     3298.49k     4627.48k     5146.54k     5315.27k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4              12663.28k    45258.59k   131442.90k   273197.76k   401168.50k
md5              10760.89k    40506.17k   133189.66k   319263.81k   529973.41k
hmac(md5)        16830.35k    61450.42k   187658.46k   384014.29k   552004.72k
sha1             10997.20k    36561.80k   102719.70k   196863.83k   264332.16k
rmd160           10301.32k    35163.11k    89088.09k   149640.76k   186157.08k
rc4             101259.75k   115630.65k   119519.51k   120788.79k   121227.99k
des cbc          46791.65k    53578.91k    55387.09k    54989.14k    55234.76k
des ede3         20065.78k    20825.64k    21060.47k    20771.21k    21225.72k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc          21741.81k    24360.72k    24198.04k    24996.15k    24448.09k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc     73657.78k    86936.58k    89067.42k    88866.87k    88792.03k
cast cbc         63426.38k    76743.82k    79383.64k    79272.57k    79274.61k
aes-128 cbc      81564.36k    81619.67k    84539.36k    85188.30k    85245.79k
aes-192 cbc      73181.34k    72115.03k    72416.79k    72776.46k    72184.86k
aes-256 cbc      64034.54k    65671.04k    65370.07k    65656.09k    65982.34k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1154.6   9533.7
rsa 1024 bits   0.0041s   0.0003s    242.5   3763.5
rsa 2048 bits   0.0248s   0.0008s     40.4   1225.5
rsa 4096 bits   0.1637s   0.0028s      6.1    354.7
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0007s   0.0008s   1462.9   1218.5
dsa 1024 bits   0.0020s   0.0025s    489.9    403.6
dsa 2048 bits   0.0068s   0.0083s    147.1    120.5

3.8-current_i386_sp:
~~~~~~~~~~~~~~~~~~~~
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1540.50k     3289.64k     4624.46k     5144.15k     5305.36k
mdc2                 0.00         0.00         0.00         0.00         0.00
md4              12876.77k    46252.28k   132591.61k   270247.71k   399411.88k
md5              10877.97k    40438.73k   131453.15k   315996.84k   527098.82k
hmac(md5)        16472.63k    61075.20k   185845.32k   380995.06k   552236.37k
sha1             11190.41k    37445.62k   105509.31k   197343.33k   265009.24k
rmd160           10327.34k    34560.47k    90274.66k   150372.24k   185195.91k
rc4             105920.53k   116991.15k   119232.57k   120682.27k   120951.18k
des cbc          46479.53k    53203.54k    55340.46k    53897.84k    53792.49k
des ede3         20657.56k    20810.68k    21176.81k    21275.20k    21262.94k
idea cbc             0.00         0.00         0.00         0.00         0.00
rc2 cbc          22083.99k    24570.90k    25190.84k    25239.36k    25203.86k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00
blowfish cbc     73472.56k    86799.89k    89215.25k    88565.54k    88581.51k
cast cbc         63286.19k    76377.62k    79386.21k    79090.22k    79144.32k
aes-128 cbc      81318.87k    82727.04k    83883.17k    84646.31k    85071.19k
aes-192 cbc      72475.49k    72493.32k    72843.45k    72434.94k    72174.70k
aes-256 cbc      62498.76k    64973.91k    64868.08k    67339.36k    66196.46k
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0009s   0.0001s   1148.4   9464.7
rsa 1024 bits   0.0041s   0.0003s    241.4   3762.3
rsa 2048 bits   0.0248s   0.0008s     40.3   1222.4
rsa 4096 bits   0.1643s   0.0028s      6.1    354.5
                  sign    verify    sign/s verify/s
dsa  512 bits   0.0007s   0.0008s   1476.4   1216.6
dsa 1024 bits   0.0020s   0.0025s    490.1    403.5
dsa 2048 bits   0.0068s   0.0083s    147.3    120.3
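
To put a number on the AES difference, here is a short sketch that
divides the 3.8-beta figures by the 3.7 figures. The values are the
8192 byte column of the two sp tables above (openssl speed reports them
in thousands of bytes per second); the helper name is only for
illustration.

```python
# 8192-byte column, in 1000s of bytes/sec, copied from the sp tables above.
aes_37 = {"aes-128 cbc": 49128.23, "aes-192 cbc": 43814.13, "aes-256 cbc": 39210.22}
aes_38 = {"aes-128 cbc": 85062.65, "aes-192 cbc": 72889.61, "aes-256 cbc": 66204.65}


def speedup(old: float, new: float) -> float:
    """How many times faster the new figure is than the old one."""
    return new / old


for cipher in aes_37:
    print(f"{cipher}: {speedup(aes_37[cipher], aes_38[cipher]):.2f}x")
```

All three key sizes come out at roughly 1.7x, while the other ciphers
in the tables stay about the same between releases.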
