Hi Tony,

On 02/12/20 15:51, Jan Just Keijser wrote:

On 02/12/20 15:22, Tony He wrote:
Hi Jan,

Welcome to the discussion.

>the second set of numbers doesn't make sense, and a much better test is to do an actual encryption test

I didn't compile the cryptodev kernel module for my PC, so I cannot reproduce this issue for now. You don't understand why the performance is much worse with the cryptodev module for *big* blocks, right? If so, my guess is that the kernel may assign the work to multiple cores while OpenSSL uses one core. Would you share the output of the command "mpstat -P ALL 2"?

sure, while using the cryptodev I see this:

15:28:36     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
15:28:38     all    1.87    0.00   23.19    0.12    0.00    0.00    0.00    0.00    0.00   74.81
15:28:38       0    0.00    0.00    0.00    0.50    0.00    0.00    0.00    0.00    0.00   99.50
15:28:38       1    7.00    0.00   93.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
15:28:38       2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
15:28:38       3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

15:28:38     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
15:28:40     all    0.75    0.00   24.19    0.00    0.00    0.00    0.00    0.00    0.00   75.06
15:28:40       0    0.00    0.00    0.00    0.50    0.00    0.00    0.00    0.00    0.00   99.50
15:28:40       1    3.50    0.00   96.50    0.00    0.00    0.00    0.00    0.00    0.00    0.00
15:28:40       2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
15:28:40       3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

on a 4 core box; this means that 1 core is used 100% (which is what I expected).


I suspect the main reason the cryptodev results on my i5-6800 go off the rails is this
(look at the "Doing aes-128-cbc" lines):

$ ./openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 2835368 aes-128-cbc's in 1.14s
Doing aes-128-cbc for 3s on 64 size blocks: 2720745 aes-128-cbc's in 1.01s
Doing aes-128-cbc for 3s on 256 size blocks: 2377830 aes-128-cbc's in *0.74s*
Doing aes-128-cbc for 3s on 1024 size blocks: 1538693 aes-128-cbc's in *0.40s*
Doing aes-128-cbc for 3s on 8192 size blocks: 370202 aes-128-cbc's in *0.11s*
OpenSSL 1.0.2m  2 Nov 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include  -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      39794.64k   172403.64k   822600.65k  3939054.08k 27569952.58k


The reported timings are way off and probably just wrong. The OpenSSL speed test is supposed to run for 3 seconds. The actual result returned for 8192-byte blocks is

Doing aes-128-cbc for 3s on 8192 size blocks: 370202 aes-128-cbc's in *0.11s*

whereas without cryptodev I see

Doing aes-128-cbc for 3s on 8192 size blocks: 457255 aes-128-cbc's in *3.00s*

So you can see that without cryptodev the i5-6800 actually says it's doing more blocks (457,255 vs 370,202) but with cryptodev it is doing it in WAY less time.  This leads me to believe the openssl speed code when using cryptodev just "goes wrong". It will be very interesting to see what the encryption test will bring - that is a much better real-life-like example than a simple speed test.
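
For reference, the summary figures can be reproduced from the "Doing ..." lines: each figure is count x block size / reported time, in 1000s of bytes per second. The sketch below recomputes the 8192-byte figure from the counts quoted above, once with the reported 0.11s and once under the assumption that the run really lasted the nominal 3 seconds. The helper name `speed_k` is made up for illustration and is not part of OpenSSL.

```python
def speed_k(count, block_bytes, seconds):
    """Throughput in 1000s of bytes per second, like the openssl speed summary."""
    return count * block_bytes / seconds / 1000.0


# With cryptodev: 370202 blocks of 8192 bytes, reported time 0.11s.
cpu_time_k = speed_k(370202, 8192, 0.11)

# Same count, but ASSUMING the run took the nominal 3 seconds of wall-clock time.
nominal_k = speed_k(370202, 8192, 3.00)

# Without cryptodev: 457255 blocks of 8192 bytes in 3.00s.
no_cryptodev_k = speed_k(457255, 8192, 3.00)

print(f"cryptodev, reported 0.11s : {cpu_time_k:16,.2f}k")
print(f"cryptodev, assumed 3.00s  : {nominal_k:16,.2f}k")
print(f"no cryptodev, 3.00s       : {no_cryptodev_k:16,.2f}k")
```

With the reported 0.11s the formula gives the 27,569,952.58k from the table; with 3 seconds the same count comes out at about 1,010,898k, slightly below the roughly 1,248,611k obtained without cryptodev.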

as a follow-up : someone whispered in my ear (thanks, André ;) ) that one should use the -elapsed option for this, so here are new results:

*with* cryptodev:

./openssl speed -evp aes-128-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 2825786 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 2716822 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 2369723 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 1536054 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 369984 aes-128-cbc's in 3.00s
[...]
aes-128-cbc      15,070.86k    57,958.87k   202,216.36k 524,306.43k  1,010,302.98k

*without* cryptodev:

$ openssl speed -evp aes-128-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 207188725 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 56855717 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 14382122 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 3618996 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 456727 aes-128-cbc's in 3.00s
[...]
aes-128-cbc    1,105,006.53k  1,212,921.96k  1,227,274.41k 1,235,283.97k  1,247,169.19k

which more or less reflects the encryption test results I posted earlier.
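
To put a number on the gap, here is a small sketch that turns the two sets of -elapsed counts above into a per-block-size slowdown factor. The variable and function names are my own; the counts are copied from the two runs, and the 3.00s duration is taken from the output.

```python
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
WITH_CRYPTODEV = [2825786, 2716822, 2369723, 1536054, 369984]
WITHOUT_CRYPTODEV = [207188725, 56855717, 14382122, 3618996, 456727]
ELAPSED_S = 3.00  # both runs report 3.00s per block size


def throughput_k(count, size, seconds=ELAPSED_S):
    """Throughput in 1000s of bytes per second."""
    return count * size / seconds / 1000.0


def slowdown_by_block_size():
    """How many times slower the cryptodev run is, per block size."""
    return {
        size: without / with_dev
        for size, with_dev, without in zip(
            BLOCK_SIZES, WITH_CRYPTODEV, WITHOUT_CRYPTODEV
        )
    }


slowdown = slowdown_by_block_size()
for size, factor in slowdown.items():
    print(f"{size:5d} bytes: cryptodev is {factor:6.2f}x slower")
```

The factor shrinks as the blocks grow: very large for 16-byte blocks and only about 1.2x for 8192-byte blocks, which is in the same range as the 549MB/s vs 426MB/s from the encryption test.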
The question becomes: what are your results when using the -elapsed flag?

JJK


>My advice is to rerun your tests *without* the cryptodev module and then decide whether you really need CBC+CCM hmacs.

Yes, I confirm that without cryptodev the performance is very bad on my device. I don't have that device at hand right now, but I saved one aes-256-cbc result in my web notebook, as below:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      19626.95k    24289.71k    25054.46k    25347.75k    25337.86k
Please note, there are two modes to accelerate encryption/decryption.
1. HW instructions in the CPU, as on Intel x86.
2. Using a crypto engine.
When your device is of type 2 and its CPU is not powerful, cryptodev is normally much faster, at least for big blocks. For small blocks it may be slower, because pushing the work to the kernel and then on to the HW engine can take longer than letting OpenSSL do the encryption/decryption directly.
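
One way to picture the small-block penalty is a simple linear cost model: time per call = fixed overhead + bytes / bandwidth. This model is an assumption made for illustration and was not measured on the embedded device. The sketch fits it with an ordinary least-squares line to the cryptodev -elapsed counts from the i5-6800 run earlier in this thread; all names in the code are hypothetical.

```python
# Counts from the "with cryptodev" -elapsed run in this thread (3.00s each).
SIZES = [16, 64, 256, 1024, 8192]
COUNTS = [2825786, 2716822, 2369723, 1536054, 369984]
ELAPSED_S = 3.00


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Average wall-clock time spent per encryption call, per block size.
per_call_s = [ELAPSED_S / count for count in COUNTS]

# ASSUMED model: per_call = overhead + size * per_byte.
overhead_s, per_byte_s = fit_line(SIZES, per_call_s)
bandwidth_bytes_s = 1.0 / per_byte_s

print(f"fixed overhead per call : {overhead_s * 1e6:.2f} us")
print(f"asymptotic bandwidth    : {bandwidth_bytes_s / 1e6:.0f} MB/s")
```

On these numbers the fit gives a fixed cost of about 1 microsecond per call. For 16-byte blocks that fixed cost is nearly the whole call, while for 8192-byte blocks it is a small fraction, which is the pattern described above. A device with a real crypto engine and a slow CPU would have very different constants.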
Tony

Jan Just Keijser <janj...@nikhef.nl <mailto:janj...@nikhef.nl>> wrote on Wed, Dec 2, 2020 at 7:24 PM:

    hi Tony,

    On 01/12/20 02:50, Tony He wrote:
    Hi Arne,

    openssl speed -evp aes-128-cbc
    type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
    aes-128-cbc 20035.60k 123261.54k 267081.60k 1094764.09k 9181370.18k
    openssl speed -evp aes-128-gcm
    type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
    aes-128-gcm 18738.76k 19284.91k 19524.44k 19606.87k 19685.46k
    openssl speed -evp aes-128-ccm
    type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
    aes-128-ccm 53859.07k 215581.12k 862070.02k 3460786.43k 27566347.61k
    openssl speed -evp sha1
    type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
    sha1 3108.57k 12177.79k 57325.18k 181610.34k 1207364.27k
    openssl speed -evp chacha20-poly1305
    chacha20-poly1305 is an unknown cipher or digest
    Using old openssl, so chacha20-poly1305 is not supported.

    these numbers look suspiciously like you're using the linux
    cryptodev module. Openssl speed results for the linux cryptodev
    module are totally unreliable and I'd even go so far as to say
    that the *only* numbers I trust in the output above are for
    aes-128-gcm

    For example, if I do the same on an i5-6800 I get *without* the
    cryptodev module:
      $ openssl speed -evp aes-128-cbc
      aes-128-cbc    1,104,599.38k  1,208,651.07k  1,231,766.70k  1,237,545.64k  1,248,793.94k

    and with the module I get
      aes-128-cbc       45,087.41k    127,822.72k    581,517.17k  2,256,593.19k 27,583,804.51k

    the second set of numbers doesn't make sense, and a much better
    test is to do an actual encryption test, e.g.

    *without* the module
    cat BIGFILE | openssl aes-256-cbc -e -pass pass:thisisabadpassword | pv > /dev/null
    2.93GB 0:00:05 [ 549MB/s] [ <=> ]

    ('pv' aka 'pipeview' is a handy tool to measure the throughput of
    a UNIX pipe)

    and with the module:
    cat BIGFILE | ./openssl aes-256-cbc -e -pass pass:thisisabadpassword -engine cryptodev | pv > /dev/null
    engine "cryptodev" set.
    2.93GB 0:00:07 [ 426MB/s] [              <=>

    so you see that using the cryptodev module actually slows things
    down - which is to be expected, as the application needs to do
    more work using the cryptodev module.

    My advice is to rerun your tests *without* the cryptodev module
    and then decide whether you really need CBC+CCM hmacs.

    HTH,

    JJK


    Arne Schwabe <a...@rfc2549.org <mailto:a...@rfc2549.org>>
    wrote on Thu, Nov 26, 2020 at 6:40 PM:

        On 26.11.20 at 10:41, Tony He wrote:
        > Hi Arne,
        >
        >>Since the original thread was not on the mailing list I am missing your
        >>goal but if your crypto accelerator already works with OpenSSL, then it
        >>will also work with the "normal" OpenVPN
        >
        > Yes, it works with "normal" OpenVPN (OpenVPN2), but according to the test
        > result, it's still not fast (about 60Mbps).
        > The bottleneck is not the encryption operation any more. It comes from the
        > switching between user space and kernel space in OpenVPN2,
        > which makes the poor CPU of the embedded device very busy. That's why we
        > need OpenVPN3 running in the kernel space.


        What numbers are we talking about in crypto speed? Could you
        provide the output of the following from your "poor" device:


        openssl speed -evp aes-128-cbc
        openssl speed -evp aes-128-gcm
        openssl speed -evp aes-128-ccm
        openssl speed -evp sha1
        openssl speed -evp chacha20-poly1305

        I want to know what difference/gain in terms of raw crypto speed
        we are talking about here.

        Arne






    _______________________________________________
    Openvpn-devel mailing list
    Openvpn-devel@lists.sourceforge.net
    <mailto:Openvpn-devel@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/openvpn-devel


