AES performance in 1.0.0, as reported by "openssl speed", appears to be
much lower at block sizes of 16, 64 and 256 bytes than in previous
releases. The larger block sizes of 1024 and 8192 bytes show good
performance, on par with 0.9.8i and 0.9.8n. Tests were run on a RedHat
Linux system with an Intel X86-64 CPU, for both the 32-bit and 64-bit
builds of the code. Should lower AES performance for small buffer sizes
be expected with 1.0.0, or is "speed aes" reporting inaccurate data?

Initial measurements (not yet tabulated) also suggest that on Windows
the "asm" AES code is slower than the non-ASM C code, whereas all other
algorithms seem to speed up with ASM. Is the 1.0.0 ASM code somehow
optimized for large blocks at the expense of performance for smaller
blocks?

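Here are the Linux measurements. Columns are block size in bytes,
cipher, throughput in 1000s of bytes per second, OpenSSL version and
build. Within each group, rows are sorted by increasing throughput: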

      16 aes-128 cbc       48396.56k        1.0.0   IA32
      16 aes-128 cbc       62149.03k        0.9.7m  IA32
      16 aes-128 cbc       68019.43k        1.0.0   X86-64
      16 aes-128 cbc       88312.45k        0.9.8i  IA32
      16 aes-128 cbc       88318.84k        0.9.8g  IA32
      16 aes-128 cbc       88327.42k        0.9.8n  IA32
      16 aes-128 cbc       93778.14k        0.9.8n  X86-64
      16 aes-128 cbc       95182.37k        0.9.8g  X86-64
      16 aes-128 cbc      108289.21k        0.9.7m  X86-64
      16 aes-128 cbc      118355.21k        0.9.8i  X86-64

      16 aes-192 cbc       41264.04k        1.0.0   IA32
      16 aes-192 cbc       53496.43k        0.9.7m  IA32
      16 aes-192 cbc       58719.91k        1.0.0   X86-64
      16 aes-192 cbc       62074.93k        0.9.8n  IA32
      16 aes-192 cbc       77750.58k        0.9.8i  IA32
      16 aes-192 cbc       78042.00k        0.9.8g  IA32
      16 aes-192 cbc       83391.87k        0.9.8g  X86-64
      16 aes-192 cbc       97393.73k        0.9.7m  X86-64
      16 aes-192 cbc      102274.46k        0.9.8n  X86-64
      16 aes-192 cbc      103036.48k        0.9.8i  X86-64

      16 aes-256 cbc       35870.91k        1.0.0   IA32
      16 aes-256 cbc       47263.12k        0.9.7m  IA32
      16 aes-256 cbc       50904.49k        1.0.0   X86-64
      16 aes-256 cbc       56657.66k        0.9.8n  IA32
      16 aes-256 cbc       69572.85k        0.9.8i  IA32
      16 aes-256 cbc       70551.44k        0.9.8g  IA32
      16 aes-256 cbc       79190.54k        0.9.8g  X86-64
      16 aes-256 cbc       83132.01k        0.9.7m  X86-64
      16 aes-256 cbc       94170.49k        0.9.8n  X86-64
      16 aes-256 cbc       94397.19k        0.9.8i  X86-64

      64 aes-128 cbc       52209.00k        1.0.0   IA32
      64 aes-128 cbc       63627.39k        0.9.7m  IA32
      64 aes-128 cbc       73060.07k        1.0.0   X86-64
      64 aes-128 cbc       99894.63k        0.9.8g  X86-64
      64 aes-128 cbc      112159.53k        0.9.7m  X86-64
      64 aes-128 cbc      116034.03k        0.9.8g  IA32
      64 aes-128 cbc      116508.86k        0.9.8i  IA32
      64 aes-128 cbc      116515.52k        0.9.8n  IA32
      64 aes-128 cbc      145195.20k        0.9.8n  X86-64
      64 aes-128 cbc      158655.85k        0.9.8i  X86-64

      64 aes-192 cbc       43721.28k        1.0.0   IA32
      64 aes-192 cbc       55000.85k        0.9.7m  IA32
      64 aes-192 cbc       61552.15k        1.0.0   X86-64
      64 aes-192 cbc       93222.16k        0.9.8g  X86-64
      64 aes-192 cbc       93684.63k        0.9.8n  IA32
      64 aes-192 cbc       99082.52k        0.9.7m  X86-64
      64 aes-192 cbc      101047.74k        0.9.8i  IA32
      64 aes-192 cbc      101152.17k        0.9.8g  IA32
      64 aes-192 cbc      135756.22k        0.9.8n  X86-64
      64 aes-192 cbc      137924.99k        0.9.8i  X86-64

      64 aes-256 cbc       37898.22k        1.0.0   IA32
      64 aes-256 cbc       48498.52k        0.9.7m  IA32
      64 aes-256 cbc       53026.86k        1.0.0   X86-64
      64 aes-256 cbc       81260.37k        0.9.8g  X86-64
      64 aes-256 cbc       82552.66k        0.9.8n  IA32
      64 aes-256 cbc       88551.06k        0.9.8i  IA32
      64 aes-256 cbc       89076.07k        0.9.8g  IA32
      64 aes-256 cbc       89409.30k        0.9.7m  X86-64
      64 aes-256 cbc      122172.65k        0.9.8n  X86-64
      64 aes-256 cbc      122190.59k        0.9.8i  X86-64

     256 aes-128 cbc       53574.14k        1.0.0   IA32
     256 aes-128 cbc       64837.03k        0.9.7m  IA32
     256 aes-128 cbc       74748.16k        1.0.0   X86-64
     256 aes-128 cbc      102532.35k        0.9.8g  X86-64
     256 aes-128 cbc      116806.55k        0.9.7m  X86-64
     256 aes-128 cbc      126397.87k        0.9.8g  IA32
     256 aes-128 cbc      127568.81k        0.9.8i  IA32
     256 aes-128 cbc      127617.28k        0.9.8n  IA32
     256 aes-128 cbc      172727.04k        0.9.8n  X86-64
     256 aes-128 cbc      176899.24k        0.9.8i  X86-64

     256 aes-192 cbc       44478.29k        1.0.0   IA32
     256 aes-192 cbc       55660.18k        0.9.7m  IA32
     256 aes-192 cbc       62680.58k        1.0.0   X86-64
     256 aes-192 cbc       96900.10k        0.9.8g  X86-64
     256 aes-192 cbc      103281.66k        0.9.7m  X86-64
     256 aes-192 cbc      107901.61k        0.9.8n  IA32
     256 aes-192 cbc      110372.44k        0.9.8i  IA32
     256 aes-192 cbc      110426.45k        0.9.8g  IA32
     256 aes-192 cbc      150373.21k        0.9.8n  X86-64
     256 aes-192 cbc      152013.91k        0.9.8i  X86-64

     256 aes-256 cbc       38578.09k        1.0.0   IA32
     256 aes-256 cbc       49249.88k        0.9.7m  IA32
     256 aes-256 cbc       53894.49k        1.0.0   X86-64
     256 aes-256 cbc       82765.23k        0.9.8g  X86-64
     256 aes-256 cbc       93133.91k        0.9.7m  X86-64
     256 aes-256 cbc       94662.91k        0.9.8n  IA32
     256 aes-256 cbc       95682.39k        0.9.8g  IA32
     256 aes-256 cbc       95718.31k        0.9.8i  IA32
     256 aes-256 cbc      133965.57k        0.9.8n  X86-64
     256 aes-256 cbc      134034.09k        0.9.8i  X86-64

    1024 aes-128 cbc       65167.36k        0.9.7m  IA32
    1024 aes-128 cbc      102270.63k        0.9.8g  X86-64
    1024 aes-128 cbc      117155.13k        0.9.7m  X86-64
    1024 aes-128 cbc      129688.23k        0.9.8g  IA32
    1024 aes-128 cbc      130639.87k        1.0.0   IA32
    1024 aes-128 cbc      130673.32k        0.9.8n  IA32
    1024 aes-128 cbc      130711.89k        0.9.8i  IA32
    1024 aes-128 cbc      182214.66k        0.9.8i  X86-64
    1024 aes-128 cbc      182685.01k        0.9.8n  X86-64
    1024 aes-128 cbc      182722.56k        1.0.0   X86-64

    1024 aes-192 cbc       56220.68k        0.9.7m  IA32
    1024 aes-192 cbc       92319.40k        0.9.8g  X86-64
    1024 aes-192 cbc      103504.21k        0.9.7m  X86-64
    1024 aes-192 cbc      111468.20k        0.9.8n  IA32
    1024 aes-192 cbc      111897.60k        1.0.0   IA32
    1024 aes-192 cbc      112902.49k        0.9.8g  IA32
    1024 aes-192 cbc      112948.57k        0.9.8i  IA32
    1024 aes-192 cbc      154645.85k        0.9.8n  X86-64
    1024 aes-192 cbc      156428.97k        1.0.0   X86-64
    1024 aes-192 cbc      157285.37k        0.9.8i  X86-64

    1024 aes-256 cbc       49435.31k        0.9.7m  IA32
    1024 aes-256 cbc       83221.85k        0.9.8g  X86-64
    1024 aes-256 cbc       94171.36k        0.9.7m  X86-64
    1024 aes-256 cbc       97642.50k        0.9.8g  IA32
    1024 aes-256 cbc       98040.15k        0.9.8n  IA32
    1024 aes-256 cbc       98308.44k        0.9.8i  IA32
    1024 aes-256 cbc       98555.56k        1.0.0   IA32
    1024 aes-256 cbc      136102.91k        1.0.0   X86-64
    1024 aes-256 cbc      136150.02k        0.9.8n  X86-64
    1024 aes-256 cbc      137167.53k        0.9.8i  X86-64

    8192 aes-128 cbc       65213.78k        0.9.7m  IA32
    8192 aes-128 cbc      102479.19k        0.9.8g  X86-64
    8192 aes-128 cbc      117910.00k        0.9.7m  X86-64
    8192 aes-128 cbc      131528.02k        0.9.8n  IA32
    8192 aes-128 cbc      131552.60k        0.9.8g  IA32
    8192 aes-128 cbc      131555.33k        1.0.0   IA32
    8192 aes-128 cbc      131601.75k        0.9.8i  IA32
    8192 aes-128 cbc      184802.20k        0.9.8i  X86-64
    8192 aes-128 cbc      185281.19k        0.9.8n  X86-64
    8192 aes-128 cbc      185476.20k        1.0.0   X86-64

    8192 aes-192 cbc       56011.43k        0.9.7m  IA32
    8192 aes-192 cbc       92312.92k        0.9.8g  X86-64
    8192 aes-192 cbc      103623.34k        0.9.7m  X86-64
    8192 aes-192 cbc      112574.46k        0.9.8n  IA32
    8192 aes-192 cbc      112765.61k        1.0.0   IA32
    8192 aes-192 cbc      113751.38k        0.9.8g  IA32
    8192 aes-192 cbc      113879.76k        0.9.8i  IA32
    8192 aes-192 cbc      156132.94k        0.9.8n  X86-64
    8192 aes-192 cbc      158081.02k        0.9.8i  X86-64
    8192 aes-192 cbc      158291.29k        1.0.0   X86-64

    8192 aes-256 cbc       49504.26k        0.9.7m  IA32
    8192 aes-256 cbc       83320.83k        0.9.8g  X86-64
    8192 aes-256 cbc       94254.42k        0.9.7m  X86-64
    8192 aes-256 cbc       98284.89k        0.9.8g  IA32
    8192 aes-256 cbc       99057.66k        0.9.8n  IA32
    8192 aes-256 cbc       99098.62k        1.0.0   IA32
    8192 aes-256 cbc       99172.35k        0.9.8i  IA32
    8192 aes-256 cbc      137052.16k        0.9.8n  X86-64
    8192 aes-256 cbc      137783.98k        0.9.8i  X86-64
    8192 aes-256 cbc      137907.80k        1.0.0   X86-64
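
To put a number on the slowdown, the figures above can be reduced to a
ratio per block size. The following is only a minimal sketch: it uses the
aes-128 cbc X86-64 rows for 1.0.0 and 0.9.8n copied from the table, and
the names (OLD_0_9_8N, NEW_1_0_0, relative_throughput) are made up for
illustration, not part of any OpenSSL tooling.

```python
# Throughput in 1000s of bytes per second, copied from the X86-64
# aes-128 cbc rows above. Keys are block sizes in bytes.
OLD_0_9_8N = {16: 93778.14, 64: 145195.20, 256: 172727.04,
              1024: 182685.01, 8192: 185281.19}
NEW_1_0_0 = {16: 68019.43, 64: 73060.07, 256: 74748.16,
             1024: 182722.56, 8192: 185476.20}


def relative_throughput(new, old):
    """Return {block_size: new/old} for block sizes present in both."""
    return {size: new[size] / old[size] for size in sorted(new) if size in old}


ratios = relative_throughput(NEW_1_0_0, OLD_0_9_8N)
for size, ratio in ratios.items():
    print(f"{size:5d} bytes: 1.0.0 runs at {ratio:.0%} of 0.9.8n")
```

With these inputs, 1.0.0 comes out at roughly 73%, 50% and 43% of the
0.9.8n throughput for 16, 64 and 256 bytes, and within about 0.1% of it
at 1024 and 8192 bytes. The same comparison can be repeated for the other
key sizes and for the IA32 rows by swapping in the corresponding values.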

32-bit build compiler switches:
-------------------------------

=== OpenSSL 0.9.7m 23 Feb 2007 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) 
blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_HW -DOPENSSL_NO_RC5 
-DOPENSSL_NO_IDEA -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -m486 -Wall 
-DSHA1_ASM -DMD5_ASM -DRMD160_ASM

=== OpenSSL 0.9.8g 19 Oct 2007 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) 
blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall 
-DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM 
-DRMD160_ASM -DAES_ASM

=== OpenSSL 0.9.8i 15 Sep 2008 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) 
blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall 
-DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM 
-DRMD160_ASM -DAES_ASM

=== OpenSSL 0.9.8n 24 Mar 2010 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) 
blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall 
-DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM 
-DRMD160_ASM -DAES_ASM

=== OpenSSL 1.0.0 29 Mar 2010 ===
options:bn(64,32) rc4(4x,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall 
-DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT 
-DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM 
-DWHIRLPOOL_ASM

64-bit build compiler switches:
-------------------------------

=== OpenSSL 0.9.7m 23 Feb 2007 ===
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) 
blowfish(ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_HW -DOPENSSL_NO_RC5 
-DOPENSSL_NO_IDEA -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int

=== OpenSSL 0.9.8g 19 Oct 2007 ===
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) 
blowfish(ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DMD5_ASM

=== OpenSSL 0.9.8i 15 Sep 2008 ===
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) 
blowfish(ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int 
-DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM

=== OpenSSL 0.9.8n 24 Mar 2010 ===
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) 
blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int 
-DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM

=== OpenSSL 1.0.0 29 Mar 2010 ===
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN 
-DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int 
-DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM 
-DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM

-- 
        Viktor.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
