The performance of the 1.0.0 AES code, as reported by "openssl speed", appears to be much lower than in previous releases for block sizes of 16, 64 and 256 bytes. Larger block sizes of 1024 and 8192 bytes show good performance. Is this to be expected? Tests were run on a Red Hat Linux system with an Intel X86-64 CPU, using both 32-bit and 64-bit builds of the code. Should lower AES performance for small buffer sizes be expected with 1.0.0, or is "speed aes" reporting inaccurate data?
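To put a number on "much lower", here is a minimal Python sketch that divides the 1.0.0 throughput by the 0.9.8n throughput for the smallest and largest block sizes. The figures are copied from the aes-128 cbc rows of the Linux measurements in this message; the names SAMPLES and relative_throughput are made up for this illustration and are not part of OpenSSL.

```python
# Throughput in 1000s of bytes/s, copied from the aes-128 cbc rows of the
# Linux measurements in this message. Keys are (block size, arch, version).
# SAMPLES and relative_throughput are illustrative names, not OpenSSL APIs.
SAMPLES = {
    (16, "IA32", "1.0.0"): 48396.56,
    (16, "IA32", "0.9.8n"): 88327.42,
    (16, "X86-64", "1.0.0"): 68019.43,
    (16, "X86-64", "0.9.8n"): 93778.14,
    (8192, "IA32", "1.0.0"): 131555.33,
    (8192, "IA32", "0.9.8n"): 131528.02,
    (8192, "X86-64", "1.0.0"): 185476.20,
    (8192, "X86-64", "0.9.8n"): 185281.19,
}


def relative_throughput(samples, block, arch, new="1.0.0", old="0.9.8n"):
    """Return the new release's throughput as a fraction of the old one's."""
    return samples[(block, arch, new)] / samples[(block, arch, old)]


if __name__ == "__main__":
    for block in (16, 8192):
        for arch in ("IA32", "X86-64"):
            ratio = relative_throughput(SAMPLES, block, arch)
            print(f"{block:>5} bytes, {arch:<6}: 1.0.0 at {ratio:.0%} of 0.9.8n")
```

For these rows, 1.0.0 comes out at roughly 55% (IA32) and 73% (X86-64) of the 0.9.8n figure with 16-byte blocks, and essentially level with it at 8192 bytes.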
Initial measurements (not yet tabulated) also suggest that "asm" AES on Windows is slower than the non-ASM C code (all other algorithms seem to speed up with ASM). Is the 1.0.0 ASM code somehow optimized for large blocks at the expense of performance for smaller blocks?

Here are the Linux measurements (throughput in 1000s of bytes per second, as reported by "openssl speed"), sorted by throughput within each block size and key size:

Block  Cipher       Throughput  Version  Arch

16     aes-128 cbc   48396.56k  1.0.0    IA32
16     aes-128 cbc   62149.03k  0.9.7m   IA32
16     aes-128 cbc   68019.43k  1.0.0    X86-64
16     aes-128 cbc   88312.45k  0.9.8i   IA32
16     aes-128 cbc   88318.84k  0.9.8g   IA32
16     aes-128 cbc   88327.42k  0.9.8n   IA32
16     aes-128 cbc   93778.14k  0.9.8n   X86-64
16     aes-128 cbc   95182.37k  0.9.8g   X86-64
16     aes-128 cbc  108289.21k  0.9.7m   X86-64
16     aes-128 cbc  118355.21k  0.9.8i   X86-64

16     aes-192 cbc   41264.04k  1.0.0    IA32
16     aes-192 cbc   53496.43k  0.9.7m   IA32
16     aes-192 cbc   58719.91k  1.0.0    X86-64
16     aes-192 cbc   62074.93k  0.9.8n   IA32
16     aes-192 cbc   77750.58k  0.9.8i   IA32
16     aes-192 cbc   78042.00k  0.9.8g   IA32
16     aes-192 cbc   83391.87k  0.9.8g   X86-64
16     aes-192 cbc   97393.73k  0.9.7m   X86-64
16     aes-192 cbc  102274.46k  0.9.8n   X86-64
16     aes-192 cbc  103036.48k  0.9.8i   X86-64

16     aes-256 cbc   35870.91k  1.0.0    IA32
16     aes-256 cbc   47263.12k  0.9.7m   IA32
16     aes-256 cbc   50904.49k  1.0.0    X86-64
16     aes-256 cbc   56657.66k  0.9.8n   IA32
16     aes-256 cbc   69572.85k  0.9.8i   IA32
16     aes-256 cbc   70551.44k  0.9.8g   IA32
16     aes-256 cbc   79190.54k  0.9.8g   X86-64
16     aes-256 cbc   83132.01k  0.9.7m   X86-64
16     aes-256 cbc   94170.49k  0.9.8n   X86-64
16     aes-256 cbc   94397.19k  0.9.8i   X86-64

64     aes-128 cbc   52209.00k  1.0.0    IA32
64     aes-128 cbc   63627.39k  0.9.7m   IA32
64     aes-128 cbc   73060.07k  1.0.0    X86-64
64     aes-128 cbc   99894.63k  0.9.8g   X86-64
64     aes-128 cbc  112159.53k  0.9.7m   X86-64
64     aes-128 cbc  116034.03k  0.9.8g   IA32
64     aes-128 cbc  116508.86k  0.9.8i   IA32
64     aes-128 cbc  116515.52k  0.9.8n   IA32
64     aes-128 cbc  145195.20k  0.9.8n   X86-64
64     aes-128 cbc  158655.85k  0.9.8i   X86-64

64     aes-192 cbc   43721.28k  1.0.0    IA32
64     aes-192 cbc   55000.85k  0.9.7m   IA32
64     aes-192 cbc   61552.15k  1.0.0    X86-64
64     aes-192 cbc   93222.16k  0.9.8g   X86-64
64     aes-192 cbc   93684.63k  0.9.8n   IA32
64     aes-192 cbc   99082.52k  0.9.7m   X86-64
64     aes-192 cbc  101047.74k  0.9.8i   IA32
64     aes-192 cbc  101152.17k  0.9.8g   IA32
64     aes-192 cbc  135756.22k  0.9.8n   X86-64
64     aes-192 cbc  137924.99k  0.9.8i   X86-64

64     aes-256 cbc   37898.22k  1.0.0    IA32
64     aes-256 cbc   48498.52k  0.9.7m   IA32
64     aes-256 cbc   53026.86k  1.0.0    X86-64
64     aes-256 cbc   81260.37k  0.9.8g   X86-64
64     aes-256 cbc   82552.66k  0.9.8n   IA32
64     aes-256 cbc   88551.06k  0.9.8i   IA32
64     aes-256 cbc   89076.07k  0.9.8g   IA32
64     aes-256 cbc   89409.30k  0.9.7m   X86-64
64     aes-256 cbc  122172.65k  0.9.8n   X86-64
64     aes-256 cbc  122190.59k  0.9.8i   X86-64

256    aes-128 cbc   53574.14k  1.0.0    IA32
256    aes-128 cbc   64837.03k  0.9.7m   IA32
256    aes-128 cbc   74748.16k  1.0.0    X86-64
256    aes-128 cbc  102532.35k  0.9.8g   X86-64
256    aes-128 cbc  116806.55k  0.9.7m   X86-64
256    aes-128 cbc  126397.87k  0.9.8g   IA32
256    aes-128 cbc  127568.81k  0.9.8i   IA32
256    aes-128 cbc  127617.28k  0.9.8n   IA32
256    aes-128 cbc  172727.04k  0.9.8n   X86-64
256    aes-128 cbc  176899.24k  0.9.8i   X86-64

256    aes-192 cbc   44478.29k  1.0.0    IA32
256    aes-192 cbc   55660.18k  0.9.7m   IA32
256    aes-192 cbc   62680.58k  1.0.0    X86-64
256    aes-192 cbc   96900.10k  0.9.8g   X86-64
256    aes-192 cbc  103281.66k  0.9.7m   X86-64
256    aes-192 cbc  107901.61k  0.9.8n   IA32
256    aes-192 cbc  110372.44k  0.9.8i   IA32
256    aes-192 cbc  110426.45k  0.9.8g   IA32
256    aes-192 cbc  150373.21k  0.9.8n   X86-64
256    aes-192 cbc  152013.91k  0.9.8i   X86-64

256    aes-256 cbc   38578.09k  1.0.0    IA32
256    aes-256 cbc   49249.88k  0.9.7m   IA32
256    aes-256 cbc   53894.49k  1.0.0    X86-64
256    aes-256 cbc   82765.23k  0.9.8g   X86-64
256    aes-256 cbc   93133.91k  0.9.7m   X86-64
256    aes-256 cbc   94662.91k  0.9.8n   IA32
256    aes-256 cbc   95682.39k  0.9.8g   IA32
256    aes-256 cbc   95718.31k  0.9.8i   IA32
256    aes-256 cbc  133965.57k  0.9.8n   X86-64
256    aes-256 cbc  134034.09k  0.9.8i   X86-64

1024   aes-128 cbc   65167.36k  0.9.7m   IA32
1024   aes-128 cbc  102270.63k  0.9.8g   X86-64
1024   aes-128 cbc  117155.13k  0.9.7m   X86-64
1024   aes-128 cbc  129688.23k  0.9.8g   IA32
1024   aes-128 cbc  130639.87k  1.0.0    IA32
1024   aes-128 cbc  130673.32k  0.9.8n   IA32
1024   aes-128 cbc  130711.89k  0.9.8i   IA32
1024   aes-128 cbc  182214.66k  0.9.8i   X86-64
1024   aes-128 cbc  182685.01k  0.9.8n   X86-64
1024   aes-128 cbc  182722.56k  1.0.0    X86-64

1024   aes-192 cbc   56220.68k  0.9.7m   IA32
1024   aes-192 cbc   92319.40k  0.9.8g   X86-64
1024   aes-192 cbc  103504.21k  0.9.7m   X86-64
1024   aes-192 cbc  111468.20k  0.9.8n   IA32
1024   aes-192 cbc  111897.60k  1.0.0    IA32
1024   aes-192 cbc  112902.49k  0.9.8g   IA32
1024   aes-192 cbc  112948.57k  0.9.8i   IA32
1024   aes-192 cbc  154645.85k  0.9.8n   X86-64
1024   aes-192 cbc  156428.97k  1.0.0    X86-64
1024   aes-192 cbc  157285.37k  0.9.8i   X86-64

1024   aes-256 cbc   49435.31k  0.9.7m   IA32
1024   aes-256 cbc   83221.85k  0.9.8g   X86-64
1024   aes-256 cbc   94171.36k  0.9.7m   X86-64
1024   aes-256 cbc   97642.50k  0.9.8g   IA32
1024   aes-256 cbc   98040.15k  0.9.8n   IA32
1024   aes-256 cbc   98308.44k  0.9.8i   IA32
1024   aes-256 cbc   98555.56k  1.0.0    IA32
1024   aes-256 cbc  136102.91k  1.0.0    X86-64
1024   aes-256 cbc  136150.02k  0.9.8n   X86-64
1024   aes-256 cbc  137167.53k  0.9.8i   X86-64

8192   aes-128 cbc   65213.78k  0.9.7m   IA32
8192   aes-128 cbc  102479.19k  0.9.8g   X86-64
8192   aes-128 cbc  117910.00k  0.9.7m   X86-64
8192   aes-128 cbc  131528.02k  0.9.8n   IA32
8192   aes-128 cbc  131552.60k  0.9.8g   IA32
8192   aes-128 cbc  131555.33k  1.0.0    IA32
8192   aes-128 cbc  131601.75k  0.9.8i   IA32
8192   aes-128 cbc  184802.20k  0.9.8i   X86-64
8192   aes-128 cbc  185281.19k  0.9.8n   X86-64
8192   aes-128 cbc  185476.20k  1.0.0    X86-64

8192   aes-192 cbc   56011.43k  0.9.7m   IA32
8192   aes-192 cbc   92312.92k  0.9.8g   X86-64
8192   aes-192 cbc  103623.34k  0.9.7m   X86-64
8192   aes-192 cbc  112574.46k  0.9.8n   IA32
8192   aes-192 cbc  112765.61k  1.0.0    IA32
8192   aes-192 cbc  113751.38k  0.9.8g   IA32
8192   aes-192 cbc  113879.76k  0.9.8i   IA32
8192   aes-192 cbc  156132.94k  0.9.8n   X86-64
8192   aes-192 cbc  158081.02k  0.9.8i   X86-64
8192   aes-192 cbc  158291.29k  1.0.0    X86-64

8192   aes-256 cbc   49504.26k  0.9.7m   IA32
8192   aes-256 cbc   83320.83k  0.9.8g   X86-64
8192   aes-256 cbc   94254.42k  0.9.7m   X86-64
8192   aes-256 cbc   98284.89k  0.9.8g   IA32
8192   aes-256 cbc   99057.66k  0.9.8n   IA32
8192   aes-256 cbc   99098.62k  1.0.0    IA32
8192   aes-256 cbc   99172.35k  0.9.8i   IA32
8192   aes-256 cbc  137052.16k  0.9.8n   X86-64
8192   aes-256 cbc  137783.98k  0.9.8i   X86-64
8192   aes-256 cbc  137907.80k  1.0.0    X86-64

32-bit build compiler switches:
-------------------------------

=== OpenSSL 0.9.7m 23 Feb 2007 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_HW -DOPENSSL_NO_RC5 -DOPENSSL_NO_IDEA -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -m486 -Wall -DSHA1_ASM -DMD5_ASM -DRMD160_ASM

=== OpenSSL 0.9.8g 19 Oct 2007 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM

=== OpenSSL 0.9.8i 15 Sep 2008 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM

=== OpenSSL 0.9.8n 24 Mar 2010 ===
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DSHA1_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM

=== OpenSSL 1.0.0 29 Mar 2010 ===
options:bn(64,32) rc4(4x,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m32 -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DWHIRLPOOL_ASM

64-bit build compiler switches:
-------------------------------

=== OpenSSL 0.9.7m 23 Feb 2007 ===
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_HW -DOPENSSL_NO_RC5 -DOPENSSL_NO_IDEA -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int

=== OpenSSL 0.9.8g 19 Oct 2007 ===
options:bn(64,64) md2(int) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DMD5_ASM

=== OpenSSL 0.9.8i 15 Sep 2008 ===
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) blowfish(ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM

=== OpenSSL 0.9.8n 24 Mar 2010 ===
options:bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM

=== OpenSSL 1.0.0 29 Mar 2010 ===
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM

--
Viktor.
______________________________________________________________________ OpenSSL Project http://www.openssl.org User Support Mailing List openssl-users@openssl.org Automated List Manager majord...@openssl.org