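Hello,

As previously noted on this mailing list, the AES performance (without AES-NI) of v1.0.0 on Intel Westmere chips seems a bit slow: in the figures below, AES-CBC throughput at the larger block sizes is roughly half that of 0.9.8n on the same machine. In addition, RC4 seems a bit slow compared to previous Intel chips.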
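To put rough numbers on both observations, here is a minimal Python sketch that compares the 8192-byte figures from the speed output in this message. The dictionary names and the relative_throughput helper are illustrative assumptions, not part of OpenSSL; only the throughput values are taken from the output.

```python
# 8192-byte throughput in 1000s of bytes per second, copied from the
# `openssl speed` output in this message. Names are illustrative only.
WESTMERE_098N = {
    "rc4": 293093.38,
    "aes-128 cbc": 197528.23,
    "aes-192 cbc": 166857.39,
    "aes-256 cbc": 143720.45,
}
WESTMERE_100 = {
    "rc4": 292855.81,
    "aes-128 cbc": 94836.05,
    "aes-192 cbc": 79207.60,
    "aes-256 cbc": 67908.67,
}
OLDER_CHIP_100 = {
    "rc4": 424370.09,
}


def relative_throughput(candidate, baseline):
    """Return candidate/baseline throughput for every algorithm in both sets."""
    return {
        name: candidate[name] / baseline[name]
        for name in candidate
        if name in baseline
    }


if __name__ == "__main__":
    print("1.0.0 vs 0.9.8n on Westmere:")
    for name, r in relative_throughput(WESTMERE_100, WESTMERE_098N).items():
        print(f"  {name:<12} {r:.2f}")

    print("Westmere vs older chip, both 1.0.0:")
    for name, r in relative_throughput(WESTMERE_100, OLDER_CHIP_100).items():
        print(f"  {name:<12} {r:.2f}")
```

With these inputs the AES-CBC ratios come out a little under one half, RC4 is essentially unchanged between 0.9.8n and 1.0.0 on the Westmere, and RC4 on the Westmere runs at roughly 0.69 of the older chip's rate.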
I've included below the speed output for several versions of OpenSSL for comparison. For simplicity, I chose a subset of the more interesting algorithms. All tests were done on an Intel Westmere running at 3.0 GHz. For reference, the output from cpuid.c is below:

0000000b:756e6547:6c65746e:49656e69
000206c2:00200800:029ee3ff:bfebfbff
3c004121:01c0003f:0000003f:00000000

OpenSSL 0.9.8n 24 Mar 2010
built on: Tue Mar 30 17:29:17 PDT 2010
options:bn(64,64) md2(int) rc4(1x,char) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              40045.37k   125791.36k   304323.58k   470531.07k   560005.12k
sha1             37855.97k   108750.44k   234854.06k   330731.18k   375278.25k
rc4             319218.02k   344750.72k   291458.47k   292768.77k   293093.38k
blowfish cbc     99528.55k   103911.65k   105671.68k   105717.08k   105851.56k
aes-128 cbc     132494.65k   176655.57k   191974.14k   196217.86k   197528.23k
aes-192 cbc     116965.39k   150690.07k   162185.98k   165767.17k   166857.39k
aes-256 cbc     105400.30k   131841.28k   140635.48k   143151.79k   143720.45k

OpenSSL 1.0.0 29 Mar 2010
built on: Mon Mar 29 10:24:52 PDT 2010
options:bn(64,64) rc4(1x,char) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              25110.83k    88038.31k   245494.95k   442503.85k   578172.25k
sha1             26626.11k    83364.31k   201984.26k   312384.17k   372091.56k
rc4             314936.59k   342759.70k   290525.35k   293123.94k   292855.81k
blowfish cbc     99449.09k   104068.71k   105179.99k   105560.06k   105769.64k
aes-128 cbc      85512.84k    92059.22k    93251.07k    94923.43k    94836.05k
aes-192 cbc      72161.34k    77346.69k    78668.03k    78866.09k    79207.60k
aes-256 cbc      62707.46k    66420.74k    67309.91k    67600.73k    67908.67k

OpenSSL 1.1.0-dev xx XXX xxxx
built on: Tue Apr 20 16:50:15 PDT 2010
options:bn(64,64) rc4(1x,char) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              27442.73k    93375.47k   255399.94k   450797.23k   580198.40k
sha1             28238.43k    87491.35k   214036.31k   335593.23k   399572.99k
rc4             312835.40k   341760.87k   290699.35k   292358.14k   292571.82k
blowfish cbc     99555.57k   103852.71k   105283.07k   105647.79k   105690.45k
aes-128 cbc      85725.71k    91975.21k    93809.92k    94533.97k    94857.33k
aes-192 cbc      72206.59k    76802.57k    78558.72k    78909.78k    78963.76k
aes-256 cbc      62567.38k    66193.28k    67344.30k    67723.26k    67758.76k

(The 1.1 version is from the 20100420 snapshot.)

For comparison, here is the 1.0.0 speed output for a somewhat older 3.0 GHz Intel chip.

OpenSSL 1.0.0 29 Mar 2010
built on: Mon Mar 29 10:24:52 PDT 2010
options:bn(64,64) rc4(1x,char) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DWHIRLPOOL_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              27344.09k    94055.85k   249754.62k   428658.51k   539028.05k
sha1             21816.47k    80017.71k   207663.70k   342875.14k   428610.35k
rc4             343765.07k   396592.60k   376180.05k   386428.59k   424370.09k
blowfish cbc    102917.62k   108506.22k   109450.89k   109555.71k   109770.07k
aes-128 cbc      87576.06k    96398.03k    98596.52k   207228.25k   210130.26k
aes-192 cbc      74433.59k    80153.79k    82086.91k   175156.05k   177280.34k
aes-256 cbc      64822.50k    69178.15k    70595.58k   150416.73k   151677.19k

Note the difference in the RC4 performance between these two systems which are both nominally running at 3.0 GHz.

--
Iain Morgan
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org