[openssl.org #2794] [PATCH] Remove branch hint to improve crypto algorithms performance on Power
The not-taken branch hint in the assembly code causes performance degradation because the hardware then always predicts that branch as not taken. The branch hint is not necessary, as hardware prediction is very good and getting better. The attached patch removes the branch hint to let the hardware do the prediction.

To see the performance improvement, build with -mcpu=power7 (or whatever hardware you are running on), since the hints may be ignored if the compiler defaults to targeting an older version of the hardware (Power4).

Below are the performance results for a build with -mcpu=power7. A positive number is the percentage improvement after the branch hint is removed. The test ran openssl speed and then calculated the percentage from the results with the branch hint removed and the results from the base (with branch hint):

Percentage = (withoutHint / withHint) * 100 - 100

sha512 shows a 32% performance improvement. sha256, sha1, md4, and md5 also benefit from this change. There are some negative numbers, but they are very small (less than 1%).

type             16bytes   64bytes  256bytes 1024bytes 8192bytes
mdc2                1.57      0.43      0.07      0.03      0.01
md4                  6.6       6.6      4.47       2.5      0.35
md5                  6.9      5.65      3.68      1.44      0.23
hmac(md5)           0.35      0.01      0.42      0.12         0
sha1                7.29      6.46      4.35      2.21      0.42
sha256             18.35     10.99      5.05      1.64      0.24
sha512             31.85     32.08     13.72      4.95      0.67
whirlpool           0.69      0.66      0.44      0.33      0.31
rmd160              6.61      4.96      3.08      1.32      0.36
rc4                -0.01     -0.02     -0.19     -0.14     -0.22
des cbc             0.04        -0      0.02      0.07      0.04
des ede3           -0.02      0.01         0        -0      0.01
aes-128             0.05        -0        -0         0      0.01
aes-192             0.04     -0.01        -0      0.01         0
aes-256             0.08     -0.01         0      0.01      0.01
aes-128             0.02      0.03     -0.01     -0.01      -0.1
aes-192                0      0.01      0.02      0.01     -0.08
aes-256                0      0.02      0.01        -0     -0.07
ghash               0.51      0.36      0.06      0.02     -0.01
camellia-128       -0.34     -0.03     -0.02     -0.18     -0.69
camellia-192       -0.26      0.22      0.11      0.07     -0.26
camellia-256       -0.23      0.03     -0.01     -0.04     -0.32
idea                0.18      0.07      0.02         0      0.02
seed                0.02      0.06     -0.04      0.04      0.06
rc2                 0.04         0        -0         0      0.01
blowfish            0.26      0.09     -0.09     -0.04        -0
cast                0.14      0.04      0.02      0.01         0

Please let me know if you have any questions.
Thanks,
Ashley Lai

diff -ur openssl-1.0.1/crypto/ppccpuid.pl openssl-1.0.1-wp/crypto/ppccpuid.pl
--- openssl-1.0.1/crypto/ppccpuid.pl	2011-11-14 14:52:33.0 -0600
+++ openssl-1.0.1-wp/crypto/ppccpuid.pl	2012-04-18 16:47:53.098711478 -0500
@@ -105,7 +105,7 @@
 Little:	mtctr	r4
 	stb	r0,0(r3)
 	addi	r3,r3,1
-	bdnz-	\$-8
+	bdnz	\$-8
 	blr
 Lot:	andi.	r5,r3,3
 	beq	Laligned
@@ -118,7 +118,7 @@
 	mtctr	r5
 	stw	r0,0(r3)
 	addi	r3,r3,4
-	bdnz-	\$-8
+	bdnz	\$-8
 	andi.	r4,r4,3
 	bne	Little
 	blr
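The change in the diff above amounts to dropping the '-' static-prediction suffix from a branch mnemonic. As a rough illustration only, the following Python sketch applies the same kind of rewrite to a line of assembly text. The helper name strip_branch_hints and the regular expression are hypothetical and are not part of OpenSSL or of the patch; the sketch assumes hints appear as a '+' or '-' directly after a lowercase branch mnemonic that is followed by whitespace.

```python
import re

# Hypothetical pattern: a branch mnemonic ("b" plus lowercase letters) that
# carries a '+' or '-' hint suffix and is followed by whitespace,
# e.g. "bdnz-" or "bne+". The mnemonic itself is captured in group 1.
_HINTED_BRANCH = re.compile(r"\b(b[a-z]+)[+-](?=\s)")


def strip_branch_hints(asm_line: str) -> str:
    """Return asm_line with any branch-hint suffix removed from mnemonics."""
    return _HINTED_BRANCH.sub(r"\1", asm_line)


# The hinted line from the patch loses its suffix; other lines are untouched.
print(repr(strip_branch_hints("\tbdnz-\t\\$-8")))
print(repr(strip_branch_hints("\tstb\tr0,0(r3)")))
```

A pattern like this could also be used just to list hinted branches for review before deciding which ones to change by hand.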
Re: [openssl.org #2794] [PATCH] Remove branch hint to improve crypto algorithms performance on Power
The alignment of the performance results did not come out right when I sent them, my apologies. Please find my performance results spreadsheet attached.

Regards,
Ashley Lai

On Wed, 2012-04-18 at 18:52 -0500, Ashley Lai wrote:
> The not-taken branch hint in the assembly code causes performance
> degradation because the hardware then always predicts that branch as
> not taken. The branch hint is not necessary, as hardware prediction is
> very good and getting better. The attached patch removes the branch
> hint to let the hardware do the prediction.
>
> To see the performance improvement, build with -mcpu=power7 (or
> whatever hardware you are running on), since the hints may be ignored
> if the compiler defaults to targeting an older version of the hardware
> (Power4).
>
> Below are the performance results for a build with -mcpu=power7. A
> positive number is the percentage improvement after the branch hint is
> removed. The test ran openssl speed and then calculated the percentage
> from the results with the branch hint removed and the results from the
> base (with branch hint):
>
> Percentage = (withoutHint / withHint) * 100 - 100
>
> sha512 shows a 32% performance improvement. sha256, sha1, md4, and md5
> also benefit from this change. There are some negative numbers, but
> they are very small (less than 1%).
>
> type             16bytes   64bytes  256bytes 1024bytes 8192bytes
> mdc2                1.57      0.43      0.07      0.03      0.01
> md4                  6.6       6.6      4.47       2.5      0.35
> md5                  6.9      5.65      3.68      1.44      0.23
> hmac(md5)           0.35      0.01      0.42      0.12         0
> sha1                7.29      6.46      4.35      2.21      0.42
> sha256             18.35     10.99      5.05      1.64      0.24
> sha512             31.85     32.08     13.72      4.95      0.67
> whirlpool           0.69      0.66      0.44      0.33      0.31
> rmd160              6.61      4.96      3.08      1.32      0.36
> rc4                -0.01     -0.02     -0.19     -0.14     -0.22
> des cbc             0.04        -0      0.02      0.07      0.04
> des ede3           -0.02      0.01         0        -0      0.01
> aes-128             0.05        -0        -0         0      0.01
> aes-192             0.04     -0.01        -0      0.01         0
> aes-256             0.08     -0.01         0      0.01      0.01
> aes-128             0.02      0.03     -0.01     -0.01      -0.1
> aes-192                0      0.01      0.02      0.01     -0.08
> aes-256                0      0.02      0.01        -0     -0.07
> ghash               0.51      0.36      0.06      0.02     -0.01
> camellia-128       -0.34     -0.03     -0.02     -0.18     -0.69
> camellia-192       -0.26      0.22      0.11      0.07     -0.26
> camellia-256       -0.23      0.03     -0.01     -0.04     -0.32
> idea                0.18      0.07      0.02         0      0.02
> seed                0.02      0.06     -0.04      0.04      0.06
> rc2                 0.04         0        -0         0      0.01
> blowfish            0.26      0.09     -0.09     -0.04        -0
> cast                0.14      0.04      0.02      0.01         0
>
> Please let me know if you have any questions.
>
> Thanks,
> Ashley Lai

opensslPerfRmHint.ods
Description: application/vnd.oasis.opendocument.spreadsheet
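The percentage formula quoted above can be sketched in a few lines of Python. The function name improvement_percent and the sample throughput figures are made up for illustration; they are not taken from the openssl speed runs reported in this thread.

```python
def improvement_percent(without_hint: float, with_hint: float) -> float:
    """Percentage = (withoutHint / withHint) * 100 - 100, as in the message.

    Both arguments are throughput figures for the same test, e.g. the
    bytes-per-second numbers reported by openssl speed.
    """
    if with_hint <= 0:
        raise ValueError("baseline throughput must be positive")
    return without_hint / with_hint * 100 - 100


# Hypothetical throughputs (not measured data): baseline build with the
# hint versus patched build without it.
baseline = 200000.0
patched = 250000.0
print(improvement_percent(patched, baseline))  # prints 25.0
```

A positive result means the build without the hint was faster; a result near zero, like most of the cipher rows in the table, means the change made no measurable difference.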
[openssl.org #1957] OpenSSL 0.9.8k Solaris build failure in apps; Makefile variables not quoted
FIPSLD_CC and CC need to be quoted, probably in more Makefiles than this one to be safe, but certainly in this one to allow the build to complete.

This is required because CC is

cc -m64 -xcode=pic32 -w

and make(1) will try to grok the CC arguments '-m64 -xcode=pic32 -w' after assigning the initial 'cc' part.

openssl-0.9.8k/apps root# diff Makefile Makefile.orig
156c156
< FIPSLD_CC=$(CC); CC=$(TOP)/fips/fipsld; export CC FIPSLD_CC; \
---
> FIPSLD_CC=$(CC); CC=$(TOP)/fips/fipsld; export CC FIPSLD_CC; \
161c161
< CC=$${CC} APPNAME=$(EXE) OBJECTS=$(PROGRAM).o $(E_OBJ) \
---
> CC=$${CC} APPNAME=$(EXE) OBJECTS=$(PROGRAM).o $(E_OBJ) \

__
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
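The word-splitting problem described in the report above can be illustrated with Python's shlex module, which approximates POSIX shell tokenization. This is only a sketch: the exact quoting used in the reporter's Makefile is not visible in the message, so the quoted form below is an assumption about the intended fix.

```python
import shlex

# CC value as reported in the message.
cc = "cc -m64 -xcode=pic32 -w"

# Unquoted: after $(CC) is expanded, the assignment only captures "cc" and
# the remaining flags become separate words on the command line.
unquoted = shlex.split(f"FIPSLD_CC={cc}")

# Quoted (assumed form of the fix): the whole value stays in one word.
quoted = shlex.split(f'FIPSLD_CC="{cc}"')

print(unquoted)  # ['FIPSLD_CC=cc', '-m64', '-xcode=pic32', '-w']
print(quoted)    # ['FIPSLD_CC=cc -m64 -xcode=pic32 -w']
```

With the default CC=cc or CC=gcc there is only one word, which would explain why the unquoted assignment goes unnoticed until CC carries flags.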