[openssl.org #2794] [PATCH] Remove branch hint to improve crypto algorithms performance on Power

2012-04-20 Thread Ashley via RT
The not-taken branch hint in the assembly code degrades performance
because the hardware always predicts the branch the way the hint says.
The branch hint is not necessary, as the hardware prediction is very good
and getting better.  The attached patch removes the branch hint to let
the hardware do the prediction.

To see the performance improvement, build with -mcpu=power7 (or whatever
matches the hardware you are running on), since the hints may be ignored
if the compiler defaults to targeting an older version of the hardware
(Power4).

Below are the performance results for a build with -mcpu=power7.  A
positive number is the percentage improvement after the branch hint
is removed.  The test ran openssl speed and then calculated the
percentage from the results with the branch hint removed and the
results from the base (with the branch hint).

Percentage=(withoutHint/withHint) * 100 - 100
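
As a quick check of the formula, here is a minimal shell sketch.  The
pct_change helper and the throughput figures are hypothetical (they are
not taken from the measurements in this message), and awk is assumed to
be available.

```sh
# pct_change WITHOUT_HINT WITH_HINT
# Prints (withoutHint / withHint) * 100 - 100, rounded to two decimals.
pct_change() {
    awk -v new="$1" -v base="$2" \
        'BEGIN { printf "%.2f\n", (new / base) * 100 - 100 }'
}

# Hypothetical throughput figures, for illustration only.
pct_change 132000 100000   # positive: faster without the hint
pct_change 99000 100000    # negative: slower without the hint
```

A positive result means the build without the hint was faster than the
base; a negative result means it was slower.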

sha512 shows a 32% performance improvement.  sha256, sha1, md4, and md5
also benefit from this change.  There are some negative numbers, but they
are very small (less than 1%).

type            16bytes   64bytes   256bytes   1024bytes   8192bytes
mdc2            1.57      0.43      0.07       0.03        0.01
md4             6.6       6.6       4.47       2.5         0.35
md5             6.9       5.65      3.68       1.44        0.23
hmac(md5)       0.35      0.01      0.42       0.12        0
sha1            7.29      6.46      4.35       2.21        0.42
sha256          18.35     10.99     5.05       1.64        0.24
sha512          31.85     32.08     13.72      4.95        0.67
whirlpool       0.69      0.66      0.44       0.33        0.31
rmd160          6.61      4.96      3.08       1.32        0.36
rc4             -0.01     -0.02     -0.19      -0.14       -0.22
descbc          0.04      -0        0.02       0.07        0.04
desede3         -0.02     0.01      0          -0          0.01
aes-128         0.05      -0        -0         0           0.01
aes-192         0.04      -0.01     -0         0.01        0
aes-256         0.08      -0.01     0          0.01        0.01
aes-128         0.02      0.03      -0.01      -0.01       -0.1
aes-192         0         0.01      0.02       0.01        -0.08
aes-256         0         0.02      0.01       -0          -0.07
ghash           0.51      0.36      0.06       0.02        -0.01
camellia-128    -0.34     -0.03     -0.02      -0.18       -0.69
camellia-192    -0.26     0.22      0.11       0.07        -0.26
camellia-256    -0.23     0.03      -0.01      -0.04       -0.32
idea            0.18      0.07      0.02       0           0.02
seed            0.02      0.06      -0.04      0.04        0.06
rc2             0.04      0         -0         0           0.01
blowfish        0.26      0.09      -0.09      -0.04       -0
cast            0.14      0.04      0.02       0.01        0

Please let me know if you have any questions.

Thanks,
Ashley Lai


diff -ur openssl-1.0.1/crypto/ppccpuid.pl openssl-1.0.1-wp/crypto/ppccpuid.pl
--- openssl-1.0.1/crypto/ppccpuid.pl	2011-11-14 14:52:33.0 -0600
+++ openssl-1.0.1-wp/crypto/ppccpuid.pl	2012-04-18 16:47:53.098711478 -0500
@@ -105,7 +105,7 @@
 Little:	mtctr	r4
 	stb	r0,0(r3)
 	addi	r3,r3,1
-	bdnz-	\$-8
+	bdnz	\$-8
 	blr
 Lot:	andi.	r5,r3,3
 	beq	Laligned
@@ -118,7 +118,7 @@
 	mtctr	r5
 	stw	r0,0(r3)
 	addi	r3,r3,4
-	bdnz-	\$-8
+	bdnz	\$-8
 	andi.	r4,r4,3
 	bne	Little
 	blr


Re: [openssl.org #2794] [PATCH] Remove branch hint to improve crypto algorithms performance on Power

2012-04-20 Thread Ashley via RT
The alignment of the performance results in my previous message did not
come out right; my apologies.  Please find my performance results
spreadsheet attached.

Regards,
Ashley Lai 

On Wed, 2012-04-18 at 18:52 -0500, Ashley Lai wrote:
 [...]

opensslPerfRmHint.ods
Description: application/vnd.oasis.opendocument.spreadsheet


[openssl.org #1957] OpenSSL 0.9.8k Solaris build failure in apps; Makefile variables not quoted

2009-06-16 Thread Mark Ashley via RT
FIPSLD_CC and CC need to be quoted, probably in more Makefiles than this
one to be safe, but certainly in this one to allow the build to
complete.

This is required because CC is 'cc -m64 -xcode=pic32 -w'.  Without the
quotes, only the initial 'cc' part is assigned, and the shell that
make(1) runs then tries to interpret the remaining arguments
'-m64 -xcode=pic32 -w' as a command.
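
The effect can be reproduced outside of make with a minimal shell
sketch.  It assumes that make pastes the value of CC into the recipe
line as plain text before handing the line to sh; the CC_VALUE variable
name is hypothetical, and its value is the one quoted in this report.

```sh
# Value of CC as given in this report.
CC_VALUE='cc -m64 -xcode=pic32 -w'

# Unquoted, as the recipe line looks after make has expanded the variable:
# sh assigns only "cc" for the duration of one command and then tries to
# run "-m64" as that command, which fails.
unquoted=$(sh -c "FIPSLD_CC=$CC_VALUE; printf '%s' \"\$FIPSLD_CC\"" 2>/dev/null)

# Quoted: the whole string becomes the value of FIPSLD_CC.
quoted=$(sh -c "FIPSLD_CC=\"$CC_VALUE\"; printf '%s' \"\$FIPSLD_CC\"")

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

Only the quoted form carries the complete compiler command line through
to FIPSLD_CC.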


openssl-0.9.8k/apps root# diff Makefile Makefile.orig
156c156
< FIPSLD_CC="$(CC)"; CC=$(TOP)/fips/fipsld; export CC FIPSLD_CC; \
---
> FIPSLD_CC=$(CC); CC=$(TOP)/fips/fipsld; export CC FIPSLD_CC; \
161c161
<   CC="$${CC}" APPNAME=$(EXE) OBJECTS=$(PROGRAM).o $(E_OBJ) \
---
>   CC=$${CC} APPNAME=$(EXE) OBJECTS=$(PROGRAM).o $(E_OBJ) \

__
OpenSSL Project http://www.openssl.org
Development Mailing List   openssl-dev@openssl.org
Automated List Manager   majord...@openssl.org