The alignment of the performance results in my previous message did not
come out right; my apologies.  Please find my performance results
spreadsheet attached.

Regards,
Ashley Lai 

On Wed, 2012-04-18 at 18:52 -0500, Ashley Lai wrote:
> The not-taken branch hint in the assembly code causes performance
> degradation because the hardware then always predicts that specific
> branch as not taken.  The branch hint is not necessary, as the hardware
> prediction is very good and getting better.  The attached patch removes
> the branch hint to let the hardware do the prediction.
> 
> To see the performance improvements, build with -mcpu=power7 (or the
> value matching the hardware it's running on), since the hints may get
> ignored if the compiler defaults to targeting an older version of the
> hardware (Power4).
> 
> Below are the performance results built with -mcpu=power7.  A positive
> number shows the percentage performance improvement after the branch
> hint is removed.  The performance test ran "openssl speed" and then
> calculated the percentage from the results with the branch hint removed
> and the results from the base (with branch hint).
> 
> Percentage=(withoutHint/withHint) * 100 - 100
> 
> sha512 shows a performance improvement of up to 32%.  sha256, sha1, md4,
> and md5 also benefit from this change.  There are some negative numbers,
> but they are very small (less than 1% in magnitude).
> 
> type           16bytes   64bytes   256bytes   1024bytes   8192bytes
> mdc2           1.57      0.43      0.07       0.03        0.01
> md4            6.6       6.6       4.47       2.5         0.35
> md5            6.9       5.65      3.68       1.44        0.23
> hmac(md5)      0.35      0.01      0.42       0.12        0
> sha1           7.29      6.46      4.35       2.21        0.42
> sha256         18.35     10.99     5.05       1.64        0.24
> sha512         31.85     32.08     13.72      4.95        0.67
> whirlpool      0.69      0.66      0.44       0.33        0.31
> rmd160         6.61      4.96      3.08       1.32        0.36
> rc4            -0.01     -0.02     -0.19      -0.14       -0.22
> descbc         0.04      -0        0.02       0.07        0.04
> desede3        -0.02     0.01      0          -0          0.01
> aes-128        0.05      -0        -0         0           0.01
> aes-192        0.04      -0.01     -0         0.01        0
> aes-256        0.08      -0.01     0          0.01        0.01
> aes-128        0.02      0.03      -0.01      -0.01       -0.1
> aes-192        0         0.01      0.02       0.01        -0.08
> aes-256        0         0.02      0.01       -0          -0.07
> ghash          0.51      0.36      0.06       0.02        -0.01
> camellia-128   -0.34     -0.03     -0.02      -0.18       -0.69
> camellia-192   -0.26     0.22      0.11       0.07        -0.26
> camellia-256   -0.23     0.03      -0.01      -0.04       -0.32
> idea           0.18      0.07      0.02       0           0.02
> seed           0.02      0.06      -0.04      0.04        0.06
> rc2            0.04      0         -0         0           0.01
> blowfish       0.26      0.09      -0.09      -0.04       -0
> cast           0.14      0.04      0.02       0.01        0
> 
> Please let me know if you have any questions.
> 
> Thanks,
> Ashley Lai
> 
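
For anyone who wants to reproduce the percentages from their own
"openssl speed" runs, the formula quoted above can be sketched in a few
lines of Python.  This is only an illustration: the function name and
the throughput figures are made up for the example and are not taken
from the attached spreadsheet.

```python
def improvement_percent(without_hint: float, with_hint: float) -> float:
    """Percentage = (withoutHint / withHint) * 100 - 100.

    Both arguments are throughput figures for the same algorithm and
    block size; a positive result means the build without the branch
    hint is faster.
    """
    if with_hint <= 0:
        raise ValueError("with_hint must be a positive throughput value")
    return (without_hint / with_hint) * 100 - 100


# Hypothetical throughput figures (same unit for both), for illustration only.
base = 200.0      # built with the branch hint
patched = 250.0   # built with the branch hint removed

print(f"{improvement_percent(patched, base):.2f}%")  # prints "25.00%"
```

A result of 0 means no change, and a small negative value (as in a few
rows of the table) means the build without the hint was marginally
slower for that case.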


Attachment: opensslPerfRmHint.ods
Description: application/vnd.oasis.opendocument.spreadsheet
