Re: SHA-256 implementation improvement

Andy Polyakov Wed, 30 May 2012 10:27:07 -0700

version 11/05/2015:
sha256 39017.64k 87648.54k 150106.58k 183705.94k 197330.99k
version 1.8:
sha256 33560.42k 73153.83k 121472.43k 167948.67k 180955.23k
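
To make the gap between the two runs easier to read, here is a minimal sketch in C that turns the quoted figures into per-column percentage differences. The array and function names are hypothetical, and it assumes the columns are in the same order in both runs. By this arithmetic the first three columns differ by well over 10%, and the last two by roughly 9%.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_COLUMNS 5

/* Figures copied from the two quoted runs ("k" = 1000s of bytes per second).
 * Array names are hypothetical labels, not anything from OpenSSL. */
const double run_dated[NUM_COLUMNS] = {
    39017.64, 87648.54, 150106.58, 183705.94, 197330.99
};
const double run_1_8[NUM_COLUMNS] = {
    33560.42, 73153.83, 121472.43, 167948.67, 180955.23
};

/* Relative gain of `other` over `base`, in percent. */
double speedup_percent(double base, double other)
{
    assert(base > 0.0);
    return (other / base - 1.0) * 100.0;
}

/* Fill out[i] with the gain of the dated run over the 1.8 run per column. */
void column_speedups(double out[NUM_COLUMNS])
{
    for (size_t i = 0; i < NUM_COLUMNS; i++)
        out[i] = speedup_percent(run_1_8[i], run_dated[i]);
}
```

Calling `column_speedups()` on a five-element array gives one percentage per column, which can then be compared against the per-CPU ratios discussed below.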


It sounds like we're talking about Nehalem, as it's very close to the
difference reported by Pavel:

i5 Lynnfield       1250 / 1426 / 1271 / 1121 / 1033

                                          1100

Indeed, you observe ~8% difference and above difference

It occurred to me that you might also be referring to the bigger than 8%
difference for blocks shorter than 1KB. While it looks good in a specific
benchmark, a fully unrolled loop can hurt overall performance, because it's
likely to evict other code from cache. I mean, in real life you don't do
just SHA256 and nothing else, do you? For the moment the fully unrolled
loop is taken for inputs larger than 1KB. The limit was more or less
arbitrarily chosen, but the intention is to eventually quantify the cost of
bringing code into cache and adjust the value accordingly.
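
The length-based choice described above could be sketched in C roughly as follows. This is only an illustration, not the actual OpenSSL code (where the selection lives in the assembly module): the names `UNROLL_THRESHOLD`, `sha256_path`, and `choose_sha256_path` are hypothetical, and treating "larger than 1KB" as a strict `>` comparison against 1024 bytes is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cut-off in bytes, taken from the "larger than 1KB" remark.
 * The message calls the value more or less arbitrary, so it is kept
 * as a single tunable constant. */
#define UNROLL_THRESHOLD 1024

/* Hypothetical labels for the two code paths. */
enum sha256_path {
    PATH_COMPACT_LOOP,   /* small code footprint, friendlier to the cache */
    PATH_FULLY_UNROLLED  /* faster on long inputs, but much larger code */
};

/* Pick a code path from the input length alone. */
enum sha256_path choose_sha256_path(size_t len)
{
    return len > UNROLL_THRESHOLD ? PATH_FULLY_UNROLLED : PATH_COMPACT_LOOP;
}

/* Small self-check of the boundary behaviour under the assumptions above. */
void choose_sha256_path_selfcheck(void)
{
    assert(choose_sha256_path(64) == PATH_COMPACT_LOOP);
    assert(choose_sha256_path(UNROLL_THRESHOLD) == PATH_COMPACT_LOOP);
    assert(choose_sha256_path(UNROLL_THRESHOLD + 1) == PATH_FULLY_UNROLLED);
}
```

Keeping the threshold in one constant matches the stated intention of adjusting the value later, once the cost of bringing the unrolled code into cache has been measured.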

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
