You can reuse a Crypto++ SHA-256 object after you call Final() on it. No need to allocate a new object for each digest. That's probably causing most of the overhead, if Brian's benchmark is calling SHA256_new() every time.
If you do reuse SHA-256 objects, I see no reason why the per-digest overhead would be much higher than OpenSSL's.

--------------------------------------------------
From: "Zooko O'Whielacronx" <[email protected]>
Sent: Wednesday, September 01, 2010 9:40 PM
To: "Crypto++ Users" <[email protected]>
Subject: pycryptopp/Crypto++'s SHA-256 has slightly larger setup overhead than hashlib/OpenSSL's
Dear Wei Dai et al.:

Wei Dai has done a great job of optimizing Crypto++, at least for cycles per byte in the long run as the measurement. There's an old saying "What gets measured gets improved." Crypto++'s implementation of SHA-256 has excellent cycles per byte in the long run, but what really matters is total time for a specific task, which equals constant startup overhead + bytes * cycles-per-byte.

Brian Warner recently benchmarked Crypto++ v5.6.0's implementation of SHA-256 vs. OpenSSL's, both as wrapped by Python wrappers. His results and his benchmarking script are posted here:

http://tahoe-lafs.org/pipermail/tahoe-dev/2010-August/004948.html

Here's his bottom line:

"""
So at 1 MiB, hashlib/openssl gets 120MBps, while pycryptopp/Crypto++ gets 126MBps (about 5% faster). But hashlib/openssl has lower startup time (2.5us vs 14us).
...
The large hashes will take about 8.962s for hashlib and 8.732s for pycryptopp, and the small ones will be about 90ms for hashlib vs 472ms for pycryptopp. So we can expect the total to be about 9.051s for hashlib and 9.205 for pycryptopp. In other words: not a significant difference, at least for large files.
"""

Some of this could be due to pycryptopp instead of due to Crypto++ itself.
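To make the "startup overhead + bytes * per-byte cost" model concrete, here is a rough Python sketch that plugs in the figures quoted from Brian's summary. It is only an illustration under stated assumptions: the function and variable names are made up for this example, "MBps" is taken to mean 10**6 bytes per second, and the 1 MiB throughput numbers are treated as pure per-byte rates even though they already include a little startup time.

```python
# Cost model from the message: total time = startup overhead + bytes * per-byte cost.
# Startup and throughput figures are the ones quoted from Brian Warner's summary.
# Assumption: "MBps" means 10**6 bytes per second.

HASHLIB = {"startup_s": 2.5e-6, "bytes_per_s": 120e6}
PYCRYPTOPP = {"startup_s": 14e-6, "bytes_per_s": 126e6}


def total_seconds(n_bytes, startup_s, bytes_per_s):
    """Estimated time to hash one message of n_bytes under the linear model."""
    return startup_s + n_bytes / bytes_per_s


def crossover_bytes(a, b):
    """Message size at which libraries a and b are predicted to take equal time.

    Assumes a has the lower startup cost and b the higher throughput.
    """
    per_byte_a = 1.0 / a["bytes_per_s"]
    per_byte_b = 1.0 / b["bytes_per_s"]
    return (b["startup_s"] - a["startup_s"]) / (per_byte_a - per_byte_b)


if __name__ == "__main__":
    n = crossover_bytes(HASHLIB, PYCRYPTOPP)
    print("predicted break-even message size: %.0f bytes" % n)
    for size in (64, 1024, 1024 * 1024):
        print(size,
              total_seconds(size, **HASHLIB),
              total_seconds(size, **PYCRYPTOPP))
```

Under these assumptions the model puts the break-even point at roughly 29 KB per message: below that size the lower startup cost dominates, above it the higher throughput does.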
When you instantiate a pycryptopp sha256 object it executes this C++ code:

SHA256_new()
http://tahoe-lafs.org/trac/pycryptopp/browser/trunk/pycryptopp/hash/sha256module.cpp?rev=702#L97

then this:

SHA256_init()
http://tahoe-lafs.org/trac/pycryptopp/browser/trunk/pycryptopp/hash/sha256module.cpp?rev=702#L116

Then when you hash some data it executes this:

SHA256_update()
http://tahoe-lafs.org/trac/pycryptopp/browser/trunk/pycryptopp/hash/sha256module.cpp?rev=702#L35

and when you ask for the digest it executes this:

SHA256_digest()
http://tahoe-lafs.org/trac/pycryptopp/browser/trunk/pycryptopp/hash/sha256module.cpp?rev=702#L52

Does anyone see ways I could optimize the setup overhead of this C++ code which could possibly account for a significant portion of the 11.5 us difference between pycryptopp/Crypto++ and hashlib/OpenSSL?

Thanks!

Regards,

Zooko

--
You received this message because you are subscribed to the "Crypto++ Users" Google Group. To unsubscribe, send an email to [email protected]. More information about Crypto++ and this group is available at http://www.cryptopp.com.
