I ran into some unexpected timing issues while using pyca/cryptography’s hashes.SHA256, and I’m wondering whether something is wrong with the timing discrepancy I see between two different hashing approaches.
When I hash a single byte string of 10 million bytes, it seems to take 2-3 orders of magnitude less time than when I loop over the bytes and hash them one by one. Please look at the following bare-bones snippet:

---
from __future__ import absolute_import, division, print_function
import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend

# Two independent SHA-256 contexts that start from the same (empty) state
d1 = hashes.Hash(algorithm=hashes.SHA256(), backend=default_backend())
d2 = d1.copy()

n = 10000000
print('n:', n)

# The same data, once as a single byte and once as one n-byte string
b = b'a'
ba = bytearray(n * b)
bs = bytes(ba)

# Approach 1: hash all n bytes with a single update() call
s = time.time()
d1.update(bs)
t = time.time() - s
print('ba: ', t)
print(d1.finalize())

# Approach 2: hash the same n bytes with one update() call per byte
s = time.time()
for i in range(n):
    d2.update(b)
t = time.time() - s
print('b: ', t)
print(d2.finalize())
---

The output is:

---
/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/bin/python3.5 /Users/franksiebenlist/git/pyvate23/src/pyvate/messagedigest_tst.py
n: 10000000
ba: 0.027185916900634766
b'\x01\xf4\xa8|\x04\xb4\n\xf5\x9a\xad\xc0\xe8\x12)5\tp\x9c\x9a\x87c\xa6\x0b\x7f\x9e\x1903"\xf8\xb0<'
b: 15.677960872650146
b'\x01\xf4\xa8|\x04\xb4\n\xf5\x9a\xad\xc0\xe8\x12)5\tp\x9c\x9a\x87c\xa6\x0b\x7f\x9e\x1903"\xf8\xb0<'

Process finished with exit code 0
---

Results for Python 2 and 3 are similar.

I understand that there may be a few more object creations and casts involved in the looping, but roughly 500 times slower... that was an unexpected surprise.

Comments? Observations?

Thanks, Frank.

_______________________________________________
Cryptography-dev mailing list
Cryptography-dev@python.org
https://mail.python.org/mailman/listinfo/cryptography-dev
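One way to narrow this down might be to vary the number of bytes passed per update() call and watch how the total time changes. The following is only a minimal sketch under an assumption that the post does not confirm: that the gap comes mostly from a fixed per-call cost of update() rather than from the SHA-256 computation itself. It uses the standard library's hashlib instead of pyca/cryptography so that it runs without extra dependencies, so the absolute numbers will differ from the ones above. The helper name hash_in_chunks, the chunk sizes, and the smaller n are hypothetical choices made for illustration.

```python
import hashlib
import time


def hash_in_chunks(data, chunk_size):
    """Feed `data` to SHA-256 in pieces of `chunk_size` bytes.

    Returns a (digest, elapsed_seconds) tuple. hashlib is used here as a
    stand-in for pyca/cryptography (an assumption made for portability).
    """
    h = hashlib.sha256()
    view = memoryview(data)  # slicing a memoryview avoids copying each chunk
    start = time.perf_counter()
    for offset in range(0, len(view), chunk_size):
        h.update(view[offset:offset + chunk_size])
    elapsed = time.perf_counter() - start
    return h.digest(), elapsed


if __name__ == "__main__":
    n = 1000000  # smaller than the 10 million above so the 1-byte case stays quick
    data = b"a" * n
    for chunk_size in (1, 64, 4096, n):
        digest, elapsed = hash_in_chunks(data, chunk_size)
        # The digest prefix should be identical on every row; only the time changes.
        print("chunk size %8d: %.4f s  %s" % (chunk_size, elapsed, digest.hex()[:16]))
```

If the time drops sharply as the chunk size grows while the digest stays the same, that would point to the per-call cost rather than the hashing itself. The same loop could be pointed at hashes.Hash from the snippet above to see whether it shows the same pattern.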