After I sent my message yesterday evening, I was also wondering about the 512-bit (64-byte) block size of SHA-256, and whether that would add to the observed slowness. The following output shows time in seconds as a function of byte-chunk size (1, 2, 8, 32, 64, 128, 256 bytes):

b: 12.111763954162598
b2: 5.806451082229614
b8: 1.4664850234985352
b32: 0.37551307678222656
b64: 0.20229697227478027
b128: 0.11141395568847656
b256: 0.06758689880371094
8388608 bs: 0.020879030227661133

Time goes down roughly in proportion to the chunk size, and there is no perceived "speed boost" when we cross the 64-byte threshold. Time seems to depend only on the number of Python-to-C calls.

And again, I can understand that the overhead is proportional to the number of Python-to-C calls, but it's the factor of 500 (2-3 orders of magnitude) that (unpleasantly) surprised me. It means one has to optimize the size of the byte string passed to update() when there are many bytes to hash. For example, if you read from a file or socket, don't update() one byte at a time while you read from the stream, but fill up a (big) buffer first and pass that buffer.

-Frank.

PS. I haven't looked at the sha256 C code, but I can imagine that when you pass update() one byte at a time, it fills up some 64-byte buffer, and once that buffer is full, it churns/hashes that block. Adding a byte to the buffer is all fast low-level C code, while the churning would use significantly more CPU cycles... hard to fathom that you would see much slower performance when you pass a single byte at a time in C...

On Tue, Jul 12, 2016 at 8:07 AM, lvh <_...@lvh.io> wrote:
> Hi,
>
>> On Jul 11, 2016, at 10:42 PM, Frank Siebenlist <frank.siebenl...@gmail.com>
>> wrote:
>
> <snipsnip>
>
>> I understand that there may be a few more object-creations and casts
>> involved in the looping, but 500 times slower… that was an unexpected
>> surprise.
>
> As expected. You both get massively increased C call overhead and the worst
> case because you don’t get to hit a block until every 512/8 == 64 updates.
> Alas, openssl speed doesn’t distinguish between the same message sizes but in
> different chunk sizes, but you can at least clearly see the performance
> multiplier for larger messages.
>
> lvh
> _______________________________________________
> Cryptography-dev mailing list
> Cryptography-dev@python.org
> https://mail.python.org/mailman/listinfo/cryptography-dev
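
For anyone who wants to reproduce the measurement and the buffering advice from this thread, here is a minimal sketch. It assumes Python's standard hashlib module rather than the cryptography package, so absolute numbers will likely differ from the ones posted. The function names (sha256_in_chunks, sha256_of_stream, time_chunk_sizes) are hypothetical and not taken from the original test script, and the payload is smaller than the 8388608 bytes used in the thread.

```python
import hashlib
import io
import time


def sha256_in_chunks(data: bytes, chunk_size: int) -> str:
    """Hash data by calling update() once per chunk_size-byte slice."""
    h = hashlib.sha256()
    view = memoryview(data)  # slicing a memoryview avoids copying each chunk
    for start in range(0, len(data), chunk_size):
        h.update(view[start:start + chunk_size])
    return h.hexdigest()


def sha256_of_stream(stream, buffer_size: int = 1 << 20) -> str:
    """Hash a binary stream with large reads instead of per-byte updates."""
    h = hashlib.sha256()
    while True:
        block = stream.read(buffer_size)
        if not block:  # empty read means end of stream
            break
        h.update(block)
    return h.hexdigest()


def time_chunk_sizes(data: bytes, chunk_sizes) -> dict:
    """Return {chunk_size: seconds} for hashing data at each chunk size."""
    timings = {}
    for size in chunk_sizes:
        start = time.perf_counter()
        sha256_in_chunks(data, size)
        timings[size] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Assumption: 256 KiB of zeros, chosen only to keep the run short.
    payload = bytes(1 << 18)
    sizes = [1, 2, 8, 32, 64, 128, 256, len(payload)]
    for size, seconds in time_chunk_sizes(payload, sizes).items():
        print(f"chunk={size}: {seconds:.4f}s")
```

The digest is the same for every chunk size; only the number of update() calls changes, which is the variable the timings above appear to track. For files or sockets, sha256_of_stream shows the "fill a big buffer first" approach: each read() hands update() up to buffer_size bytes in a single call.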