Hi,

From: Jeff Hardy [mailto:jdha...@gmail.com]
> On Thu, Feb 27, 2014 at 11:11 AM, Markus Schaber <m.scha...@codesys.com>
> wrote:
> > Hi,
> >
> > I'm just trying to sum it up:
> >
> > 1) The current code:
> > - High memory usage.
> > - High load on the large object heap.
> > - Limited by the available amount of memory (which might be considered a
> >   violation of the Python API).
> > - High CPU usage when used incrementally (quadratic to the number of
> >   blocks added).
> >
> > 2) Optimizing with MemoryStream and lazy calculation:
> > - High memory usage.
> > - High load on the large object heap.
> > - Limited by the available amount of memory (which might be considered a
> >   violation of the Python API).
> > + Optimal CPU usage when the hash is only fetched once.
> > ± Better than current code, but still not optimal when hash is
> >   incrementally fetched several times.
> >
> > 3) Optimizing with jagged arrays and lazy calculation:
> > - High memory usage.
> > + Improved or no impact on the large object heap (depending on the exact
> >   implementation)
> > - Limited by the available amount of memory (which might be considered a
> >   violation of the Python API).
> > + Optimal CPU usage when the hash is only fetched once.
> > ± Better than current code, but still not optimal when hash is
> >   incrementally fetched several times.
> >
> > 4) Using the existing .NET incremental APIs
> > + Low, constant memory usage.
> > + No impact on the large object heap.
> > + No limit of data length by the amount of memory.
> > + Optimal CPU usage when the hash is only fetched once.
> > - Breaks when hash is incrementally fetched several times (which likely
> >   is a violation of the Python API).
> >
> > 5) Finding or porting a different Hash implementation in C#:
> > + Low, constant memory usage.
> > + No impact on the large object heap.
> > + No limit of data length by the amount of memory.
> > + Optimal CPU usage when the hash is only fetched once.
> > + Optimal CPU usage when the hash is incrementally fetched several times.
> >
> > I've a local prototype implemented for 2), but I'm not sure whether that's
> > the best way to go...
>
> Good analysis!
>
> My preference would be for (4), raising an exception if .update() is called
> after .digest(), or .copy() is called at all. As a fallback, an extra
> parameter to hashlib.new (&c) that triggers (2), for cases where it's needed -
> I can't say for sure, but I would think calling .update() after .digest()
> would be rare, and so would .copy() (damn you Google for shutting down code
> search). At least then the common case is fast and edge cases are (usually)
> possible.

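To make the difference between (2) and (4) concrete, here is a minimal Python sketch of the two behaviours. It is only an illustration and not IronPython code: the class names StrictHash and BufferedHash are hypothetical, and CPython's hashlib stands in for the .NET hash classes.

```python
import hashlib


class StrictHash:
    """Sketch of option (4): constant memory, but no edge cases.

    Hypothetical wrapper. It models a hash that cannot be continued once
    the digest has been fetched, and that cannot be copied.
    """

    def __init__(self, name):
        self._inner = hashlib.new(name)
        self._finalized = False

    def update(self, data):
        if self._finalized:
            raise ValueError("update() after digest() is not supported")
        self._inner.update(data)

    def digest(self):
        self._finalized = True
        return self._inner.digest()

    def copy(self):
        raise NotImplementedError("copy() is not supported")


class BufferedHash:
    """Sketch of option (2): keep all data and hash lazily in digest().

    Memory grows with the input, and every digest() that follows an
    update() rehashes the whole buffer.
    """

    def __init__(self, name):
        self._name = name
        self._chunks = []
        self._cached = None

    def update(self, data):
        self._chunks.append(bytes(data))
        self._cached = None  # invalidate the lazily computed digest

    def digest(self):
        if self._cached is None:
            joined = b"".join(self._chunks)
            self._cached = hashlib.new(self._name, joined).digest()
        return self._cached

    def copy(self):
        other = BufferedHash(self._name)
        other._chunks = list(self._chunks)
        other._cached = self._cached
        return other


# Reference behaviour in CPython: digest() does not finalize the object,
# so feeding more data afterwards is allowed.
ref = hashlib.new("sha256")
ref.update(b"abc")
ref.digest()
ref.update(b"def")
assert ref.digest() == hashlib.sha256(b"abcdef").digest()
```

With a fallback parameter as suggested above, hashlib.new would hand out something like StrictHash by default and something like BufferedHash on request.
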
Do you think asking on some CPython lists could give usable feedback on how
common it is to call copy() or to continue feeding data after calling digest()?

> > Maybe we should google for purely managed implementations of the hash codes
> > with a sensible license...
>
> There seems to be for MD5 and SHA1 but not SHA2 or RIPEMD160. They could be
> ported from the public domain Crypto++ library, but that seems like a lot of
> work for an edge case.

Yes, that seems to be a lot of work. On the other hand, it's the 100%
solution. :-)

Best regards

Markus Schaber

CODESYS® a trademark of 3S-Smart Software Solutions GmbH

Inspiring Automation Solutions

3S-Smart Software Solutions GmbH
Dipl.-Inf. Markus Schaber | Product Development Core Technology
Memminger Str. 151 | 87439 Kempten | Germany
Tel. +49-831-54031-979 | Fax +49-831-54031-50

E-Mail: m.scha...@codesys.com | Web: http://www.codesys.com |
CODESYS store: http://store.codesys.com
CODESYS forum: http://forum.codesys.com

Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner |
Trade register: Kempten HRB 6186 | Tax ID No.: DE 167014915
_______________________________________________
Ironpython-users mailing list
Ironpython-users@python.org
https://mail.python.org/mailman/listinfo/ironpython-users