New submission from Antoine Pietri <antoine.piet...@gmail.com>:
SHA-1 has been broken a while ago. While the general recommandation is to migrate to more recent hashes (like SHA-2 and SHA-3), a lot of industry applications (notably Merkle DAG implementations like Git or Blockchains) require backwards compatibility with SHA-1, at least for the time being required for all the users to transition. The SHAttered authors published along with their paper a reference implementation of a "hardened SHA-1" algorithm, a SHA-1 implementation that uses counter-cryptanalysis to detect inputs that were forged to produce a hash collision. What that means is that Hardened SHA-1 is a secure hash function that produces the same output as SHA-1 in 99.999999...% of cases, and only differs when two inputs were specifically made to generate collisions. The reference implementation is here: https://github.com/cr-marcstevens/sha1collisiondetection A large part of the industry has adopted Hardened SHA-1 as a temporary replacement for SHA-1, most notably Git under the name "sha1dc": https://github.com/git/git/commit/28dc98e343ca4eb370a29ceec4c19beac9b5c01e Since CPython has its own implementation of SHA-1, I think it would be a good idea to provide a hardened SHA-1 implementation. So either: 1. we replace the current implementation of sha1 by sha1dc completely, which might be a problem for people who write script to detect whether two files collide with classic sha1 2. we replace the current implementation but we keep the old one under a new name, like "sha1_broken" or "sha1_classic", which breaks backwards compatibility in a few marginal cases but the functionality can be trivially restored by changing the name of the hash 3. we keep the current implementation but add a new one under a new name "sha1dc", which probably means most people will stay on a broken implementation for no good reason, but it will be fully backwards-compatible even in the marginal cases 4. we don't implement Hardened SHA-1 at all, and we advise people to change their hash algorithm, while realizing that this solution is not feasible in a lot of cases. I'd suggest going with either 1. or 2. What would be your favorite option? Not sure whether this should go in security or enhancement, so I put it in the latter category to be more conservative in issue prioritization. I added the devs who worked the most on Modules/sha1module.c in the Nosy list. ---------- components: Library (Lib) messages: 327343 nosy: antoine.pietri, christian.heimes, loewis, vstinner priority: normal severity: normal status: open title: sha1module: Switch sha1 implementation to sha1dc/hardened sha1 type: enhancement versions: Python 3.8 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue34930> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com