New submission from Antoine Pietri <antoine.piet...@gmail.com>:

SHA-1 was broken a while ago. While the general recommendation is to 
migrate to more recent hashes (like SHA-2 and SHA-3), a lot of industry 
applications (notably Merkle DAG implementations like Git or blockchains) 
require backwards compatibility with SHA-1, at least for as long as it takes 
all their users to transition.
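
To illustrate why such applications cannot simply swap the hash, here is a 
minimal sketch of how Git derives a blob's object ID with hashlib. The helper 
name git_blob_id is made up for this example; the "blob <size>\0" header is 
Git's documented object format. Any change in the digest changes the ID of 
every existing object.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID a SHA-1 Git repository assigns to a blob."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has the same well-known ID in every SHA-1 Git repository.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```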

Along with their paper, the SHAttered authors published a reference 
implementation of a "hardened SHA-1" algorithm: a SHA-1 implementation that 
uses counter-cryptanalysis to detect inputs that were forged to produce a hash 
collision. In other words, Hardened SHA-1 is a secure hash function 
that produces the same output as SHA-1 in 99.999999...% of cases, and only 
differs when two inputs were specifically crafted to collide. The 
reference implementation is here: 
https://github.com/cr-marcstevens/sha1collisiondetection

A large part of the industry has adopted Hardened SHA-1 as a temporary 
replacement for SHA-1, most notably Git under the name "sha1dc": 
https://github.com/git/git/commit/28dc98e343ca4eb370a29ceec4c19beac9b5c01e

Since CPython has its own implementation of SHA-1, I think it would be a good 
idea to provide a hardened SHA-1 implementation. So either:

1. we replace the current implementation of sha1 with sha1dc completely, which 
might be a problem for people who write scripts to detect whether two files 
collide under classic sha1

2. we replace the current implementation but we keep the old one under a new 
name, like "sha1_broken" or "sha1_classic", which breaks backwards 
compatibility in a few marginal cases but the functionality can be trivially 
restored by changing the name of the hash

3. we keep the current implementation but add a new one under a new name 
"sha1dc", which probably means most people will stay on a broken implementation 
for no good reason, but it will be fully backwards-compatible even in the 
marginal cases

4. we don't implement Hardened SHA-1 at all, and we advise people to change 
their hash algorithm, while realizing that this solution is not feasible in a 
lot of cases.
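
As a rough sketch of what option 3 would mean for callers: the algorithm name 
"sha1dc" below is only the name proposed in this issue and is not something 
hashlib provides today, and new_sha1 is a hypothetical helper. The sketch 
relies only on the existing behaviour that hashlib.new() raises ValueError 
for an unknown algorithm name.

```python
import hashlib


def new_sha1(data: bytes = b""):
    """Return a hardened SHA-1 object if available, else classic SHA-1."""
    try:
        # Hypothetical name from option 3; unsupported in current hashlib,
        # so this call is expected to raise ValueError today.
        return hashlib.new("sha1dc", data)
    except ValueError:
        # Fall back to the classic implementation.
        return hashlib.sha1(data)


# For inputs that are not collision attacks, both variants should yield
# the same digest, so callers would not notice which one was used.
print(new_sha1(b"abc").hexdigest())
```

Under options 1 and 2, none of this would be needed: existing calls to 
hashlib.sha1() would pick up the hardened implementation without changes.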

I'd suggest going with either 1. or 2. What would be your favorite option?

I'm not sure whether this should go in security or enhancement, so I put it in 
the latter category to be more conservative in issue prioritization. I added 
the devs who worked the most on Modules/sha1module.c to the nosy list.

----------
components: Library (Lib)
messages: 327343
nosy: antoine.pietri, christian.heimes, loewis, vstinner
priority: normal
severity: normal
status: open
title: sha1module: Switch sha1 implementation to sha1dc/hardened sha1
type: enhancement
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34930>
_______________________________________