[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-16 Thread Antoine Pietri


Antoine Pietri  added the comment:

Thanks, those arguments are convincing. I guess for applications that really 
can't move to a more secure hash, it would be better for them to rely on 
third-party libraries that implement the "band-aid".

I'm closing this for now.

--
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-16 Thread Christian Heimes


Christian Heimes  added the comment:

I talked to some experts (Alex Gaynor, Simo Sorce). They all share my sentiment 
and are against SHA1DC. The algorithm is just a poor bandaid for a gaping 
security issue. Everybody was strongly against replacing SHA1 with SHA1DC by 
default, because it's an incompatible implementation. SHA1DC is only able to 
counteract some of the known flaws, too. Even git doesn't replace SHA1 with 
SHA1DC directly. Instead it turns a detected collision into a fatal error [1].

I'm -1 to add it to the Python standard library. Alex pointed out that the lack 
of SHA1DC in OpenSSL is a clear sign that it's not generally useful. SHA1DC may 
be useful for a few applications like git. In general it's not a fool-proof 
safety net for SHA1.

[1] https://github.com/git/git/blob/master/sha1dc_git.c#L17-L23

--




[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-16 Thread Christian Heimes


Christian Heimes  added the comment:

I wouldn't call SHA1 a secure hash function any more. SHA1DC is both an 
incompatible implementation and a bandaid for legacy applications that can't 
easily update to a proper hashing algorithm. Also it's rather pointless to 
update our SHA1 implementation since OpenSSL still uses the standardized SHA1 
implementation. CPython prefers OpenSSL's implementation because it's much, 
much faster than libtomcrypt's implementation.

I need to study SHA1DC first and get some advice before I can make an educated 
statement. But I'm leaning towards -1 to even support SHA1DC in the standard 
library, because I don't want to promote SHA1 any more. Applications should 
move to SHA2, SHA3 and blake2.
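
As a rough illustration of that migration path, here is a minimal sketch 
that assumes only the standard hashlib module. The helper name digest_all 
is hypothetical and not part of any proposal in this issue; it simply 
computes the same input under SHA-1 and under the suggested replacements:

```python
import hashlib


def digest_all(data: bytes) -> dict[str, str]:
    """Return hex digests of data under SHA-1 and some modern alternatives.

    digest_all is a hypothetical helper used only for illustration.
    """
    names = ("sha1", "sha256", "sha3_256", "blake2b")
    return {name: hashlib.new(name, data).hexdigest() for name in names}


digests = digest_all(b"hello world")
for name, value in digests.items():
    # Print each algorithm name next to its hex digest.
    print(f"{name:9} {value}")
```

The digests differ in length as well as value, so switching algorithms 
usually also means changing any stored or transmitted hash format.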

--




[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-10 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Assigning to Christian to make the call.

+1 for option #1, replacing the sha1 implementation with the hardened version, 
helping us move closer to more-secure-by-default.

--
assignee:  -> christian.heimes
nosy: +rhettinger




[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-10 Thread Antoine Pietri


Antoine Pietri  added the comment:

On Wed, Oct 10, 2018 at 11:27 PM STINNER Victor  wrote:
> I dislike modifying a hash function to change its output but keep the same 
> name. For me, "SHA1" must remain "SHA1". If you want a variant, it should 
> have a different name, but I would expect that the existing sha1 function 
> remains unchanged. How do you keep the compatibility between different 
> programming languages and applications if one uses SHA1 and the other uses 
> "hardened SHA-1"?

Well, as I said we could almost consider both algorithms to be
"compatible", in that they only differ in an infinitesimally small
number of cases that were specifically *designed* to break SHA1. I
agree it's not ideal to just replace the function directly, and that's
why I suggested 4 possible alternatives. But you have to understand
that the decision is not as simple as just "it doesn't give the same
outputs so it should have a different name", because it *does* give
the same outputs in *all of the cases that weren't designed to break
it*, and the tradeoff for not making that the default is that most
people who don't care about seeing the collisions happen will keep
using a broken implementation for no reason.

I'm not saying I disagree with you here, I'm just making sure you're
aware of the tradeoff. If we make it the default, it's a *very slight*
break of backwards compatibility, but it will be a positive change for
99.99% of users. The only affected people will be the ones that were
writing scripts to check whether collisions did exist in the old
algorithm, and if we change the name of the "classic sha1" they could
trivially change it themselves.

That said, if you'd rather have another name for it, it also works for
me, it's better than having nothing.

> One alternative is to stop using sha1 :-D

Totally agree with you here, but it's not always an option, so I'd
argue we should do our best to mitigate the problem.

> Do you have examples?

I already gave the Git example:

https://github.com/git/git/commit/28dc98e343ca4eb370a29ceec4c19beac9b5c01e#diff-a44b837d82653a78649b57443ba99460

Fossil also migrated to it:

https://www.fossil-scm.org/xfer/doc/trunk/www/hashpolicy.wiki

The truth is, most of the other Merkle Tree implementations (like
Bitcoin) were using a different hash in the first place, and that
seems to be the main application where you have to keep backward
compatibility with your hashes. So the fact that two of the main SHA-1
Merkle tree implementations moved to Hardened SHA-1 is huge, IMO.

--




[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-10 Thread STINNER Victor


STINNER Victor  added the comment:

I dislike modifying a hash function to change its output but keep the same 
name. For me, "SHA1" must remain "SHA1". If you want a variant, it should 
have a different name, but I would expect that the existing sha1 function 
remains unchanged. How do you keep the compatibility between different 
programming languages and applications if one uses SHA1 and the other uses 
"hardened SHA-1"?

One alternative is to stop using sha1 :-D

> A large part of the industry has adopted Hardened SHA-1 ...

Do you have examples?

--




[issue34930] sha1module: Switch sha1 implementation to sha1dc/hardened sha1

2018-10-08 Thread Antoine Pietri


New submission from Antoine Pietri :

SHA-1 was broken a while ago. While the general recommendation is to 
migrate to more recent hashes (like SHA-2 and SHA-3), a lot of industry 
applications (notably Merkle DAG implementations like Git or blockchains) 
require backwards compatibility with SHA-1, at least for the time 
required for all users to transition.

The SHAttered authors published along with their paper a reference 
implementation of a "hardened SHA-1" algorithm, a SHA-1 implementation that 
uses counter-cryptanalysis to detect inputs that were forged to produce a hash 
collision. What that means is that Hardened SHA-1 is a secure hash function 
that produces the same output as SHA-1 in 99.99...% of cases, and only 
differs when two inputs were specifically made to generate collisions. The 
reference implementation is here: 
https://github.com/cr-marcstevens/sha1collisiondetection

A large part of the industry has adopted Hardened SHA-1 as a temporary 
replacement for SHA-1, most notably Git under the name "sha1dc": 
https://github.com/git/git/commit/28dc98e343ca4eb370a29ceec4c19beac9b5c01e

Since CPython has its own implementation of SHA-1, I think it would be a good 
idea to provide a hardened SHA-1 implementation. So either:

1. we replace the current implementation of sha1 with sha1dc completely, which 
might be a problem for people who write scripts to detect whether two files 
collide with classic sha1

2. we replace the current implementation but we keep the old one under a new 
name, like "sha1_broken" or "sha1_classic", which breaks backwards 
compatibility in a few marginal cases but the functionality can be trivially 
restored by changing the name of the hash

3. we keep the current implementation but add a new one under a new name 
"sha1dc", which probably means most people will stay on a broken implementation 
for no good reason, but it will be fully backwards-compatible even in the 
marginal cases

4. we don't implement Hardened SHA-1 at all, and we advise people to change 
their hash algorithm, while realizing that this solution is not feasible in a 
lot of cases.

I'd suggest going with either 1. or 2. What would be your favorite option?
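
For context, the kind of script that option 1 might affect could look 
roughly like the sketch below. It assumes only the standard hashlib module; 
the function names sha1_hex and is_sha1_collision are hypothetical and 
exist purely for illustration:

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex digest of data using whatever hashlib.sha1 currently provides."""
    return hashlib.sha1(data).hexdigest()


def is_sha1_collision(a: bytes, b: bytes) -> bool:
    """Return True when a and b differ but share the same SHA-1 digest.

    Hypothetical helper: it relies on colliding inputs producing identical
    digests, which is the behaviour of classic SHA-1.
    """
    return a != b and sha1_hex(a) == sha1_hex(b)


# Ordinary inputs are not expected to collide.
print(is_sha1_collision(b"spam", b"eggs"))
```

If hashlib.sha1 were backed by a hardened implementation that returns 
different outputs for forged inputs, as described above, a check like this 
would presumably stop reporting those collisions.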

Not sure whether this should go in security or enhancement, so I put it in the 
latter category to be more conservative in issue prioritization. I added the 
devs who worked the most on Modules/sha1module.c in the Nosy list.

--
components: Library (Lib)
messages: 327343
nosy: antoine.pietri, christian.heimes, loewis, vstinner
priority: normal
severity: normal
status: open
title: sha1module: Switch sha1 implementation to sha1dc/hardened sha1
type: enhancement
versions: Python 3.8
