Hello all.  Johannes, thanks for adding me to this discussion.

So, as one of the coauthors of the SHA-1 collision detection code, I just 
wanted to chime in and say I'm glad to see the move to a longer hash function.  
Though, as a cryptographer, I have a few thoughts on the matter that I thought 
I would share.

I think that moving to SHA256 is a fine change, and I support it.

I'm not anywhere near the expert in this that Joan Daemen is.  I am someone who 
has worked in this space more or less peripherally.  However, I agree with Adam 
Langley that basically all of the finalists for a hash function replacement are 
about the same for the security needs of Git.  I think that, for this 
community, other software engineering considerations should be more important 
to the selection process.

I think Joan's survey of cryptanalysis papers and the numbers that he gives are 
interesting, and I had never seen the comparison laid out like that.  So, I 
think that there is a good argument to be made that SHA3 has had more 
cryptanalysis than SHA2.  Though, Joan, are the papers that you surveyed only 
focused on SHA2?  I'm curious if you think that the design/construction of 
SHA2, as it can be seen as an iteration of MD5/SHA1, means that the 
cryptanalysis papers on those constructions can be considered to apply to SHA2?  
Again, I'm not an expert in this, but I do know that Marc Stevens's techniques 
for constructing collisions also provided some small cryptanalytic improvements 
against the SHA2 family as well.  I also think that while the paper survey is a 
good way to look over all of this, the longer time that SHA2 has spent in a 
position of high-profile visibility can give us some confidence as well.

Also worth pointing out is that the connection SHA2 has to SHA1 means that 
if Marc Stevens's cryptanalysis of MD5/SHA-1 were ever successfully applied to 
SHA2, the SHA1 collision detection approach could be applied there as well, 
thus providing a drop-in replacement in that situation.  That said, we don't 
know that there is not a similar way of addressing issues with the SHA3/Sponge 
design.  It's just that, because we haven't seen any weaknesses of this sort in 
similar designs, we don't yet know what a similar approach would look like 
there.  I don't want to put too much stock in this argument; it's just saying 
"Well, we already know how SHA2 is likely to break, and we've had fixes for 
similar things in the past."  This is pragmatic but not inspiring or 
confidence building.

So, I also want to state my biases in favor of SHA2 as an employee of 
Microsoft.  Microsoft, being a corporation headquartered in America with the 
US Gov't as a major customer, definitely prefers to defer to the US Gov't NIST 
standardization process.  And from that perspective SHA2 or SHA3 would be good 
choices.  I, personally, think that the NIST process is the best we have.  It 
is relatively transparent, and NIST employs a fair number of very competent 
cryptographers.  Also, I am encouraged by the widespread international 
participation that the NIST competitions and selection processes attract.

As such, and reflecting this bias, in the internal discussions that Johannes 
alluded to, SHA2 and SHA3 were the primary suggestions.  There was a slight 
preference for SHA2 because SHA3 is not exposed through the Windows 
cryptographic APIs (though Git does not use those, so this is a nonissue for 
this discussion.)

I also wanted to thank Johannes for keeping the cryptographers that he 
discussed this with anonymous.  After all, cryptographers are known for being 
private.  And I wanted to say that Johannes did, in fact, accurately represent 
our internal discussions on the matter.

I also wanted to comment on the discussion of the "internal state having the 
same size as the output."  Linus referred to this several times.  This is known 
as narrow-pipe vs wide-pipe in the hash function design literature.  Linus is 
correct that wide-pipe designs are more in favor currently, and IIRC, all of 
the serious SHA3 candidates employed this.  That said, it did seem that in the 
discussion this was being equated with "length extension attacks."  And that 
connection is just not accurate.  Length extension attacks are primarily the 
motivation for the HMAC-like nested hashing design for MACs, because of a 
potential forgery attack.  Again, this doesn't really matter because the 
decision has been made despite this discussion.  I just wanted to set the 
record straight about this, so as to avoid doing the right thing for the wrong 
reason (T.S. Eliot's "greatest treason.")
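
For anyone unfamiliar with the nested hashing design mentioned above, here is a 
minimal sketch in Python of the HMAC construction over SHA-256, written out by 
hand and checked against the standard library.  The function name, key, and 
message are made up for illustration; this is not something Git uses.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks


def hmac_sha256_by_hand(key: bytes, message: bytes) -> bytes:
    """Illustrative HMAC-SHA256: H((K ^ opad) || H((K ^ ipad) || m))."""
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")

    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)

    # The inner hash covers the message; the outer hash covers only the
    # fixed-size inner digest, so appending data to the message does not
    # let an attacker extend the final tag.
    inner_digest = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner_digest).digest()


# Hypothetical key and message, used only to compare with the stdlib.
key = b"example key"
message = b"example message"
tag = hmac_sha256_by_hand(key, message)
reference = hmac.new(key, message, hashlib.sha256).digest()
matches_stdlib = hmac.compare_digest(tag, reference)
```

The outer hash is what blocks the forgery: a plain H(key || message) tag over a 
Merkle-Damgard hash can be extended without knowing the key, while the nested 
form cannot.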

One other thing that I wanted to throw out there for the future is that in the 
crypto community there is currently a very large push toward post-quantum 
cryptography.  Whether the threat of quantum computers is real or imagined, 
this is a hot area of research, with a NIST competition to select post-quantum 
asymmetric cryptographic algorithms.  That is not directly of concern to the 
selection of a hash function.  However, if we take this threat as legitimate, 
quantum computers cut the effective strength of symmetric crypto, both 
encryption and hash functions, in half as measured in bits of security.  So, if 
this is the direction that the crypto community ultimately goes in, 512-bit 
hashes will be seen as standard over the next decade or so.  I don't think 
that this should be involved in this discussion, presently.  I'm just saying 
that, not unlike the time when SHA1 was selected, I think that the replacement 
of a 256-bit hash is on the horizon as well.
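
To make the arithmetic concrete, here is a tiny sketch in Python that takes the 
halving rule of thumb at face value and applies it to nominal hash output 
sizes.  The function name is hypothetical, and this is only the rough estimate 
described above, not a precise security analysis.

```python
def bits_after_halving(classical_bits: int) -> int:
    """Apply the rough 'quantum halves the strength' rule to a bit count."""
    return classical_bits // 2


# Nominal output sizes only; assumes the rule of thumb holds as stated.
for output_bits in (256, 512):
    remaining = bits_after_halving(output_bits)
    print(f"{output_bits}-bit hash: about {remaining} bits under the halving rule")
```

Under that assumption a 512-bit hash would land where a 256-bit hash sits today.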

Thanks,
Dan Shumow
