On Fri, Aug 03 2018, Jonathan Nieder wrote:

> Hi again,
>
> Sorry for the slow review.  I finally got a chance to look this over
> again.
>
> My main nits are about the commit message: I think it still focuses
> too much on the process instead of the usual "knowing what I know now,
> here's the clearest explanation for why we need this patch" approach.
> I can send a patch illustrating what I mean tomorrow morning.

I think it makes if you just take over 2/2 of this series (or even the
whole thing), since the meat of it is already something I copy/pasted
from you, and you've got more of an opinion / idea about how to proceed
(which is good!); it's more efficient than me trying to fix various
stuff you're pointing out at this point, I also think it makes sense
that you change the "Author" line for that, since the rest of the
changes will mainly be search-replace by me.

Perhaps it's better for readability if those search-replace changes go
into their own change, i.e. make it a three-part where 2/3 does the
search-replace, and promises that 3/3 has the full rationale etc.

> Ævar Arnfjörð Bjarmason wrote:
>
>> From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256,
>> K12, and so on are all believed to have similar security properties.
>> All are good options from a security point of view.
>>
>> SHA-256 has a number of advantages:
>>
>> * It has been around for a while, is widely used, and is supported by
>>   just about every single crypto library (OpenSSL, mbedTLS, CryptoNG,
>>   SecureTransport, etc).
>>
>> * When you compare against SHA1DC, most vectorized SHA-256
>>   implementations are indeed faster, even without acceleration.
>>
>> * If we're doing signatures with OpenPGP (or even, I suppose, CMS),
>>   we're going to be using SHA-2, so it doesn't make sense to have our
>>   security depend on two separate algorithms when either one of them
>>   alone could break the security when we could just depend on one.
>>
>> So SHA-256 it is.
>
> The above is what I wrote, so of course I'd like it. ;-)
>
>>                    See the "Hash algorithm analysis" thread as of
>> [1]. Linus has come around to this choice and suggested Junio make the
>> final pick, and he's endorsed SHA-256 [3].
>
> The above paragraph has the same problem as before of (1) not being
> self-contained and (2) focusing on the process that led to this patch
> instead of the benefit of the patch itself.  I think we should omit it.
>
>> This follow-up change changes occurrences of "NewHash" to
>> "SHA-256" (or "sha256", depending on the context). The "Selection of a
>> New Hash" section has also been changed to note that historically we
>> used the the "NewHash" name while we didn't know what the new hash
>> function would be.
>
> nit: Commit messages are usually in the imperative tense.  This is in
> the past tense, I think because it is a continuation of that
> discussion about process.
>
> For this part, I think we can let the patch speak for itself.
>
>> This leaves no use of "NewHash" anywhere in git.git except in the
>> aforementioned section (and as a variable name in t/t9700/test.pl, but
>> that use from 2008 has nothing to do with this transition plan).
>
> This part is helpful --- good.
>
>> 1. 
>> https://public-inbox.org/git/20180720215220.gb18...@genre.crustytoothpaste.net/
>> 2. 
>> https://public-inbox.org/git/ca+55afwse9bf8e0hlk9pp3fvd5lavy5grdsv3fbntgzekja...@mail.gmail.com/
>> 3. https://public-inbox.org/git/xmqqzhygwd5o....@gitster-ct.c.googlers.com/
>
> Footnotes to the historical part --- I'd recommend removing these.
>
>> Helped-by: Jonathan Nieder <jrnie...@gmail.com>
>> Helped-by: Junio C Hamano <gits...@pobox.com>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <ava...@gmail.com>
>
> Here I'd want to put a pile of acks, e.g.:
>
>  Acked-by: Linus Torvalds <torva...@linux-foundation.org>
>  Acked-by: brian m. carlson <sand...@crustytoothpaste.net>
>  Acked-by: Johannes Schindelin <johannes.schinde...@gmx.de>
>  Acked-by: Dan Shumow <dan...@microsoft.com>
>  Acked-by: Junio C Hamano <gits...@pobox.com>
>
> [...]
>> --- a/Documentation/technical/hash-function-transition.txt
>> +++ b/Documentation/technical/hash-function-transition.txt
>> @@ -59,14 +59,11 @@ that are believed to be cryptographically secure.
>>
>>  Goals
>>  -----
>> -Where NewHash is a strong 256-bit hash function to replace SHA-1 (see
>> -"Selection of a New Hash", below):
>> -
>> -1. The transition to NewHash can be done one local repository at a time.
>> +1. The transition to SHA-256 can be done one local repository at a time.
>
> Yay!
>
> [...]
>>      [extensions]
>> -            objectFormat = newhash
>> +            objectFormat = sha256
>>              compatObjectFormat = sha1
>
> Yes, makes sense.
>
> [...]
>> @@ -155,36 +152,36 @@ repository extensions.
>>  Object names
>>  ~~~~~~~~~~~~
>>  Objects can be named by their 40 hexadecimal digit sha1-name or 64
>> -hexadecimal digit newhash-name, plus names derived from those (see
>> +hexadecimal digit sha256-name, plus names derived from those (see
>>  gitrevisions(7)).
>>
>>  The sha1-name of an object is the SHA-1 of the concatenation of its
>>  type, length, a nul byte, and the object's sha1-content. This is the
>>  traditional <sha1> used in Git to name objects.
>>
>> -The newhash-name of an object is the NewHash of the concatenation of its
>> -type, length, a nul byte, and the object's newhash-content.
>> +The sha256-name of an object is the SHA-256 of the concatenation of its
>> +type, length, a nul byte, and the object's sha256-content.
>
> Sensible.
>
> [...]
>>
>>  Object format
>>  ~~~~~~~~~~~~~
>>  The content as a byte sequence of a tag, commit, or tree object named
>> -by sha1 and newhash differ because an object named by newhash-name refers to
>> +by sha1 and sha256 differ because an object named by sha256-name refers to
>
> Not about this patch: this should say SHA-1 and SHA-256, I think.
> Leaving it as is in this patch as you did is the right thing.
>
> [...]
>> @@ -255,10 +252,10 @@ network byte order):
>>    up to and not including the table of CRC32 values.
>>  - Zero or more NUL bytes.
>>  - The trailer consists of the following:
>> -  - A copy of the 20-byte NewHash checksum at the end of the
>> +  - A copy of the 20-byte SHA-256 checksum at the end of the
>
> Not about this patch: a SHA-256 is 32 bytes.  Leaving that for a
> separate patch as you did is the right thing, though.
>
> [...]
>> -  - 20-byte NewHash checksum of all of the above.
>> +  - 20-byte SHA-256 checksum of all of the above.
>
> Likewise.
>
> [...]
>> @@ -351,12 +348,12 @@ the following steps:
>>     (This list only contains objects reachable from the "wants". If the
>>     pack from the server contained additional extraneous objects, then
>>     they will be discarded.)
>> -3. convert to newhash: open a new (newhash) packfile. Read the topologically
>> +3. convert to sha256: open a new (sha256) packfile. Read the topologically
>
> Not about this patch: this one's more murky, since it's talking about
> the object names instead of the hash function.  I guess "sha256"
> instead of "SHA-256" for this could be right, but I worry it's going
> to take time for me to figure out the exact distinction.  That seems
> like a reason to just call it SHA-256 (but in a separate patch).
>
> [...]
>> -   sha1-content, convert to newhash-content, and write it to the newhash
>> -   pack. Record the new sha1<->newhash mapping entry for use in the idx.
>> +   sha1-content, convert to sha256-content, and write it to the sha256
>> +   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
>>  4. sort: reorder entries in the new pack to match the order of objects
>> -   in the pack the server generated and include blobs. Write a newhash idx
>> +   in the pack the server generated and include blobs. Write a sha256 idx
>>     file
>
> Likewise.
>
> [...]
>> @@ -388,16 +385,16 @@ send-pack.
>>
>>  Signed Commits
>>  ~~~~~~~~~~~~~~
>> -We add a new field "gpgsig-newhash" to the commit object format to allow
>> +We add a new field "gpgsig-sha256" to the commit object format to allow
>>  signing commits without relying on SHA-1. It is similar to the
>> -existing "gpgsig" field. Its signed payload is the newhash-content of the
>> -commit object with any "gpgsig" and "gpgsig-newhash" fields removed.
>> +existing "gpgsig" field. Its signed payload is the sha256-content of the
>> +commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
>
> That reminds me --- we need to add support for stripping these out.
>
> [...]
>> @@ -601,18 +598,22 @@ The user can also explicitly specify which format to 
>> use for a
>>  particular revision specifier and for output, overriding the mode. For
>>  example:
>>
>> -git --output-format=sha1 log abac87a^{sha1}..f787cac^{newhash}
>> +git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
>>
>> -Selection of a New Hash
>> ------------------------
>> +Choice of Hash
>> +--------------
>
> Yay!
>
> [...]
>> -Some hashes under consideration are SHA-256, SHA-512/256, SHA-256x16,
>> -K12, and BLAKE2bp-256.
>> +We choose SHA-256. See the thread starting at
>> +<20180609224913.gc38...@genre.crustytoothpaste.net> for the discussion
>> +(https://public-inbox.org/git/20180609224913.gc38...@genre.crustytoothpaste.net/)
>
> Can this reference be moved to a footnote?  It's not part of the
> design, but it's a good reference.
>
> Thanks again for getting this documented.
>
> Sincerely,
> Jonathan

Reply via email to