On Mon, Jul 18, 2016 at 11:00:35AM -0700, Junio C Hamano wrote:
> Continuing this thought process, I do not see a good way to allow us
> to wean ourselves off of the old hash, unless we _break_ the pack
> stream format so that each object in the pack carries not just the
> data but also the hash algorithm to be used to _name_ it, so that
> new objects will never be referred to using the old hash.

I think for this reason, I'm going to propose the following approach
when we get there:

* We serialize the hash in the object formats, using multihash or
  something similar.  This means that it is minimally painful if we ever
  need to change in the future[0].
* Each repository carries exactly one hash algorithm, except for
  submodule data.  If we don't do this, then some people will never
  switch because the submodules they depend on haven't.
* If people on the new format need to refer to submodule commits using
  SHA-1, then they have to use a prefix on the hash form; otherwise,
  they can use the raw hash value (without any multihash prefix).
* git fsck verifies one consistent algorithm (excepting submodule
  references).

This preserves the security benefits, keeps the format future-proof, and
minimizes the performance impact of naming that you mentioned.
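
To make the list above concrete, here is a minimal sketch in Python of a multihash-style prefix and the naming rule. It is only an illustration under stated assumptions, not a proposed on-disk format: the function codes 0x11 (sha1) and 0x12 (sha2-256) come from the multihash table, while encode_multihash, decode_multihash, and format_object_name are hypothetical names invented here.

```python
import hashlib
from typing import Tuple

# Function codes from the multihash table: sha1 = 0x11, sha2-256 = 0x12.
MULTIHASH_CODES = {"sha1": 0x11, "sha256": 0x12}
CODE_TO_ALGO = {code: name for name, code in MULTIHASH_CODES.items()}


def encode_multihash(algo: str, digest: bytes) -> bytes:
    """Prefix a digest with <function code><digest length>.

    Both values are below 128 here, so the varints that multihash
    specifies reduce to a single byte each.
    """
    return bytes([MULTIHASH_CODES[algo], len(digest)]) + digest


def decode_multihash(blob: bytes) -> Tuple[str, bytes]:
    """Split a prefixed value back into (algorithm name, digest)."""
    code, length, digest = blob[0], blob[1], blob[2:]
    if code not in CODE_TO_ALGO:
        raise ValueError("unknown hash function code: %#x" % code)
    if len(digest) != length:
        raise ValueError("digest length does not match the prefix")
    return CODE_TO_ALGO[code], digest


def format_object_name(repo_algo: str, ref_algo: str, digest: bytes) -> str:
    """Hypothetical naming rule: raw hex for the repository's own
    algorithm, prefixed hex for a reference in any other algorithm."""
    if ref_algo == repo_algo:
        return digest.hex()
    return encode_multihash(ref_algo, digest).hex()


# A repository on the new hash referring to its own object and to a
# SHA-1 submodule commit (the data is a placeholder, not a real object).
sha1_digest = hashlib.sha1(b"example").digest()
sha256_digest = hashlib.sha256(b"example").digest()
native = format_object_name("sha256", "sha256", sha256_digest)
submodule = format_object_name("sha256", "sha1", sha1_digest)
```

In this sketch the native name stays at 64 hex characters, and only the SHA-1 submodule reference pays for the two-byte prefix (44 hex characters, beginning with 1114).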

[0] We are practically limited to 256-bit hashes because anything longer
will wrap on an 80-column terminal when in hex form.
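
The arithmetic behind that limit, as a quick sketch: each hex digit encodes 4 bits, so a hash of N bits needs N/4 columns. The helper name hex_width and the list of sizes are chosen here for illustration only.

```python
TERMINAL_COLUMNS = 80


def hex_width(bits: int) -> int:
    """Number of hex characters needed to print a hash of this many bits."""
    return bits // 4


for bits in (160, 256, 384, 512):
    width = hex_width(bits)
    fits = "fits" if width <= TERMINAL_COLUMNS else "wraps"
    print("%d-bit hash: %d hex chars, %s" % (bits, width, fits))
```

A 256-bit hash takes 64 columns, leaving some room for other output on the line, while 384-bit and 512-bit hashes take 96 and 128 columns respectively.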
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204
