Junio C Hamano wrote:

> Continuing this thought process, I do not see a good way to allow us
> to wean ourselves off of the old hash, unless we _break_ the pack
> stream format so that each object in the pack carries not just the
> data but also the hash algorithm to be used to _name_ it, so that
> new objects will never be referred to using the old hash.

Taking this a step further: I don't think that any backward-compatible
format change would address the security concerns with sufficiently
old hashing algorithms.

As long as my favorite repository is allowed to contain objects
identified by SHA-1, my adversary can exploit a SHA-1 collision using
signed tags referring (possibly indirectly) to backdated objects.  The
Git object format does not include a proof of commit date, so I cannot
guarantee "Only old objects are named by SHA-1".

There is a way to get a backward-compatible *user experience* without
the format change being backward-compatible, though.  Name all objects
in the repository using FuturisticHash.  Also store enough information
to recover the old hashes, either in objects as a new field or in a
side table.
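
A rough sketch of the side-table variant, to make the idea concrete.
Assumptions, none of which come from an actual design: SHA-256 stands in
for FuturisticHash, the helper names are made up for illustration, and
only blobs are handled.  Trees, commits and tags embed the names of
other objects, so a real conversion would have to rewrite those
references before hashing under the new function.

```python
import hashlib


def object_name(obj_type: str, body: bytes, algo: str) -> str:
    """Name an object the way Git does: hash of "<type> <size>\\0" + body."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.new(algo, header + body).hexdigest()


def build_side_table(blobs):
    """Map old (SHA-1) names to new names for a list of blob contents.

    Hypothetical helper: SHA-256 is only a stand-in for FuturisticHash.
    """
    table = {}
    for body in blobs:
        old = object_name("blob", body, "sha1")
        new = object_name("blob", body, "sha256")
        table[old] = new
    return table


if __name__ == "__main__":
    side_table = build_side_table([b"", b"hello\n"])
    for old, new in sorted(side_table.items()):
        print(old, "->", new)
```

With a table like this, a command given an old name can look up the new
one, which is what keeps the user experience backward-compatible even
though the storage format is not.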

If the old hash is broken, signatures using the old hash cannot be
trusted.  An adversary could generate a collision to retroactively
change the meaning of an existing signature.  To maintain the meaning
of old signatures, someone has to record the new names of all involved
objects, either before the state of the art in breaking the old hash
advances far enough or using a copy of the repository from before the
state of the art had advanced --- in effect you need new signatures to
maintain the meaning of old signatures.  This could happen as part of
the process of updating a repository to use a new hash.

E.g.

        object a787a87b98a7s98798a798b7a98b798a7b98a7b987a9b87a9b87a98b79a87b98a7b98a7b987a987987a878a78a
        sha1tag object 04b871796dc0420f8e7561a895b52484b701d51a
         type commit
         tag signedtag
         tagger C O Mitter <commit...@example.com> 1465981006 +0000

         signed tag

         signed tag message body
         -----BEGIN PGP SIGNATURE-----
         Version: GnuPG v1

         iQEcBAABAgAGBQJXYRhOAAoJEGEJLoW3InGJklkIAIcnhL7RwEb/+QeX9enkXhxn
         rxfdqrvWd1K80sl2TOt8Bg/NYwrUBw/RWJ+sg/hhHp4WtvE1HDGHlkEz3y11Lkuh
         8tSxS3qKTxXUGozyPGuE90sJfExhZlW4knIQ1wt/yWqM+33E9pN4hzPqLwyrdods
         q8FWEqPPUbSJXoMbRPw04S5jrLtZSsUWbRYjmJCHzlhSfFWW4eFd37uquIaLUBS0
         rkC3Jrx7420jkIpgFcTI2s60uhSQLzgcCwdA2ukSYIRnjg/zDkj8+3h/GaROJ72x
         lZyI6HWixKJkWw8lE9aAOD9TmTW9sFJwcVAzmAuFX2kUreDUKMZduGcoRYGpD7E=
         =jpXa
         -----END PGP SIGNATURE-----
        -----BEGIN PGP SIGNATURE-----
        ...
        -----END PGP SIGNATURE-----

This example uses a signature to attest that the mapping
04b871796dc0420f8e7561a895b52484b701d51a->a787a87b98a7s98798a798b7a98b798a7b98a7b987a9b87a9b87a98b79a87b98a7b98a7b987a987987a878a78a
is correct.  A more straightforward approach would be for the
conversion process to produce an out-of-band signed mapping list, which
would make the sha1tag usable without such a signature.
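
One possible shape for such a signed mapping list, as a sketch only.
The serialization format and function names are invented for
illustration, and an HMAC with a shared key stands in for the
signature; a real conversion would presumably use a public-key
signature (e.g. PGP) so that anyone can verify the list.

```python
import hashlib
import hmac


def serialize_mapping(mapping: dict[str, str]) -> bytes:
    """Canonical form: one "<old> <new>" line per entry, sorted by old name."""
    lines = [f"{old} {new}\n" for old, new in sorted(mapping.items())]
    return "".join(lines).encode("ascii")


def sign_mapping(mapping: dict[str, str], key: bytes) -> str:
    """Stand-in "signature": HMAC-SHA256 over the canonical form."""
    return hmac.new(key, serialize_mapping(mapping), hashlib.sha256).hexdigest()


def verify_mapping(mapping: dict[str, str], key: bytes, signature: str) -> bool:
    """Accept the mapping list only if the signature matches it exactly."""
    return hmac.compare_digest(sign_mapping(mapping, key), signature)


if __name__ == "__main__":
    key = b"conversion-process-key"  # hypothetical key held by the converter
    mapping = {"04b871796dc0420f8e7561a895b52484b701d51a": "new-name-placeholder"}
    sig = sign_mapping(mapping, key)
    print(verify_mapping(mapping, key, sig))

    # Redirecting an old name to a different object invalidates the list.
    tampered = dict(mapping)
    tampered["04b871796dc0420f8e7561a895b52484b701d51a"] = "other-object"
    print(verify_mapping(tampered, key, sig))
```

Sorting the entries before signing means two converters that produce
the same mapping also produce the same signed bytes, regardless of the
order in which they visited the objects.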

Summary:
 * Git's properties depend on using a single hash function throughout
   a repository.  I don't think we should change that.

 * A safe and mostly painless migration to a stronger hash function is
   possible using a signed assertion (for example generated by the
   conversion process) of the mapping from old object names to new
   object names.

 * Dealing with multiple such signed mappings (for example due to
   separate conversion of repositories based on linux.git) is left as
   an exercise for the reader.

Hope that helps,
Jonathan
