Re: SHA1 collisions found

Jeff King Fri, 03 Mar 2017 03:43:08 -0800

On Thu, Mar 02, 2017 at 01:46:10PM -0800, Brandon Williams wrote:

> There were a few of us discussing this sort of approach internally.  We
> also figured that, given some performance hit, you could maintain your
> repo in sha256 and do some translation to sha1 if you need to push or
> fetch to a server which has the the repo in a sha1 format.  This way you
> can convert your repo independently of the rest of the world.


Yeah, you definitely _can_ convert between the two. It's just expensive
to do so on the fly. We'd potentially want to be able to during the
transition period just to help people get all the work converted over.
But I had assumed that the conversion would be a mix of:

  1. Unpublished work (or work which is otherwise OK to be rewritten)
     could be converted to the new hash.

  2. Old history could be grafted with a parent pointer that mentions
     the tip of the old history by its new hash, but the pointed-to
     parent contains sha1s.

> As for storing the translation table, you should really only need to
> maintain the table until old clients are phased out and all of the repos
> of a project have experienced flag day and have been converted to
> sha256.

I think you've read more into my "conversion" than I intended. The old
history won't get rewritten. It will just be grafted onto the bottom of
the commit history you've got, and the new trees will all be written
with the new hash.

So you still have those old objects hanging around that refer to things
by their sha1 (not to mention bug trackers, commit messages, etc, which
all use commit ids). And you want to be able to quickly resolve those
references.

What _does_ get rewritten is what's in your ref files, your pack .idx,
etc. Those are all sha256 (or whatever), and work as sha1's do now.
Looking up a sha1 reference from an old object just goes through the
extra level of indirection.

-Peff

Re: SHA1 collisions found

Reply via email to