I recently came across a tree in the wild that had a submodule entry
whose sha1 was the null sha1 (i.e., all-zeros). It triggered an
interesting bug in the diff code, which is fixed by patch 1.
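
To illustrate the general class of bug (a minimal sketch only, not the
actual diff code; every name below is hypothetical): if the all-zeros
value also serves as a sentinel meaning "no object name known here",
then a tree entry that really records all-zeros cannot be told apart
from the sentinel. Tracking validity in a separate flag avoids the
ambiguity.

```python
from dataclasses import dataclass

NULL_SHA1 = "0" * 40  # hex form of the all-zeros sha1


def needs_worktree_sentinel(sha1: str) -> bool:
    """Sentinel style: the null sha1 doubles as 'no object name known'."""
    return sha1 == NULL_SHA1


@dataclass
class Side:
    """Explicit style: validity is tracked separately from the value."""
    sha1: str
    sha1_valid: bool  # hypothetical flag, named for illustration only


def needs_worktree_explicit(side: Side) -> bool:
    """Only the flag decides; the sha1 value itself is never inspected."""
    return not side.sha1_valid


# A tree entry that really does record the null sha1 (the odd case).
from_tree = Side(sha1=NULL_SHA1, sha1_valid=True)
# A path whose content has to be read from the working tree.
from_worktree = Side(sha1=NULL_SHA1, sha1_valid=False)

# The sentinel check gives the same answer for both cases.
sentinel_confused = (
    needs_worktree_sentinel(from_tree.sha1)
    == needs_worktree_sentinel(from_worktree.sha1)
)
```

With the sentinel check, both cases look like "read from the working
tree"; with the explicit flag, only from_worktree does.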

Unfortunately, I have no clue how this tree came about. I'm assuming it
was simply a bug somewhere in git, where the entry should have had
another sha1, or possibly been removed from the tree entirely. Patches
2 and 3 tighten up our checks for null sha1s in a few places, which
might help detect such a bug earlier.

I assume I have never seen this with a non-submodule entry because such
a tree would fail the usual connectivity checks during fsck or during a
transfer. However, since we don't enforce connectivity on submodule
entries, nothing blocked the creation and propagation of such an entry.

I'm not at liberty to share the repository in question, but if anybody
has specific things to look for, I'd be happy to investigate further.

The patches are:

  [1/3]: diff: do not use null sha1 as a sentinel value

This is the actual bug-fix, and I hope it is obviously a good thing to do.

  [2/3]: do not write null sha1s to on-disk index

This one tries to tighten our writing a bit. There are unfortunately a
lot of different code paths that create trees in git. I hope by catching
the index write as a choke-point, we can prevent bugs from spreading.
However, there are a lot of tree-writers that update an index in-core
and then write a tree out directly. I would not be surprised if this
does not catch the bug by itself, but I think it is a step in the right
direction.

  [3/3]: fsck: detect null sha1 in tree entries

And this one will at least let us notice the bug once it has happened.
And if transfer.fsckObjects is set, it will prevent bogus trees from
passing between repositories, which keeps any damage contained.
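
As a rough sketch of the kind of check involved (this is not the fsck
code itself, and the function names are made up for illustration), the
following walks the body of a raw tree object and reports entries whose
sha1 is all zeros. It assumes the usual tree entry layout of an octal
mode, a space, the name, a NUL byte, and then 20 raw sha1 bytes.

```python
NULL_SHA1 = bytes(20)  # 20 zero bytes: the binary null sha1


def parse_tree(body: bytes):
    """Yield (mode, name, sha1) for each entry in a raw tree body."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)   # end of the octal mode
        nul = body.index(b"\0", space)  # end of the entry name
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul]
        sha1 = body[nul + 1:nul + 21]
        if len(sha1) != 20:
            raise ValueError("truncated tree entry")
        yield mode, name, sha1
        pos = nul + 21


def null_sha1_entries(body: bytes):
    """Return (mode, name) for every entry that records the null sha1."""
    return [
        (mode, name)
        for mode, name, sha1 in parse_tree(body)
        if sha1 == NULL_SHA1
    ]


def make_entry(mode: bytes, name: bytes, sha1: bytes) -> bytes:
    """Build one raw tree entry (helper for the example data only)."""
    return mode + b" " + name + b"\0" + sha1


# One ordinary blob entry and one submodule (mode 160000) entry whose
# sha1 is null, mirroring the situation described in this series.
good = make_entry(b"100644", b"README", bytes(range(1, 21)))
bad = make_entry(b"160000", b"sub", NULL_SHA1)
found = null_sha1_entries(good + bad)
```

Here found holds only the submodule entry; a tree without null sha1
entries yields an empty list.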
