Jeff King <p...@peff.net> writes:
> A much bigger problem is the other places we reference sha1s. The
> obvious place is trees, which have no room for backup pointers (either
> in headers, or with a NUL trick).
This is a tangent (as I do not have anything particularly worth
adding on top of what has already been said around the exact
SHA-1 topic), but we probably would want to start thinking about
the tree object format "v2" at some point.
Some random thoughts:
- It is OK if existing versions of Git barfed when asked to read a
tree object in the "v2" format. The repository format version
may need to be bumped up when writing such an object, and
transfer protocols need to pay attention to it, to avoid
transferring history with objects in newer representation to
repositories with older repository format version.
- We do not need a new "tree v2" object type. Existing versions of
Git will barf upon seeing such an object, but that won't be the
only way to prevent existing versions of Git from misinterpreting
a tree object recorded in the "v2" format as if it were in the
current format (e.g. a non-octal in the mode field of the first
entry causes tree-walk.c::get_mode() to barf).
- We do not mind if two tree objects that encode the same tree in
the current and the enhanced formats have different object names.
In fact, we care more about the object names derived purely from
the content of the object as an uninterpreted bytestream, so it
is expected that they have different object names.
This will make path-limited traversal and diff open more
trees unnecessarily at the "version bump" boundary in the
history, but that is normal (think of a project that used to
record its text files with CRLF and one day decides to convert
everything to LF; the trees before and after the conversion
record logically the same contents and "git show" should show
nothing, but the diff machinery needs to go into contents at
the flag day boundary).
As long as we do not let random "extension of the day" into the
new format willy-nilly, the resulting history will still be
useful and usable. From that point of view, no part of the
additional information we would record in the updated format that
is not present in the current format should be optional (iow,
once you decide to use the "v2" format to record a certain tree,
you will produce an identical and reproducible representation in
"v2", regardless of your implementation).
All of the above are issues for Git 3.0 and beyond, though ;-).
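To make the second and third points above concrete, here is a rough
sketch in Python (not Git's actual C code). It walks a current-format
tree body, where each entry is an octal mode, a space, a name, a NUL
and a 20-byte raw hash, and it computes an object name as the SHA-1
of "<type> <size>" plus a NUL plus the body. The "v2" + NUL prefix
used in the demo is purely a made-up marker for illustration; nobody
has proposed that layout. The helper names parse_mode, parse_tree and
object_name are likewise hypothetical.

```python
import hashlib

HASH_LEN = 20  # length of a raw SHA-1 in bytes


def parse_mode(field: bytes) -> int:
    """Parse an octal mode field, rejecting any byte outside '0'..'7'."""
    if not field:
        raise ValueError("empty mode field")
    mode = 0
    for byte in field:
        if not ord("0") <= byte <= ord("7"):
            raise ValueError(f"non-octal byte {byte:#04x} in mode field")
        mode = (mode << 3) | (byte - ord("0"))
    return mode


def parse_tree(body: bytes):
    """Yield (mode, name, raw_hash) for each entry of a current-format tree."""
    pos = 0
    while pos < len(body):
        # bytes.index raises ValueError when the delimiter is missing.
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space + 1)
        mode = parse_mode(body[pos:space])
        name = body[space + 1:nul]
        raw = body[nul + 1:nul + 1 + HASH_LEN]
        if len(raw) != HASH_LEN:
            raise ValueError("truncated tree entry")
        yield mode, name, raw
        pos = nul + 1 + HASH_LEN


def object_name(obj_type: bytes, body: bytes) -> str:
    """Name an object by hashing its header and body as raw bytes."""
    header = obj_type + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    # The well-known name of the empty blob, used as the entry's target.
    blob = bytes.fromhex(object_name(b"blob", b""))
    current = b"100644 hello.txt\0" + blob
    # Hypothetical "v2" body: an invented marker in front of the same entry.
    enhanced = b"v2\0" + current

    print(list(parse_tree(current)))
    print(object_name(b"tree", current))
    print(object_name(b"tree", enhanced))  # differs from the line above
    try:
        list(parse_tree(enhanced))
    except ValueError as err:
        print("old-style reader refuses:", err)
```

The old-style reader stops at the first byte of the mode field, so no
new object type is needed for it to refuse the made-up "v2" body. The
two bodies also get different names even though they describe the
same tree, because the name depends only on the bytes.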