Re: [PATCH v2] index-format.txt: be more liberal on what can represent invalid cache tree
Junio C Hamano gits...@pobox.com writes: Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: How would that work with existing versions? If you write -2 in cache-tree, the next time 1.8.0 updates cache tree it writes -1 back. That loses whatever information you attach to -2. A new cache-tree extension is probably better. You can easily imagine a definition like this: ... As we clarified that we do not allow implementations to write anything but -1 for invalidated entries until we decide what we will use other values for, the whole log message needs to be updated, I think. -- 8 -- From: Nguyễn Thái Ngọc Duy pclo...@gmail.com Date: Thu, 13 Dec 2012 08:14:47 +0700 Subject: [PATCH] index-format.txt: clarify what is invalid A cache-tree entry with a negative entry count is considered invalid in the current Git; it records that we do not know the object name of a tree that would result by writing the directory covered by the cache-tree as a tree object. Clarify that any entry with a negative entry count is invalid, but the implementations must write -1 there. This way, we can later decide to allow writers to use negative values other than -1 to encode optional information without harming interoperability; we do not know what is encoded how, so keep these other negative values as reserved for now. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com Signed-off-by: Junio C Hamano gits...@pobox.com --- Documentation/technical/index-format.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index 9d25b30..ce28a7a 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -161,8 +161,9 @@ GIT index format this span of index as a tree. An entry can be in an invalidated state and is represented by having - -1 in the entry_count field. In this case, there is no object name - and the next entry starts immediately after the newline. + a negative number in the entry_count field. In this case, there is no + object name and the next entry starts immediately after the newline. + When writing an invalid entry, -1 should always be used as entry_count. The entries are written out in the top-down, depth-first order. The first entry represents the root level of the repository, followed by the -- 1.8.1.rc1.141.g0ffea5d -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] index-format.txt: be more liberal on what can represent invalid cache tree
We have been writing -1 as invalid since day 1. On that same day we accept all negative entry counts as invalid. So in theory all C Git versions out there would be happy to accept any negative numbers. JGit seems to do exactly the same. Correct the document to reflect the fact that -1 is not the only magic number. At least one implementation, libgit2, is found to treat -1 this way. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- On Thu, Dec 13, 2012 at 1:14 AM, Junio C Hamano gits...@pobox.com wrote: Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: We have been writing -1 as invalid since day 1. On that same day we accept all negative entry counts as invalid. So in theory all C Git versions out there would be happy to accept any negative numbers. JGit seems to do exactly the same. I am of two minds here. The existing code is being more lenient than specified when they read stuff others wrote, but it still adheres to -1 when writing. Allowing random implementations to write random negative values will close the door for us to later update the specification to encode more informatin about these invalid entries by using negative value other than -1 here. How would that work with existing versions? If you write -2 in cache-tree, the next time 1.8.0 updates cache tree it writes -1 back. That loses whatever information you attach to -2. A new cache-tree extension is probably better. I am OK with a reword to say negative means invalid, and writers should write -1 for invalid entries, but without the latter half, this change is not justified. Done. Documentation/technical/index-format.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index 9d25b30..ce28a7a 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -161,8 +161,9 @@ GIT index format this span of index as a tree. An entry can be in an invalidated state and is represented by having - -1 in the entry_count field. In this case, there is no object name - and the next entry starts immediately after the newline. + a negative number in the entry_count field. In this case, there is no + object name and the next entry starts immediately after the newline. + When writing an invalid entry, -1 should always be used as entry_count. The entries are written out in the top-down, depth-first order. The first entry represents the root level of the repository, followed by the -- 1.8.0.rc2.23.g1fb49df -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] index-format.txt: be more liberal on what can represent invalid cache tree
Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: How would that work with existing versions? If you write -2 in cache-tree, the next time 1.8.0 updates cache tree it writes -1 back. That loses whatever information you attach to -2. A new cache-tree extension is probably better. You can easily imagine a definition like this: - If non-negative, the entry is valid and it is the number of index entries that are covered by this subtree; - If -1, the cached-tree does not know the object name of the tree object, and nothing else is known. The caller needs to walk the index to skip the entries that could have been covered by this subtree, if the cached tree information were valid; - If less than -1, the cached-tree does not know the object name of the tree object, but we know the number of index entries that are covered by this subtree. The caller, instead of walking the index, can subtract the count from -1 and skip that many entries to find the index entry after the part that is inside this directory. Newer Git may write -201 and newer Git may be able to take advantage of the new information encoded there, saving 200 calls to prefixcmp(), while given the same index, older Git will operate just as correctly as before. An older Git may write that part of the cache-tree back with -1, defeating the optimization when a newer Git reads it back, but by definition the older Git does not know what to write to help that optimization that did not exist when it was done, and writing -1 will not confuse the newer Git. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html