Re: [PATCH v2 3/5] rev-list: handle missing tree objects properly
> I don't understand about the ">= 0". What should I replace it with? > Maybe you mean the return is never positive so I can change: > > parse_tree_gently(tree, 1) >= 0 > > to: > !parse_tree_gently(tree, 1) > > ? Sorry for the lack of clarity - that is what I meant. > > The missing mechanism (for error, allow-any, print) should work without > > needing to consult whether an object is a promisor object or not - it > > should just print whatever is missing, so the "if > > (!is_promisor_object..." line looks out of place. > Done. I considered that a missing object which is not a promisor is a > serious error, so I had it die here. It is a serious error, but as far as I can tell, that is what the --missing flags are supposed to help diagnose (so we can't die since we need the diagnoses to be printed). See, for example, 'rev-list W/ --missing=print' in t6112 - the "r1" repository does not have partial clone enabled (which I verified by inserting a test_pause then cat-ting r1/.git/config), but nothing dies. > But now that I've added the > do_not_die_on_missing_tree flag, it's more natural to keep the > previous promisor check as-is. OK, I'll take a look once you send out v4. > Also, is_promisor_object is an > expensive check, and it would be better to skip it during the common > execution path (which should be when exclude_promisor_objects, an > internal-use-only flag, is *not* set, which means we never call > is_promisor_object. That's true. > > In my original review [1], I suggested that we always show a tree if we > > have its hash - if we don't have the object, we just recurse into it. > > This would be the same as your patch, except that the 'die("bad tree > > object...' is totally removed instead of merely moved. I still think > > this solution has some merit - all the tests still pass (except that we > > need to check for "unable to read" instead of "bad tree object" in error > > messages), but I just realized that it might still be backwards > > incompatible in that a basic "rev-list --objects" would now succeed > > instead of fail if a tree was missing (I haven't tested this though). > The presence of the die if !is_promisor_object is what justified the > changing of the parse_tree_gently to always be gently, since it is > what showed the OID. Can we really remove both? Maybe in a different > patch set, since I'm no longer touching that line? That's true - the idea of removing both needs more thought, and if we were to do so, we definitely can do it in a different patch set. > > We might need a flag called "do_not_die_on_missing_tree" (much like your > > original idea of "show_missing_trees") so that callers that are prepared > > to deal with missing trees can set this. Sorry for the churn. You can > > document it as such: > Added it, but not with a command-line flag, only in rev-info.h. We can > always add a flag later if people have been relying on the existing > behavior of git rev-list to balk at missing trees. (That seems > unlikely though, considering there is no filter to enable that before > this patchset). By flag, I indeed meant in rev-info.h - sorry for the confusion. That sounds good.
Re: [PATCH v2 3/5] rev-list: handle missing tree objects properly
Resending this in plain-text mode so that git@vger.kernel.org won't bounce it. Sorry for those of you receiving this twice. On Mon, Aug 13, 2018 at 11:20 AM Jonathan Tan wrote: > > > In list-objects.c we no longer print a message to stderr if a tree > > object is missing (quiet_on_missing is always true). I couldn't find > > any place where this would matter, or where the caller of > > traverse_commit_list would need to be fixed to show the error. However, > > in the future it would be trivial to make the caller show the message if > > we needed to. > > Indeed, and I'm not sure why the message was there in the first place - > if parsing fails when revs->ignore_missing_links and > revs->exclude_promisor_objects are both false, we print the OID anyway > in the "die" call, so any message printed by parse_tree_gently() seems > superfluous. > > It might be better to add an additional commit that removes the "gently" > condition (in other words, always parsing gently), with a message > explaining the above. Also, in that commit, I prefer not to add the > "/*quiet_on_missing*/" explanation (we don't seem to do that in Git > code); I also know that the ">= 0" is a holdover from the existing "< 0" > code, but we don't need to do that either. Good idea. I've added a new commit which replaces the calculation with a hard-coded "1" I don't understand about the ">= 0". What should I replace it with? Maybe you mean the return is never positive so I can change: parse_tree_gently(tree, 1) >= 0 to: !parse_tree_gently(tree, 1) ? > > > This is not tested very thoroughly, since we cannot create promisor > > objects in tests without using an actual partial clone. t0410 has a > > promise_and_delete utility function, but the is_promisor_object function > > does not return 1 for objects deleted in this way. More tests will will > > come in a patch that implements a filter that can be used with git > > clone. > > is_promisor_object() should. If you still have the code you used to > verify that, can you share it? In particular, pay attention to the path > of the repo - promise_and_delete is hardcoded to use one particular > path. It turns out I wasn't setting the extensions.partial_clone config in my test, and that's why everything wasn't working. So I've moved all the tests feasible back to the earlier commit. Cool :) > > Whether you test in this patch or in the last patch, make sure that the > following are tested: > git rev-list --missing=error, allow-any, allow-promisor, print > git rev-list --exclude-promisor-objects > Added --missing=print, --missing=allow-any, and --exclude-promisor-objects to t0410 --missing=allow-promisor did some seem sufficiently interesting or different from allow-any to justify adding it. I had to put missing=error into the commit that introduces the tree:0 filter, since that flag causes an automatic attempt to fetch the missing object, which t0410 does not seem to support. So added test case "auto-fetching of trees with --missing=error" to t5616. > Also, test when a tree pointed to by a commit is missing, and when a > tree pointed to by a tree is missing. Former is done multiple times already, added latter to t0410 as "missing non-root tree object and rev-list." > > > @@ -152,20 +151,21 @@ static void process_tree(struct traversal_context > > *ctx, > > die("bad tree object"); > > if (obj->flags & (UNINTERESTING | SEEN)) > > return; > > - if (parse_tree_gently(tree, gently) < 0) { > > + parsed = parse_tree_gently(tree, /*quiet_on_missing=*/1) >= 0; > > + if (!parsed) { > > if (revs->ignore_missing_links) > > return; > > > > + if (!is_promisor_object(>oid)) > > + die("bad tree object %s", oid_to_hex(>oid)); > > + > > /* > >* Pre-filter known-missing tree objects when explicitly > >* requested. This may cause the actual filter to report > >* an incomplete list of missing objects. > >*/ > > - if (revs->exclude_promisor_objects && > > - is_promisor_object(>oid)) > > + if (revs->exclude_promisor_objects) > > return; > > - > > - die("bad tree object %s", oid_to_hex(>oid)); > > } > > The missing mechanism (for error, allow-any, print) should work without > needing to consult whether an object is a promisor object or not - it > should just print whatever is missing, so the "if > (!is_promisor_object..." line looks out of place. Done. I considered that a missing object which is not a promisor is a serious error, so I had it die here. But now that I've added the do_not_die_on_missing_tree flag, it's more natural to keep the previous promisor check as-is. Also, is_promisor_object is an expensive check, and it would be better to skip it during the common execution path (which should be when exclude_promisor_objects, an
Re: [PATCH v2 3/5] rev-list: handle missing tree objects properly
> In list-objects.c we no longer print a message to stderr if a tree > object is missing (quiet_on_missing is always true). I couldn't find > any place where this would matter, or where the caller of > traverse_commit_list would need to be fixed to show the error. However, > in the future it would be trivial to make the caller show the message if > we needed to. Indeed, and I'm not sure why the message was there in the first place - if parsing fails when revs->ignore_missing_links and revs->exclude_promisor_objects are both false, we print the OID anyway in the "die" call, so any message printed by parse_tree_gently() seems superfluous. It might be better to add an additional commit that removes the "gently" condition (in other words, always parsing gently), with a message explaining the above. Also, in that commit, I prefer not to add the "/*quiet_on_missing*/" explanation (we don't seem to do that in Git code); I also know that the ">= 0" is a holdover from the existing "< 0" code, but we don't need to do that either. > This is not tested very thoroughly, since we cannot create promisor > objects in tests without using an actual partial clone. t0410 has a > promise_and_delete utility function, but the is_promisor_object function > does not return 1 for objects deleted in this way. More tests will will > come in a patch that implements a filter that can be used with git > clone. is_promisor_object() should. If you still have the code you used to verify that, can you share it? In particular, pay attention to the path of the repo - promise_and_delete is hardcoded to use one particular path. Whether you test in this patch or in the last patch, make sure that the following are tested: git rev-list --missing=error, allow-any, allow-promisor, print git rev-list --exclude-promisor-objects Also, test when a tree pointed to by a commit is missing, and when a tree pointed to by a tree is missing. > @@ -152,20 +151,21 @@ static void process_tree(struct traversal_context *ctx, > die("bad tree object"); > if (obj->flags & (UNINTERESTING | SEEN)) > return; > - if (parse_tree_gently(tree, gently) < 0) { > + parsed = parse_tree_gently(tree, /*quiet_on_missing=*/1) >= 0; > + if (!parsed) { > if (revs->ignore_missing_links) > return; > > + if (!is_promisor_object(>oid)) > + die("bad tree object %s", oid_to_hex(>oid)); > + > /* >* Pre-filter known-missing tree objects when explicitly >* requested. This may cause the actual filter to report >* an incomplete list of missing objects. >*/ > - if (revs->exclude_promisor_objects && > - is_promisor_object(>oid)) > + if (revs->exclude_promisor_objects) > return; > - > - die("bad tree object %s", oid_to_hex(>oid)); > } The missing mechanism (for error, allow-any, print) should work without needing to consult whether an object is a promisor object or not - it should just print whatever is missing, so the "if (!is_promisor_object..." line looks out of place. In my original review [1], I suggested that we always show a tree if we have its hash - if we don't have the object, we just recurse into it. This would be the same as your patch, except that the 'die("bad tree object...' is totally removed instead of merely moved. I still think this solution has some merit - all the tests still pass (except that we need to check for "unable to read" instead of "bad tree object" in error messages), but I just realized that it might still be backwards incompatible in that a basic "rev-list --objects" would now succeed instead of fail if a tree was missing (I haven't tested this though). We might need a flag called "do_not_die_on_missing_tree" (much like your original idea of "show_missing_trees") so that callers that are prepared to deal with missing trees can set this. Sorry for the churn. You can document it as such: Blobs are shown without regard for their existence. But not so for trees: unless exclude_promisor_objects is set and the tree in question is a promisor object, or ignore_missing_links is set (and in this case, the tree in question may or may not be a promisor object), the revision walker dies with a "bad tree object" message when encountering a missing tree. For callers that can handle missing trees and want them to be filterable and showable, set this to true. The revision walker will filter and show such a missing tree as usual, but will not attempt to recurse into this tree object. [1] https://public-inbox.org/git/20180810002411.13447-1-jonathanta...@google.com/
[PATCH v2 3/5] rev-list: handle missing tree objects properly
Previously, we assumed only blob objects could be missing. This patch makes rev-list handle missing trees like missing blobs. A missing tree will cause an error if --missing indicates an error should be caused, and the hash is printed even if the tree is missing. In list-objects.c we no longer print a message to stderr if a tree object is missing (quiet_on_missing is always true). I couldn't find any place where this would matter, or where the caller of traverse_commit_list would need to be fixed to show the error. However, in the future it would be trivial to make the caller show the message if we needed to. This is not tested very thoroughly, since we cannot create promisor objects in tests without using an actual partial clone. t0410 has a promise_and_delete utility function, but the is_promisor_object function does not return 1 for objects deleted in this way. More tests will will come in a patch that implements a filter that can be used with git clone. Signed-off-by: Matthew DeVore --- builtin/rev-list.c | 10 ++ list-objects.c | 17 + t/t5317-pack-objects-filter-objects.sh | 13 + 3 files changed, 28 insertions(+), 12 deletions(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 5b07f3f4a..ea0daf0c4 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -6,6 +6,7 @@ #include "list-objects.h" #include "list-objects-filter.h" #include "list-objects-filter-options.h" +#include "object.h" #include "object-store.h" #include "pack.h" #include "pack-bitmap.h" @@ -209,7 +210,8 @@ static inline void finish_object__ma(struct object *obj) */ switch (arg_missing_action) { case MA_ERROR: - die("missing blob object '%s'", oid_to_hex(>oid)); + die("missing %s object '%s'", + type_name(obj->type), oid_to_hex(>oid)); return; case MA_ALLOW_ANY: @@ -222,8 +224,8 @@ static inline void finish_object__ma(struct object *obj) case MA_ALLOW_PROMISOR: if (is_promisor_object(>oid)) return; - die("unexpected missing blob object '%s'", - oid_to_hex(>oid)); + die("unexpected missing %s object '%s'", + type_name(obj->type), oid_to_hex(>oid)); return; default: @@ -235,7 +237,7 @@ static inline void finish_object__ma(struct object *obj) static int finish_object(struct object *obj, const char *name, void *cb_data) { struct rev_list_info *info = cb_data; - if (obj->type == OBJ_BLOB && !has_object_file(>oid)) { + if (!has_object_file(>oid)) { finish_object__ma(obj); return 1; } diff --git a/list-objects.c b/list-objects.c index ccc529e5e..aedcd0228 100644 --- a/list-objects.c +++ b/list-objects.c @@ -143,8 +143,7 @@ static void process_tree(struct traversal_context *ctx, struct rev_info *revs = ctx->revs; int baselen = base->len; enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW; - int gently = revs->ignore_missing_links || -revs->exclude_promisor_objects; + int parsed; if (!revs->tree_objects) return; @@ -152,20 +151,21 @@ static void process_tree(struct traversal_context *ctx, die("bad tree object"); if (obj->flags & (UNINTERESTING | SEEN)) return; - if (parse_tree_gently(tree, gently) < 0) { + parsed = parse_tree_gently(tree, /*quiet_on_missing=*/1) >= 0; + if (!parsed) { if (revs->ignore_missing_links) return; + if (!is_promisor_object(>oid)) + die("bad tree object %s", oid_to_hex(>oid)); + /* * Pre-filter known-missing tree objects when explicitly * requested. This may cause the actual filter to report * an incomplete list of missing objects. */ - if (revs->exclude_promisor_objects && - is_promisor_object(>oid)) + if (revs->exclude_promisor_objects) return; - - die("bad tree object %s", oid_to_hex(>oid)); } strbuf_addstr(base, name); @@ -180,7 +180,8 @@ static void process_tree(struct traversal_context *ctx, if (base->len) strbuf_addch(base, '/'); - process_tree_contents(ctx, tree, base); + if (parsed) + process_tree_contents(ctx, tree, base); if (!(obj->flags & USER_GIVEN) && ctx->filter_fn) { r = ctx->filter_fn(LOFS_END_TREE, obj, diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index 6710c8bc8..5e35f33bf 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++