Re: [PATCH v4 4/6] rev-list: handle missing tree objects properly

2018-08-14 Thread Jonathan Tan
> > So we don't want to die in list-objects.c. If we
> > fail to fetch, then we will die on line 213 in rev-list.c.
> 
> Why don't we want to die in list-objects.c? When --missing=error is
> passed, fetch_if_missing retains its default value of 1, so
> parse_tree_gently() will attempt to fetch it - and if it fails, I think
> it's appropriate to die in list-objects.c (and this should be the
> current behavior). On other values, e.g. --missing=allow-any, there is
> no autofetch (since fetch_if_missing is 0), so it is correct not to die
> in list-objects.c.

After some in-office discussion, I should have checked line 213 in
builtin/rev-list.c more thorougly. Indeed it is OK not to die in
list-objects.c here, since builtin/rev-list.c already knows how to
handle missing objects in the --missing=error circumstance.


Re: [PATCH v4 4/6] rev-list: handle missing tree objects properly

2018-08-14 Thread Jonathan Tan
> > > @@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const 
> > > char *prefix)
> > >   init_revisions(, prefix);
> > >   revs.abbrev = DEFAULT_ABBREV;
> > >   revs.commit_format = CMIT_FMT_UNSPECIFIED;
> > > + revs.do_not_die_on_missing_tree = 1;
> >
> > Is this correct? I would have expected this to be set only if --missing
> > was set.
> If --missing is not set, then we want to fetch missing objects
> automatically, and then die if we fail to do that, which is what
> happens for blobs.

This is true, and should already be handled. Pay attention to when
fetch_if_missing is set in builtin/rev-list.c.

do_not_die_on_missing_tree should probably be set to 1 whenever
fetch_if_missing is set to 0, I think.

(I acknowledge that the usage of this global variable is confusing, but
I couldn't think of a better way to implement this when I did. Perhaps
when the object store refactoring is done, this can be a store-specific
setting instead of a global variable.)

> So we don't want to die in list-objects.c. If we
> fail to fetch, then we will die on line 213 in rev-list.c.

Why don't we want to die in list-objects.c? When --missing=error is
passed, fetch_if_missing retains its default value of 1, so
parse_tree_gently() will attempt to fetch it - and if it fails, I think
it's appropriate to die in list-objects.c (and this should be the
current behavior). On other values, e.g. --missing=allow-any, there is
no autofetch (since fetch_if_missing is 0), so it is correct not to die
in list-objects.c.

> > As for --missing=allow-promisor, I don't see them being tested anywhere
> > :-( so feel free to make a suggestion. I would put them in t6112 for
> > easy comparison with the other --missing tests.
> Kept my allow-promisor test in t0410 since it requires partial clone
> to be turned on in the config, and because it is pretty similar to
> --exclude-promisor-objects.

OK, sounds good to me.


Re: [PATCH v4 4/6] rev-list: handle missing tree objects properly

2018-08-14 Thread Matthew DeVore
On Tue, Aug 14, 2018 at 11:06 AM Jonathan Tan  wrote:
>
> > Previously, we assumed only blob objects could be missing. This patch
> > makes rev-list handle missing trees like missing blobs. A missing tree
> > will cause an error if --missing indicates an error should be caused,
> > and the hash is printed even if the tree is missing.
>
> The last sentence is difficult to understand - probably better to say
> that all --missing= arguments and --exclude-promisor-objects work for
> missing trees like they currently do for blobs (and do not fixate on
> just --missing=error). And also demonstrate this in tests, like in
> t6612.
Fixed the commit message. And for the tests, in t0410 I changed the
--missing=allow-any to --missing-allow-promisor, and in t6112 I added
--missing=allow-any and --missing=print test cases.

>
> > In list-objects.c we no longer print a message to stderr if a tree
> > object is missing (quiet_on_missing is always true). I couldn't find
> > any place where this would matter, or where the caller of
> > traverse_commit_list would need to be fixed to show the error. However,
> > in the future it would be trivial to make the caller show the message if
> > we needed to.
> >
> > This is not tested very thoroughly, since we cannot create promisor
> > objects in tests without using an actual partial clone. t0410 has a
> > promise_and_delete utility function, but the is_promisor_object function
> > does not return 1 for objects deleted in this way. More tests will will
> > come in a patch that implements a filter that can be used with git
> > clone.
>
> These two paragraphs are no longer applicable, I think.
Sorry about that. Removed.

>
> > @@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const 
> > char *prefix)
> >   init_revisions(, prefix);
> >   revs.abbrev = DEFAULT_ABBREV;
> >   revs.commit_format = CMIT_FMT_UNSPECIFIED;
> > + revs.do_not_die_on_missing_tree = 1;
>
> Is this correct? I would have expected this to be set only if --missing
> was set.
If --missing is not set, then we want to fetch missing objects
automatically, and then die if we fail to do that, which is what
happens for blobs. So we don't want to die in list-objects.c. If we
fail to fetch, then we will die on line 213 in rev-list.c.

>
> > - process_tree_contents(ctx, tree, base);
> > + /*
> > +  * NEEDSWORK: we should not have to process this tree's contents if 
> > the
> > +  * filter wants to exclude all its contents AND the filter doesn't 
> > need
> > +  * to collect the omitted OIDs. We should add a LOFR_SKIP_TREE bit 
> > which
> > +  * allows skipping all children.
> > +  */
> > + if (parsed)
> > + process_tree_contents(ctx, tree, base);
>
> I agree with Jeff Hostetler in [1] that a LOFR_SKIP_TREE bit is
> desirable, but I don't think that this patch is the right place to
> introduce this NEEDSWORK. For me, this patch is about skipping iterating
> over the contents of a tree because the tree does not exist; this
> NEEDSWORK is about skipping iterating over the contents of a tree
> because we don't want its contents, and it is quite confusing to
> conflate the two.
I've removed this.

>
> [1] 
> https://public-inbox.org/git/d751d56b-84bb-a03d-5f2a-7dbaf8d94...@jeffhostetler.com/
>
> > @@ -124,6 +124,7 @@ struct rev_info {
> >   first_parent_only:1,
> >   line_level_traverse:1,
> >   tree_blobs_in_commit_order:1,
> > + do_not_die_on_missing_tree:1,
>
> I know that the other flags don't have documentation, but I think it's
> worth documenting this one because it is rather complicated. I have
> provided a sample one in my earlier review - feel free to use that or
> come up with your own.
Added your wording to revision.h without major change.

>
> > diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> > index 4984ca583..74e3c5767 100755
> > --- a/t/t0410-partial-clone.sh
> > +++ b/t/t0410-partial-clone.sh
> > @@ -186,6 +186,72 @@ test_expect_success 'rev-list stops traversal at 
> > missing and promised commit' '
> >   ! grep $FOO out
> >  '
> >
> > +test_expect_success 'show missing tree objects with --missing=print' '
> > + rm -rf repo &&
> > + test_create_repo repo &&
> > + test_commit -C repo foo &&
> > + test_commit -C repo bar &&
> > + test_commit -C repo baz &&
> > +
> > + TREE=$(git -C repo rev-parse bar^{tree}) &&
> > +
> > + promise_and_delete $TREE &&
> > +
> > + git -C repo config core.repositoryformatversion 1 &&
> > + git -C repo config extensions.partialclone "arbitrary string" &&
> > + git -C repo rev-list --quiet --missing=print --objects HEAD 
> > >missing_objs 2>rev_list_err &&
> > + echo "?$TREE" >expected &&
> > + test_cmp expected missing_objs &&
> > +
> > + # do not complain when a missing tree cannot be parsed
> > + ! grep -q "Could not read " rev_list_err
> > +'
>
> I 

Re: [PATCH v4 4/6] rev-list: handle missing tree objects properly

2018-08-14 Thread Jonathan Tan
> Previously, we assumed only blob objects could be missing. This patch
> makes rev-list handle missing trees like missing blobs. A missing tree
> will cause an error if --missing indicates an error should be caused,
> and the hash is printed even if the tree is missing.

The last sentence is difficult to understand - probably better to say
that all --missing= arguments and --exclude-promisor-objects work for
missing trees like they currently do for blobs (and do not fixate on
just --missing=error). And also demonstrate this in tests, like in
t6612.

> In list-objects.c we no longer print a message to stderr if a tree
> object is missing (quiet_on_missing is always true). I couldn't find
> any place where this would matter, or where the caller of
> traverse_commit_list would need to be fixed to show the error. However,
> in the future it would be trivial to make the caller show the message if
> we needed to.
> 
> This is not tested very thoroughly, since we cannot create promisor
> objects in tests without using an actual partial clone. t0410 has a
> promise_and_delete utility function, but the is_promisor_object function
> does not return 1 for objects deleted in this way. More tests will will
> come in a patch that implements a filter that can be used with git
> clone.

These two paragraphs are no longer applicable, I think.

> @@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
> *prefix)
>   init_revisions(, prefix);
>   revs.abbrev = DEFAULT_ABBREV;
>   revs.commit_format = CMIT_FMT_UNSPECIFIED;
> + revs.do_not_die_on_missing_tree = 1;

Is this correct? I would have expected this to be set only if --missing
was set.

> - process_tree_contents(ctx, tree, base);
> + /*
> +  * NEEDSWORK: we should not have to process this tree's contents if the
> +  * filter wants to exclude all its contents AND the filter doesn't need
> +  * to collect the omitted OIDs. We should add a LOFR_SKIP_TREE bit which
> +  * allows skipping all children.
> +  */
> + if (parsed)
> + process_tree_contents(ctx, tree, base);

I agree with Jeff Hostetler in [1] that a LOFR_SKIP_TREE bit is
desirable, but I don't think that this patch is the right place to
introduce this NEEDSWORK. For me, this patch is about skipping iterating
over the contents of a tree because the tree does not exist; this
NEEDSWORK is about skipping iterating over the contents of a tree
because we don't want its contents, and it is quite confusing to
conflate the two.

[1] 
https://public-inbox.org/git/d751d56b-84bb-a03d-5f2a-7dbaf8d94...@jeffhostetler.com/

> @@ -124,6 +124,7 @@ struct rev_info {
>   first_parent_only:1,
>   line_level_traverse:1,
>   tree_blobs_in_commit_order:1,
> + do_not_die_on_missing_tree:1,

I know that the other flags don't have documentation, but I think it's
worth documenting this one because it is rather complicated. I have
provided a sample one in my earlier review - feel free to use that or
come up with your own.

> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 4984ca583..74e3c5767 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -186,6 +186,72 @@ test_expect_success 'rev-list stops traversal at missing 
> and promised commit' '
>   ! grep $FOO out
>  '
>  
> +test_expect_success 'show missing tree objects with --missing=print' '
> + rm -rf repo &&
> + test_create_repo repo &&
> + test_commit -C repo foo &&
> + test_commit -C repo bar &&
> + test_commit -C repo baz &&
> +
> + TREE=$(git -C repo rev-parse bar^{tree}) &&
> +
> + promise_and_delete $TREE &&
> +
> + git -C repo config core.repositoryformatversion 1 &&
> + git -C repo config extensions.partialclone "arbitrary string" &&
> + git -C repo rev-list --quiet --missing=print --objects HEAD 
> >missing_objs 2>rev_list_err &&
> + echo "?$TREE" >expected &&
> + test_cmp expected missing_objs &&
> +
> + # do not complain when a missing tree cannot be parsed
> + ! grep -q "Could not read " rev_list_err
> +'

I think that the --exclude-promisor-tests can go in t0410 as you have
done, but the --missing tests (except for --missing=allow-promisor)
should go in t6112. (And like the existing --missing tests, they should
be done without setting extensions.partialclone.)

As for --missing=allow-promisor, I don't see them being tested anywhere
:-( so feel free to make a suggestion. I would put them in t6112 for
easy comparison with the other --missing tests.


[PATCH v4 4/6] rev-list: handle missing tree objects properly

2018-08-14 Thread Matthew DeVore
Previously, we assumed only blob objects could be missing. This patch
makes rev-list handle missing trees like missing blobs. A missing tree
will cause an error if --missing indicates an error should be caused,
and the hash is printed even if the tree is missing.

In list-objects.c we no longer print a message to stderr if a tree
object is missing (quiet_on_missing is always true). I couldn't find
any place where this would matter, or where the caller of
traverse_commit_list would need to be fixed to show the error. However,
in the future it would be trivial to make the caller show the message if
we needed to.

This is not tested very thoroughly, since we cannot create promisor
objects in tests without using an actual partial clone. t0410 has a
promise_and_delete utility function, but the is_promisor_object function
does not return 1 for objects deleted in this way. More tests will will
come in a patch that implements a filter that can be used with git
clone.

Signed-off-by: Matthew DeVore 
---
 builtin/rev-list.c | 11 +++--
 list-objects.c | 17 +--
 revision.h |  1 +
 t/t0410-partial-clone.sh   | 66 ++
 t/t5317-pack-objects-filter-objects.sh | 13 +
 5 files changed, 101 insertions(+), 7 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 5b07f3f4a..49d6deed7 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -6,6 +6,7 @@
 #include "list-objects.h"
 #include "list-objects-filter.h"
 #include "list-objects-filter-options.h"
+#include "object.h"
 #include "object-store.h"
 #include "pack.h"
 #include "pack-bitmap.h"
@@ -209,7 +210,8 @@ static inline void finish_object__ma(struct object *obj)
 */
switch (arg_missing_action) {
case MA_ERROR:
-   die("missing blob object '%s'", oid_to_hex(>oid));
+   die("missing %s object '%s'",
+   type_name(obj->type), oid_to_hex(>oid));
return;
 
case MA_ALLOW_ANY:
@@ -222,8 +224,8 @@ static inline void finish_object__ma(struct object *obj)
case MA_ALLOW_PROMISOR:
if (is_promisor_object(>oid))
return;
-   die("unexpected missing blob object '%s'",
-   oid_to_hex(>oid));
+   die("unexpected missing %s object '%s'",
+   type_name(obj->type), oid_to_hex(>oid));
return;
 
default:
@@ -235,7 +237,7 @@ static inline void finish_object__ma(struct object *obj)
 static int finish_object(struct object *obj, const char *name, void *cb_data)
 {
struct rev_list_info *info = cb_data;
-   if (obj->type == OBJ_BLOB && !has_object_file(>oid)) {
+   if (!has_object_file(>oid)) {
finish_object__ma(obj);
return 1;
}
@@ -373,6 +375,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
init_revisions(, prefix);
revs.abbrev = DEFAULT_ABBREV;
revs.commit_format = CMIT_FMT_UNSPECIFIED;
+   revs.do_not_die_on_missing_tree = 1;
 
/*
 * Scan the argument list before invoking setup_revisions(), so that we
diff --git a/list-objects.c b/list-objects.c
index f9b51db7a..e88474a2d 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -143,6 +143,7 @@ static void process_tree(struct traversal_context *ctx,
struct rev_info *revs = ctx->revs;
int baselen = base->len;
enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
+   int parsed;
 
if (!revs->tree_objects)
return;
@@ -150,7 +151,9 @@ static void process_tree(struct traversal_context *ctx,
die("bad tree object");
if (obj->flags & (UNINTERESTING | SEEN))
return;
-   if (parse_tree_gently(tree, 1) < 0) {
+
+   parsed = parse_tree_gently(tree, 1) >= 0;
+   if (!parsed) {
if (revs->ignore_missing_links)
return;
 
@@ -163,7 +166,8 @@ static void process_tree(struct traversal_context *ctx,
is_promisor_object(>oid))
return;
 
-   die("bad tree object %s", oid_to_hex(>oid));
+   if (!revs->do_not_die_on_missing_tree)
+   die("bad tree object %s", oid_to_hex(>oid));
}
 
strbuf_addstr(base, name);
@@ -178,7 +182,14 @@ static void process_tree(struct traversal_context *ctx,
if (base->len)
strbuf_addch(base, '/');
 
-   process_tree_contents(ctx, tree, base);
+   /*
+* NEEDSWORK: we should not have to process this tree's contents if the
+* filter wants to exclude all its contents AND the filter doesn't need
+* to collect the omitted OIDs. We should add a LOFR_SKIP_TREE bit which
+* allows skipping all children.
+*/
+   if (parsed)
+