Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-25 Thread René Scharfe
Am 24.12.2017 um 15:22 schrieb Jeff King:
> The single-traversal thing I suspect doesn't matter much in practice. In
> both cases if we would visit commit X twice, we'd immediately see on the
> second visit that it has already been cleared and not do anymore work.

Good point.  That makes clear_commit_marks_many() less useful than
advertised in e895cb5135, though.

>Side note: Another question is whether it would simply be faster to
>clear the flags for _all_ objects that we've touched in the current
>process (we have clear_object_flags() for this already). Then we know
>that we touch each one once, and we as a bonus we don't even have to
>keep the previous tips. The downsides are:
> 
>  - if another traversal in the process looked at many objects, but
>our current traversal looked at few, then we would examine more
>objects than we need to (albeit with lower cost per object)
> 
>  - it's possible there's another traversal in the same process whose
>flags we would want to have saved. I suspect such a setup is
>broken already, though, unless there's a guarantee that the two
>traversals don't overlap.

I thought about that nuclear option as well.  It might be a good idea
for code in cmd_* and similar leaf functions for cleaning up between
unrelated stages (e.g. between parts that had been separate external
git command calls before).  They probably only load potentially
interesting objects into memory and don't need to bother much about
interactions with other functions.

But clear_object_flags() makes me nervous because it clears the flags
of all kinds of objects, not just for commits, and I can't easily
convince myself that this is safe.  Adding a version that checks the
object type would be an easy way out.

René


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-24 Thread Jeff King
On Thu, Dec 21, 2017 at 07:41:44PM +0100, René Scharfe wrote:

> Am 20.12.2017 um 14:08 schrieb Jeff King:
> > On Tue, Dec 19, 2017 at 10:33:55AM -0800, Junio C Hamano wrote:
> > 
> >> Should we take the patch posted as-is and move forward?
> > 
> > I suppose so.  I don't really have anything against the patch. My main
> > complaint was just that I don't think it's actually cleaning up the
> > gross parts of the interface. It's just substituting one gross thing (a
> > funny struct flag) for another (a special version of the prepare
> > function that takes a funny out-parameter).
> 
> If this is a fair description (and I have to admit that it is) then I
> don't understand why you aren't against the patch.  Let's drop it.

Heh, I almost wrote more on this. My thinking was two-fold:

  - while I think the fundamental gross-ness is still there after your
patch, it does smooth some of the rough edges. So it's a slight
improvement over the status quo.

  - I read an article a while back (which unfortunately I can't find
now) about the idea of "default to yes" in open source. I.e., the
idea that if somebody bothered to cook up a patch and there is no
good reason to reject it, you should take it.

Certainly there are cases where that can go too far: obvious ones
like bad ideas where it is not really "default" anymore, but also
subtle ones where the changes are code churn whose fallouts will
be dealt with by others. But this patch is a good example. I'm not
convinced it makes things better, but I don't think it makes things
worse. If you, who looked a lot closer at the problem than I did,
still think it's a good idea after our discussion, it's probably
worth applying.

So my comments weren't really "this is a bad idea" as much as "is there
a better idea". We didn't come up with one after some discussion, and
I'm willing to let it go there for now.

But if you want to keep thinking on it, I'm game. ;)

> Can we do better?  How about something this?  It reverts bundle to
> duplicate the object_array, but creates just a commit_list in the other
> two cases.  As a side-effect this allows clearing flags for all entries
> with a single traversal.

I think this is basically the "copy it before-hand" thing from earlier
in the thread. Switching to just keeping commits seems like a nice
change. It's an easy (if minor) optimization, and it makes clear exactly
which part of the data we need to keep around.

The single-traversal thing I suspect doesn't matter much in practice. In
both cases if we would visit commit X twice, we'd immediately see on the
second visit that it has already been cleared and not do anymore work.

  Side note: Another question is whether it would simply be faster to
  clear the flags for _all_ objects that we've touched in the current
  process (we have clear_object_flags() for this already). Then we know
  that we touch each one once, and we as a bonus we don't even have to
  keep the previous tips. The downsides are:

- if another traversal in the process looked at many objects, but
  our current traversal looked at few, then we would examine more
  objects than we need to (albeit with lower cost per object)

- it's possible there's another traversal in the same process whose
  flags we would want to have saved. I suspect such a setup is
  broken already, though, unless there's a guarantee that the two
  traversals don't overlap.

So anyway, I think this is a reasonable approach, unless we're really
worried about the extra O(# of pending) allocation.

-Peff


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-21 Thread René Scharfe
Am 20.12.2017 um 14:08 schrieb Jeff King:
> On Tue, Dec 19, 2017 at 10:33:55AM -0800, Junio C Hamano wrote:
> 
>> Should we take the patch posted as-is and move forward?
> 
> I suppose so.  I don't really have anything against the patch. My main
> complaint was just that I don't think it's actually cleaning up the
> gross parts of the interface. It's just substituting one gross thing (a
> funny struct flag) for another (a special version of the prepare
> function that takes a funny out-parameter).

If this is a fair description (and I have to admit that it is) then I
don't understand why you aren't against the patch.  Let's drop it.

Can we do better?  How about something this?  It reverts bundle to
duplicate the object_array, but creates just a commit_list in the other
two cases.  As a side-effect this allows clearing flags for all entries
with a single traversal.

---
 bisect.c   | 18 +++---
 builtin/checkout.c | 13 +++--
 bundle.c   | 12 +---
 commit.c   | 18 --
 commit.h   |  3 +++
 revision.c |  3 +--
 revision.h | 12 
 7 files changed, 35 insertions(+), 44 deletions(-)

diff --git a/bisect.c b/bisect.c
index 0fca17c02b..c966220738 100644
--- a/bisect.c
+++ b/bisect.c
@@ -819,27 +819,23 @@ static void check_merge_bases(int no_checkout)
 static int check_ancestors(const char *prefix)
 {
struct rev_info revs;
-   struct object_array pending_copy;
+   struct commit_list *to_clear = NULL;
int res;
 
bisect_rev_setup(, prefix, "^%s", "%s", 0);
 
-   /* Save pending objects, so they can be cleaned up later. */
-   pending_copy = revs.pending;
-   revs.leak_pending = 1;
-
/*
-* bisect_common calls prepare_revision_walk right away, which
-* (together with .leak_pending = 1) makes us the sole owner of
-* the list of pending objects.
+* bisect_common() calls prepare_revision_walk() right away,
+* which (among other things) clears revs.pending.  Save the
+* pending commits so that we can up their marks later.
 */
+   commit_list_insert_from_object_array(, _clear);
+
bisect_common();
res = (revs.commits != NULL);
 
/* Clean up objects used, as they will be reused. */
-   clear_commit_marks_for_object_array(_copy, ALL_REV_FLAGS);
-
-   object_array_clear(_copy);
+   clear_and_free_commit_list(to_clear, ALL_REV_FLAGS);
 
return res;
 }
diff --git a/builtin/checkout.c b/builtin/checkout.c
index f9f3797e11..0a694ae14a 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -791,7 +791,7 @@ static void orphaned_commit_warning(struct commit *old, 
struct commit *new)
 {
struct rev_info revs;
struct object *object = >object;
-   struct object_array refs;
+   struct commit_list *to_clear = NULL;
 
init_revisions(, NULL);
setup_revisions(0, NULL, , NULL);
@@ -803,13 +803,8 @@ static void orphaned_commit_warning(struct commit *old, 
struct commit *new)
add_pending_oid(, "HEAD", >object.oid, UNINTERESTING);
 
/* Save pending objects, so they can be cleaned up later. */
-   refs = revs.pending;
-   revs.leak_pending = 1;
+   commit_list_insert_from_object_array(, _clear);
 
-   /*
-* prepare_revision_walk (together with .leak_pending = 1) makes us
-* the sole owner of the list of pending objects.
-*/
if (prepare_revision_walk())
die(_("internal error in revision walk"));
if (!(old->object.flags & UNINTERESTING))
@@ -818,9 +813,7 @@ static void orphaned_commit_warning(struct commit *old, 
struct commit *new)
describe_detached_head(_("Previous HEAD position was"), old);
 
/* Clean up objects used, as they will be reused. */
-   clear_commit_marks_for_object_array(, ALL_REV_FLAGS);
-
-   object_array_clear();
+   clear_and_free_commit_list(to_clear, ALL_REV_FLAGS);
 }
 
 static int switch_branches(const struct checkout_opts *opts,
diff --git a/bundle.c b/bundle.c
index 93290962c9..6916296834 100644
--- a/bundle.c
+++ b/bundle.c
@@ -134,7 +134,7 @@ int verify_bundle(struct bundle_header *header, int verbose)
struct ref_list *p = >prerequisites;
struct rev_info revs;
const char *argv[] = {NULL, "--all", NULL};
-   struct object_array refs;
+   struct object_array refs = OBJECT_ARRAY_INIT;
struct commit *commit;
int i, ret = 0, req_nr;
const char *message = _("Repository lacks these prerequisite commits:");
@@ -158,13 +158,11 @@ int verify_bundle(struct bundle_header *header, int 
verbose)
setup_revisions(2, argv, , NULL);
 
/* Save pending objects, so they can be cleaned up later. */
-   refs = revs.pending;
-   revs.leak_pending = 1;
+   for (i = 0; i < revs.pending.nr; i++) {
+   struct object_array_entry *e = 

Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-20 Thread Jeff King
On Tue, Dec 19, 2017 at 10:33:55AM -0800, Junio C Hamano wrote:

> >> Why does prepare_revision_walk() clear the list of pending objects at
> >> all?  Assuming the list is append-only then perhaps remembering the
> >> last handled index would suffice.
> 
> For that matter, why does it copy, instead of using revs->pending
> directly?  I do not think I can answer this, as I think the design
> decisions led to this code predates me.

Me too, then. :) I think part of that copy is that we're moving the
items over to revs->commits, and dropping any non-commit objects.

> In any case, it seems that the discussion has veered into an
> interesting tangent but at this point it seems to me that it is not
> likely to produce an immediate improvement over the posted patch.

Fair enough.

> Should we take the patch posted as-is and move forward?

I suppose so.  I don't really have anything against the patch. My main
complaint was just that I don't think it's actually cleaning up the
gross parts of the interface. It's just substituting one gross thing (a
funny struct flag) for another (a special version of the prepare
function that takes a funny out-parameter).

-Peff


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-19 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Dec 18, 2017 at 08:18:19PM +0100, René Scharfe wrote:
>
>> > The root of the matter is that the revision-walking code doesn't clean
>> > up after itself. In every case, the caller is just saving these to clean
>> > up commit marks, isn't it?
>> 
>> bundle also checks if the pending objects exists.
>
> Thanks, I missed that one. So just adding a feature to clean up commit
> marks wouldn't be sufficient to cover that case.

OK.

>> > That sidesteps all of the memory ownership issues by just creating a
>> > copy. That's less efficient, but I'd be surprised if it matters in
>> > practice (we tend to do one or two revisions per process, there don't
>> > tend to be a lot of pending tips, and we're really just talking about
>> > copying some pointers here).
>> [...]
>> I don't know if there can be real-world use cases with millions of
>> entries (when it would start to hurt).
>
> I've seen repos which have tens of thousands of tags. Something like
> "rev-list --all" would have tens of thousands of pending objects.
> I think in practice it's limited to the number of objects (though in
> practice more like the number of commits).
> ...

OK again; that is an argument against "let's copy the array".

>> Why does prepare_revision_walk() clear the list of pending objects at
>> all?  Assuming the list is append-only then perhaps remembering the
>> last handled index would suffice.

For that matter, why does it copy, instead of using revs->pending
directly?  I do not think I can answer this, as I think the design
decisions led to this code predates me.

In any case, it seems that the discussion has veered into an
interesting tangent but at this point it seems to me that it is not
likely to produce an immediate improvement over the posted patch.

Should we take the patch posted as-is and move forward?


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-19 Thread Jeff King
On Mon, Dec 18, 2017 at 08:18:19PM +0100, René Scharfe wrote:

> > The root of the matter is that the revision-walking code doesn't clean
> > up after itself. In every case, the caller is just saving these to clean
> > up commit marks, isn't it?
> 
> bundle also checks if the pending objects exists.

Thanks, I missed that one. So just adding a feature to clean up commit
marks wouldn't be sufficient to cover that case.

> > That sidesteps all of the memory ownership issues by just creating a
> > copy. That's less efficient, but I'd be surprised if it matters in
> > practice (we tend to do one or two revisions per process, there don't
> > tend to be a lot of pending tips, and we're really just talking about
> > copying some pointers here).
> [...]
> I don't know if there can be real-world use cases with millions of
> entries (when it would start to hurt).

I've seen repos which have tens of thousands of tags. Something like
"rev-list --all" would have tens of thousands of pending objects.
I think in practice it's limited to the number of objects (though in
practice more like the number of commits).

I'd note also that for most uses we don't need a full object_array. You
really just need a pointer to the "struct object" to wipe its flags.

So there we might waste 8 bytes per object in the worst case. But bear
in mind that the process is wasting a lot more than that per "struct
commit" that we're holding. And versus the existing scheme, it's only
for the moment until prepare_revision_walk() frees the old pending list.

> Why does prepare_revision_walk() clear the list of pending objects at
> all?  Assuming the list is append-only then perhaps remembering the
> last handled index would suffice.

I assume it was mostly to clean up after itself, since there's no
explicit "I'm done with the traversal" function. But as I said earlier,
I'd be surprised of a revision walk doesn't leave some allocated cruft
in rev_info these days (e.g., pathspec cruft). In practice it doesn't
matter much because we don't do arbitrary numbers of traversals in
single process.

-Peff


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-18 Thread René Scharfe
Am 18.12.2017 um 16:10 schrieb Jeff King:
> On Sat, Dec 16, 2017 at 01:12:16PM +0100, René Scharfe wrote:
> 
>> prepare_revision_walk() allows callers to take ownership of the array of
>> pending objects by setting the rev_info flag "leak_pending" and copying
>> the object_array "pending".  They use it to clear commit marks after
>> setup is done.  This interface is brittle enough that it requires
>> extensive comments.
>>
>> Provide an easier way by adding a function that can hand over the array
>> to a caller-supplied output parameter and converting all users of the
>> flag "leak_pending" to call prepare_revision_walk_extended() instead.
> 
> I think this is _better_, but it's still kind of a funny interface.
> 
> The root of the matter is that the revision-walking code doesn't clean
> up after itself. In every case, the caller is just saving these to clean
> up commit marks, isn't it?

bundle also checks if the pending objects exists.

> Could we instead have an interface like:
> 
>revs.clear_commit_marks = 1;
>prepare_revision_walk();
>...
>finish_revision_walk();
> 
> where that final function would do any cleanup, including clearing the
> commit marks. I suspect there are other small bits that get leaked
> because there's not really any "destructor" for a revision walk.
> 
> It's not as flexible as this whole "make a copy of the pending tips"
> thing, but it keeps all of the details abstracted away from the callers.
> 
> Alternatively:
> 
>> +`prepare_revision_walk_extended`::
>> +
>> +Like prepare_revision_walk(), but allows callers to take ownership
>> +of the array of pending objects by passing an object_array pointer
>> +as the second parameter; passing NULL clears the array.
> 
> What if we just got rid of this function and had callers do:
> 
>object_array_copy(_pending, );
>prepare_revision_walk();
>...
>clear_commit_marks_for_object_array(_pending);
> 
> That sidesteps all of the memory ownership issues by just creating a
> copy. That's less efficient, but I'd be surprised if it matters in
> practice (we tend to do one or two revisions per process, there don't
> tend to be a lot of pending tips, and we're really just talking about
> copying some pointers here).

This was done before I added the leak_pending flag.

t5502 and t5571 have test cases with ca. 1000 pending objects, t5551
and t5541 with ca. 2000, p5310 more than 8000.  That's just a few KB.

I don't know if there can be real-world use cases with millions of
entries (when it would start to hurt).

Why does prepare_revision_walk() clear the list of pending objects at
all?  Assuming the list is append-only then perhaps remembering the
last handled index would suffice.

René


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-18 Thread Jeff King
On Sat, Dec 16, 2017 at 01:12:16PM +0100, René Scharfe wrote:

> prepare_revision_walk() allows callers to take ownership of the array of
> pending objects by setting the rev_info flag "leak_pending" and copying
> the object_array "pending".  They use it to clear commit marks after
> setup is done.  This interface is brittle enough that it requires
> extensive comments.
> 
> Provide an easier way by adding a function that can hand over the array
> to a caller-supplied output parameter and converting all users of the
> flag "leak_pending" to call prepare_revision_walk_extended() instead.

I think this is _better_, but it's still kind of a funny interface.

The root of the matter is that the revision-walking code doesn't clean
up after itself. In every case, the caller is just saving these to clean
up commit marks, isn't it?

Could we instead have an interface like:

  revs.clear_commit_marks = 1;
  prepare_revision_walk();
  ...
  finish_revision_walk();

where that final function would do any cleanup, including clearing the
commit marks. I suspect there are other small bits that get leaked
because there's not really any "destructor" for a revision walk.

It's not as flexible as this whole "make a copy of the pending tips"
thing, but it keeps all of the details abstracted away from the callers.

Alternatively:

> +`prepare_revision_walk_extended`::
> +
> + Like prepare_revision_walk(), but allows callers to take ownership
> + of the array of pending objects by passing an object_array pointer
> + as the second parameter; passing NULL clears the array.

What if we just got rid of this function and had callers do:

  object_array_copy(_pending, );
  prepare_revision_walk();
  ...
  clear_commit_marks_for_object_array(_pending);

That sidesteps all of the memory ownership issues by just creating a
copy. That's less efficient, but I'd be surprised if it matters in
practice (we tend to do one or two revisions per process, there don't
tend to be a lot of pending tips, and we're really just talking about
copying some pointers here).

-Peff


Re: [PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-17 Thread Martin Ågren
On 16 December 2017 at 13:12, René Scharfe  wrote:
> prepare_revision_walk() allows callers to take ownership of the array of
> pending objects by setting the rev_info flag "leak_pending" and copying
> the object_array "pending".  They use it to clear commit marks after
> setup is done.  This interface is brittle enough that it requires
> extensive comments.
>
> Provide an easier way by adding a function that can hand over the array
> to a caller-supplied output parameter and converting all users of the
> flag "leak_pending" to call prepare_revision_walk_extended() instead.
>
> Signed-off-by: Rene Scharfe 
> ---
>  Documentation/technical/api-revision-walking.txt |  6 ++
>  bisect.c | 17 +
>  builtin/checkout.c   |  9 +
>  bundle.c |  9 +
>  revision.c   | 10 +-
>  revision.h   | 14 ++
>  6 files changed, 24 insertions(+), 41 deletions(-)
>
> diff --git a/Documentation/technical/api-revision-walking.txt 
> b/Documentation/technical/api-revision-walking.txt
> index 55b878ade8..9dc573d2ec 100644
> --- a/Documentation/technical/api-revision-walking.txt
> +++ b/Documentation/technical/api-revision-walking.txt
> @@ -50,6 +50,12 @@ function.
> returns any error (non-zero return code) and if it does not, you can
> start using get_revision() to do the iteration.
>
> +`prepare_revision_walk_extended`::
> +
> +   Like prepare_revision_walk(), but allows callers to take ownership
> +   of the array of pending objects by passing an object_array pointer
> +   as the second parameter; passing NULL clears the array.

This might make someone wonder what the difference between passing NULL
and using `prepare_revision_walk()` is. Perhaps: "passing NULL clears
the array, just as prepare_revision_walk() would." Possibly only matters
once we gain more parameters, and maybe not even then...

The name of your new function ("..._extended") doesn't describe the
nature of the extended behavior and made me wonder if it was too
generic. But that genericness might be a good thing. When/If we need to
tweak the behavior along some other axis, we can add a third parameter
to ..._extended and pass NULL/0 as appropriate. The simple cases will
stay simple and we won't gain lots of functions with minor differences.

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index e1e157d205..1f04f5d5e5 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -796,14 +796,7 @@ static void orphaned_commit_warning(struct commit *old, 
> struct commit *new)
> add_pending_oid(, "HEAD", >object.oid, UNINTERESTING);
>
> /* Save pending objects, so they can be cleaned up later. */
> -   refs = revs.pending;
> -   revs.leak_pending = 1;
> -
> -   /*
> -* prepare_revision_walk (together with .leak_pending = 1) makes us
> -* the sole owner of the list of pending objects.
> -*/
> -   if (prepare_revision_walk())
> +   if (prepare_revision_walk_extended(, ))
> die(_("internal error in revision walk"));
> if (!(old->object.flags & UNINTERESTING))
> suggest_reattach(old, );
> diff --git a/bundle.c b/bundle.c
> index 93290962c9..6af6e38c40 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -158,14 +158,7 @@ int verify_bundle(struct bundle_header *header, int 
> verbose)
> setup_revisions(2, argv, , NULL);
>
> /* Save pending objects, so they can be cleaned up later. */
> -   refs = revs.pending;
> -   revs.leak_pending = 1;
> -
> -   /*
> -* prepare_revision_walk (together with .leak_pending = 1) makes us
> -* the sole owner of the list of pending objects.
> -*/
> -   if (prepare_revision_walk())
> +   if (prepare_revision_walk_extended(, ))
> die(_("revision walk setup failed"));
>
> i = req_nr;

This copy-paste coding that you get rid of here can be attributed to me.
I obviously like your cleaned-up version much better.

> diff --git a/revision.h b/revision.h
> index 54761200ad..5d4b475334 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -150,18 +150,6 @@ struct rev_info {
> date_mode_explicit:1,
> preserve_subject:1;
> unsigned intdisable_stdin:1;
> -   /*
> -* Set `leak_pending` to prevent `prepare_revision_walk()` from 
> clearing
> -* the array of pending objects (`pending`). It will still forget 
> about
> -* the array and its entries, so they really are leaked. This can be
> -* useful if the `struct object_array` `pending` is copied before
> -* calling `prepare_revision_walk()`. By setting `leak_pending`, you
> -* effectively claim ownership of the old array, so you 

[PATCH] revision: introduce prepare_revision_walk_extended()

2017-12-16 Thread René Scharfe
prepare_revision_walk() allows callers to take ownership of the array of
pending objects by setting the rev_info flag "leak_pending" and copying
the object_array "pending".  They use it to clear commit marks after
setup is done.  This interface is brittle enough that it requires
extensive comments.

Provide an easier way by adding a function that can hand over the array
to a caller-supplied output parameter and converting all users of the
flag "leak_pending" to call prepare_revision_walk_extended() instead.

Signed-off-by: Rene Scharfe 
---
 Documentation/technical/api-revision-walking.txt |  6 ++
 bisect.c | 17 +
 builtin/checkout.c   |  9 +
 bundle.c |  9 +
 revision.c   | 10 +-
 revision.h   | 14 ++
 6 files changed, 24 insertions(+), 41 deletions(-)

diff --git a/Documentation/technical/api-revision-walking.txt 
b/Documentation/technical/api-revision-walking.txt
index 55b878ade8..9dc573d2ec 100644
--- a/Documentation/technical/api-revision-walking.txt
+++ b/Documentation/technical/api-revision-walking.txt
@@ -50,6 +50,12 @@ function.
returns any error (non-zero return code) and if it does not, you can
start using get_revision() to do the iteration.
 
+`prepare_revision_walk_extended`::
+
+   Like prepare_revision_walk(), but allows callers to take ownership
+   of the array of pending objects by passing an object_array pointer
+   as the second parameter; passing NULL clears the array.
+
 `get_revision`::
 
Takes a pointer to a `rev_info` structure and iterates over it,
diff --git a/bisect.c b/bisect.c
index 0fca17c02b..a2af405d28 100644
--- a/bisect.c
+++ b/bisect.c
@@ -641,9 +641,10 @@ static void bisect_rev_setup(struct rev_info *revs, const 
char *prefix,
/* XXX leak rev_argv, as "revs" may still be pointing to it */
 }
 
-static void bisect_common(struct rev_info *revs)
+static void bisect_common(struct rev_info *revs,
+ struct object_array *old_pending_ptr)
 {
-   if (prepare_revision_walk(revs))
+   if (prepare_revision_walk_extended(revs, old_pending_ptr))
die("revision walk setup failed");
if (revs->tree_objects)
mark_edges_uninteresting(revs, NULL);
@@ -825,15 +826,7 @@ static int check_ancestors(const char *prefix)
bisect_rev_setup(, prefix, "^%s", "%s", 0);
 
/* Save pending objects, so they can be cleaned up later. */
-   pending_copy = revs.pending;
-   revs.leak_pending = 1;
-
-   /*
-* bisect_common calls prepare_revision_walk right away, which
-* (together with .leak_pending = 1) makes us the sole owner of
-* the list of pending objects.
-*/
-   bisect_common();
+   bisect_common(, _copy);
res = (revs.commits != NULL);
 
/* Clean up objects used, as they will be reused. */
@@ -964,7 +957,7 @@ int bisect_next_all(const char *prefix, int no_checkout)
bisect_rev_setup(, prefix, "%s", "^%s", 1);
revs.limited = 1;
 
-   bisect_common();
+   bisect_common(, NULL);
 
find_bisection(, , , !!skipped_revs.nr);
revs.commits = managed_skipped(revs.commits, );
diff --git a/builtin/checkout.c b/builtin/checkout.c
index e1e157d205..1f04f5d5e5 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -796,14 +796,7 @@ static void orphaned_commit_warning(struct commit *old, 
struct commit *new)
add_pending_oid(, "HEAD", >object.oid, UNINTERESTING);
 
/* Save pending objects, so they can be cleaned up later. */
-   refs = revs.pending;
-   revs.leak_pending = 1;
-
-   /*
-* prepare_revision_walk (together with .leak_pending = 1) makes us
-* the sole owner of the list of pending objects.
-*/
-   if (prepare_revision_walk())
+   if (prepare_revision_walk_extended(, ))
die(_("internal error in revision walk"));
if (!(old->object.flags & UNINTERESTING))
suggest_reattach(old, );
diff --git a/bundle.c b/bundle.c
index 93290962c9..6af6e38c40 100644
--- a/bundle.c
+++ b/bundle.c
@@ -158,14 +158,7 @@ int verify_bundle(struct bundle_header *header, int 
verbose)
setup_revisions(2, argv, , NULL);
 
/* Save pending objects, so they can be cleaned up later. */
-   refs = revs.pending;
-   revs.leak_pending = 1;
-
-   /*
-* prepare_revision_walk (together with .leak_pending = 1) makes us
-* the sole owner of the list of pending objects.
-*/
-   if (prepare_revision_walk())
+   if (prepare_revision_walk_extended(, ))
die(_("revision walk setup failed"));
 
i = req_nr;
diff --git a/revision.c b/revision.c
index f6a3da5cd9..3a231451a4 100644
---