On 4/10/2018 11:02 PM, Junio C Hamano wrote:
Derrick Stolee <dsto...@microsoft.com> writes:

@@ -800,17 +810,26 @@ static struct commit_list *paint_down_to_common(struct 
commit *one, int n, struc
                return result;
        }
        prio_queue_put(&queue, one);
+       if (one->generation < min_nonstale_gen)
+               min_nonstale_gen = one->generation;
for (i = 0; i < n; i++) {
                twos[i]->object.flags |= PARENT2;
                prio_queue_put(&queue, twos[i]);
+               if (twos[i]->generation < min_nonstale_gen)
+                       min_nonstale_gen = twos[i]->generation;
        }
- while (queue_has_nonstale(&queue)) {
+       while (queue_has_nonstale(&queue, min_nonstale_gen)) {
                struct commit *commit = prio_queue_get(&queue);
                struct commit_list *parents;
                int flags;
+ if (commit->generation > last_gen)
+                       BUG("bad generation skip");
+
+               last_gen = commit->generation;
+
                flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
                if (flags == (PARENT1 | PARENT2)) {
                        if (!(commit->object.flags & RESULT)) {
@@ -830,6 +849,10 @@ static struct commit_list *paint_down_to_common(struct 
commit *one, int n, struc
                                return NULL;
                        p->object.flags |= flags;
Hmph.  Can a commit that used to be not stale (and contributed to
the current value of min_nonstale_gen) become stale here by getting
visited twice, invalidating the value in min_nonstale_gen?

min_nonstale_gen can be "wrong" in the way you say, but fits the definition from the commit message:

"To properly take advantage of this condition, track the minimum generation number of a commit that **enters the queue** with nonstale status." (Emphasis added)

You make an excellent point about how this can be problematic. I was confused by the lack of clear performance benefits here, but I think that whatever benefits making queue_has_nonstale() be O(1) were removed by walking more commits than necessary.

Consider the following commit graph, where M is a parent of both A and B, S is a parent of M and B, and there is a large set of commits reachable from M with generation number larger than gen(S).

A    B
| __/|
|/   |
M    |
|\   |
. |  |
. |  |
. |_/
|/
S

Between A and B, the true merge base is M. Anything reachable from M is marked as stale. When S is added to the queue, it is only reachable from B, so it is non-stale. However, it is marked stale after M is walked. The old code would detect this as a termination condition, but the new code would not.

I think this data shape is actually common (not exactly, as it may be that some ancestor of M provides a second path to S) especially in the world of pull requests and users merging master into their topic branches.

I'll remove this commit in the next version, but use the new prototype for queue_has_nonstale() in "commit: add short-circuit to paint_down_to_common()" using the given 'min_generation' instead of 'min_nonstale_gen'.

Thanks,
-Stolee

Reply via email to