Re: Bug when git rev-list options "--first-parent" and "--ancestry-path" are used together?

Michael Haggerty Sat, 25 May 2013 11:41:03 -0700

[Junio, sorry for the dup; somehow I failed to CC the first version to
the mailing list.]

On 05/23/2013 07:20 PM, Junio C Hamano wrote:
> Michael Haggerty <mhag...@alum.mit.edu> writes:
> 
>> It seems to me that
>>
>>      git rev-list --first-parent --ancestry-path A..B
>>
>> is well-defined and should list the commits in the intersection between
>>
>>      git rev-list --first-parent                 A..B
>>
>> and
>>
>>      git rev-list                --ancestry-path A..B
>>
>> But in many cases the first command doesn't provide any output even
>> though there are commits common to the output of the last two commands.
>>
>> For example, take as an example the DAG from test t6019:
>>
>> #          D---E-------F
>> #         /     \       \
>> #    B---C---G---H---I---J
>> #   /                     \
>> #  A-------K---------------L--M
>>
>> (The merges are always downwards; e.g., the first parent of commit L is
>> K.)  The command
>>
>>     git rev-list --first-parent --ancestry-path D..J
>>
>> doesn't generate any output, whereas I would expect it to output "H I
>> J".
> 
> As I do not see how "only show first-parent chains from near the tip
> but stop immediately when the chain deviates from the ancestry path"
> could be a sensible operation (in other words, I do not offhand
> think of examples of what useful things you can do with that
> information), I actually expect that "-f-p -a-p D..J" should error
> out, instead of giving no output.
> 
> You are correct to point out that sometimes -f-p and -a-p _could_ be
> compatible, e.g. "-f-p -a-p A..M", or "-f-p -a-p B..M".  But I think
> the only case that they are compatible is when "-f-p" output is a
> strict subset of what "-a-p" without "-f-p" would give.

I guess I should tell you the application that motivated the use of
these options together.  Maybe you can suggest a better approach.

I'm trying to find the most general conditions where an incremental
merge [1,2] makes sense.  The simple case is merging two linear
branches, call them "branch" and "master":

o - 0 - 1 - 2 - 3 - 4    ← master
     \
      A - B - C - D      ← branch

My tool git-imerge does this (conceptually) by constructing the pairwise
merges between each commit on master and each commit on branch:

o - 0 - 1  - 2  - 3  - 4   ← master
    |   |    |    |    |
    A - A1 - A2 - A3 - A4
    |   |    |    |    |
    B - B1 - B2 - B3 - B4
    |   |    |    |    |
    C - C1 - C2 - C3 - C4
    |   |    |    |    |
    D - D1 - D2 - D3 - D4  ← final merge
    ↑
  branch

Each of the new items [ABCD][1234] is a merge commit between its
neighbor above and its neighbor to the left.  The idea is that the
pairwise merges are less likely to conflict, and if they conflict are
likely to be easy to reconcile.

I am trying to generalize this approach to a more complicated situation
where the two branches are not linear.

The current incremental merge algorithm requires that each commit
0,1,2,3,4 and 0,A,B,C be a direct descendant of its predecessors.
That's trivial for linear branches.  But if the situation looks like this:

o - 0 - 1 - 2 - 3 - 4    ← master
     \
      A - B - C - D      ← branch
           \     /
            X - Y

then there is no correct ordering of 0,A,B,C,D,X,Y that has the
descendancy property.  So currently I always take use the first-parent
list of commits on each branch (i.e., omitting X and Y), which does have
that property.

Now assume a slightly more complicated situation, in which master has
been merged to feature branch at some point:

o - 0 - 1 - 2 - 3 - 4    ← master
     \       \
      A - B - C - D      ← branch
           \     /
            X - Y

Now when we do an incremental merge branch into master, how do we
generate the list of commits to put one each axis of the table?  The
merge base is "2", so I think the best answer is

1- 2 - 3  - 4   ← master
   |   |    |
   C - C3 - C4
   |   |    |
   D - D3 - D4  ← final merge
   ↑
 branch

The simplest way I can think of to generate the list C,D is

    git rev-list --first-parent --ancestry-path 2..D

We need --ancestry-path to avoid getting commits A and B.  It's still
not clear that this is always the best approach but at least it seems
safe.  For example, suppose that the first parent of D is Y rather than C:

o - 0 - 1 - 2 - 3 - 4    ← master
     \       \
      A - B - C --.
           \       \
            X - Y - D    ← branch

In this case, the above command (it it worked correctly) would just
output "D".  This is correct and could be used for a (degenerate, in
this case) incremental merge [3].

This is my reason for wanting --first-parent and --ancestry-path to work
together.

For now I'm just running rev-list twice and computing the intersection
by hand, but it was surprising that git could not do this for me.

Michael

[1] https://github.com/mhagger/git-imerge
[2]
http://softwareswirl.blogspot.com/2013/05/git-imerge-practical-introduction.html
(and references therein)
[3] It might be more helpful to use C,D in this case rather than just D,
but I haven't yet figured out whether this is just a special case or
whether it hints at an algorithm that is more clever than
"--first-parent --ancestry-path".

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bug when git rev-list options "--first-parent" and "--ancestry-path" are used together?

Reply via email to