Re: Incidence of criss-cross merges

Julian Foad Fri, 13 Jul 2012 10:11:42 -0700

Stefan Fuhrmann wrote:
> On Thu, Jul 12, 2012 at 3:43 PM, Julian Foad wrote:
>> Stefan Fuhrmann wrote:
>>> From a process perspective, it seems much more prudent to
>>> do "qualified" merges like "merge from /trunk up to the last
>>> fully tested nightly build revision" and "merge from branch up
>>> to the point that I think is safe".
>> 
>> Yes, that makes sense.
[...]
>>  A = /trunk
>>  Tested versions   *                   *
>>  A o---o---o---o---o---o---o---o---o---o---o---o---o
>>     \               \____     /
>>      \               ____\ __/
>>       \             /     \
>>  B     o---o---o---o---o---o---o---o---o---o---o
>>  Safe versions     *                   *
>>                    1       2   3
>>
>>  Sequence of events:
>>  1. Nightly testing.
>>  2. Catch up branch from latest stable trunk.
>>  3. Reintegrate branch to trunk, from latest tested version of branch.
>>
>> That sort of scenario?
> 
> Exactly. The thing is that at this point the graph does
> not even depend on whether the changes to a given
> node actually produce a conflict. If the history processing
> code had any problems with this kind of merge graph,
> that problem would surface more often than actual conflicts.


Let's first clarify for other readers that we're not concerned with how the two 
criss-crossing merges themselves work: each of those is a straightforward merge 
and the fact that they cross over is not relevant to either of them.  We might 
want to build some sort of system for detecting this criss-crossing as it 
happens, with the aim of alerting the user or preventing the second merge from 
being committed, but that is not what we are talking about here.  Rather, we 
assume those two criss-crossing merges have already happened, and we are 
talking now about how a third merge (afterwards) will behave.

Currently, the "symmetric" merge code treats a criss-cross merge by choosing 
the later of the two possible common ancestors.  In fact, this is just a 
fall-out from choosing the youngest common ancestor: it doesn't currently fetch 
enough information to notice that the latest merges in each direction cross 
over.

The result of choosing one of the common ancestors will be a sub-optimal 
merge.  Typically, all needed changes are merged, but some unneeded changes are 
merged too (changes that had already been made in the target branch, either 
originally or through an earlier merge).  Therefore the merge will, in general, 
have "spurious" conflicts due to making the same change again.  In simple cases 
the duplicate changes may be auto-resolved: in particular, Subversion's 
internal text merge auto-resolves duplicate hunks of text added to a file, and 
Subversion auto-resolves duplicate added properties, but that is only going to 
happen when such additions were clean additions.

There may be other ways in which the result is sub-optimal, such as failing to 
report a conflict and just choosing one of the alternatives, where the user 
really should have been given a conflict to resolve.  Note also that 
"sub-optimal" doesn't necessarily mean "just a bit worse that ideal", it can 
mean "unusably bad" in some cases.

It so happens that in trying to make automated test scenarios for all possible 
merges [1], I am running in to the issue of criss-cross merges.  In preparation 
for attempting a merge from A to B after a scenario such as this one:

      A1      A2
    ---o---x---o------  mergeinfo = B:1
   /      /      \
   \     /        \
    ---o-------o---x--  mergeinfo = A:1-2
      B1      B2

my latest attempt to fake the scenario does this:

      A1      A2  AF    where AF makes changes B1
    ---o-------o---o--           and sets mergeinfo = B:1
   /
   \
    ---o-------o---o--  where BF makes change A1,A2
      B1      B2  BF             and sets mergeinfo = A:1-2

but I have just realized (after seeing unexpected conflicts in the following 
merge) that that actually represents a criss-cross merge:

      A1      A2
    ---o-------o---x--  mergeinfo = B:1-2
   /      ______\ _/
   \     /       \
    ---o-------o---x--  mergeinfo = A:1
      B1      B2

The end points (the tips of the two branches) are in the same state as what I 
was trying to achieve (the first diagram), but the difference is that the state 
labaled as 'A2' does not have change 'B1' in it in this second case.  The merge 
code will use that state as the 'base' of the three-way merge, and thus the 
result it produces will be different from the result produced by a merge after 
the scenario set up in the first diagram.  The state of A2 is the only relevant 
difference between the non-criss-cross (first diagram) and criss-cross (third 
diagram) scenarios; the mergeinfo on A2 will also differ and the states and 
mergeinfo of some other nodes may also differ, but those other differences are 
not noticed by the current algorithms ("symmetric" and 1.7 non-reintegrate, 
that is; 1.7 reintegrate would look at different nodes).


As far as I have been able to discover, there is no generally agreed solution 
to the problem.  Most of the references I can find are, like [3], a commentary 
on one or more of the attempts that have been made to find a better solution.  
I found one summary [2] of how different merge algorithms attempt to handle the 
criss-cross scenario.  It is a bit too light on detail, but it does mention 
some interesting solutions and their limitations.

At this point, I think the issue boils down to:

  * Yes, the criss-cross situation can be produced in plausible real life usage.

  * The current handling is sub-optimal, though not worse than we've had before.

  * We might want to look at more sophisticated ways to handle it.

  * We might want to look at ways to help users avoid the situation.


References: 

[1] See the thread  "Subtree mergeinfo -- what I learnt..." on this list.

[2] "CrissCrossMerge" on revctrl.org:
<http://revctrl.org/CrissCrossMerge?action=recall&rev=34>
(Sadly, the revctrl.org Wiki is full of spam; you have to look at old versions 
of the pages to find the relevant content.)

[3] "Does Criss-Cross Merge Matter?" on vestasys.org:
<http://wiki.vestasys.org/MergingFuture/Food4Thought/TrickyCases#head-5ef046f558b5a8924506ec82d315e2ca7b9782a8>

- Julian
--
Certified & Supported Apache Subversion Downloads: 
http://www.wandisco.com/subversion/download

Re: Incidence of criss-cross merges

Reply via email to