[Subversion Wiki] Update of "MergeTrackingIdeas" by JulianFoad

Apache Wiki Thu, 06 Oct 2011 09:55:47 -0700

The "MergeTrackingIdeas" page has been changed by JulianFoad:
http://wiki.apache.org/subversion/MergeTrackingIdeas

Comment:
(Manually) import doc from Google docs

New page:
''Design notes, plans and ideas by JulianFoad''

----
= Merge Tracking Ideas =
Ways in which we could improve on the 1.6 mergeinfo and merge tracking scheme.  
Experimental thoughts, not fully thought out.

== Logical Change Tracking ==
This idea came out of a [[http://colabti.org/irclogger/irclogger_log/svn-dev?date=2011-09-06|discussion on IRC]].

The idea is to define a more powerful form of merge tracking, as an upgrade
from the current (1.6) merge tracking.  The additional power is in tracking
“logical changes” as they are merged from one branch to another, and on to
further branches, in arbitrary ways, including back-and-forth and circular
patterns, while still being able to say what changes we “need” to merge from
branch B to branch C.

This is, I think, going some way towards what people sometimes call 
''changeset-based merging''.

Advantages:

 * Support  more flexible branching topologies.  It doesn’t matter whether 
changes  have already been merged to the target branch directly from this 
source  branch or via any other chain of branches.
 * Enable the “reintegrate” purpose to be served by the same automatic merge 
algorithm as is used for catch-up.

The emphasis is on being able to ''detect'' and describe what logical changes
are needed, without necessarily being able to perform the merge automatically
in all cases.  The current state of thinking is that Subversion would be able
to perform the merge when a candidate revision on the source branch is a merge
of logical changes of which either ''all'' or ''none'' are needed on the
target.  If the candidate revision is a merge and only ''some'' of the logical
changes in it are needed, then Subversion would, we suppose, stop and print a
helpful description of the situation.  Even that level of capability is a big
improvement on 1.6, and helpful diagnostics for unhandled situations are
entirely the sort of assistance a user should be entitled to expect from a VCS.

=== Principles ===
 * We track ''logical changes''.
 * User-facing goals are defined in terms of merging the right set of ''logical changes''.
 * Each commit is either a ''logical change'' or a ''merge'' of ''logical 
changes''.

We introduce the concept of a ''logical change'' as the fundamental unit of 
change that is tracked.  A ''logical change''  starts life as a committed 
change that is not part of a merge.  When  that tree-content change is merged 
to another branch (adapted if  necessary to accommodate any physical and/or 
semantic differences  between the branches), the resulting commit is ''not'' a 
new ''logical change'' but rather is a ''merge''.  A ''merge'' is defined as a 
committed change that includes a mergeinfo change that brings in one or more 
''logical changes''.  A ''logical change''  has a unique identifier (let’s say 
the branch and revision in which it  was originally committed) and is always 
identified by that same  identifier, no matter what branches it has been merged 
through or  whether it has been merged together with other ''logical changes'' 
in a single ''merge''.

We must be able to identify the ''logical changes'' in the system.  To identify
the ''logical changes'' in existing 1.6 mergeinfo, we will classify each commit
as a ''logical change'' if it is a change without any mergeinfo change, or else
as a ''merge'' if it includes a mergeinfo change.  If it’s a ''merge'', then it
brings in some pre-existing logical changes and/or merges.  By scanning
recursively into mergeinfo history, we can identify all the original logical
changes brought in by a merge.
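
As a minimal sketch of that recursive scan (illustrative only, not Subversion code): the `Commit` record, its `merged` field and the `logical_changes` function are hypothetical names, and a change is identified by a `(branch, revision)` pair as suggested above.  The sketch assumes that, for each commit, we already know which source revisions its mergeinfo change newly recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical record of one commit, identified by (branch, rev)."""
    branch: str
    rev: int
    # (branch, rev) pairs newly recorded in mergeinfo by this commit;
    # empty when the commit made no mergeinfo change.
    merged: frozenset = frozenset()


def logical_changes(commit_id, history):
    """Return ids of the original logical changes a commit is, or brings in.

    `history` maps (branch, rev) -> Commit.
    """
    found, seen, pending = set(), set(), [commit_id]
    while pending:
        cid = pending.pop()
        if cid in seen:
            continue
        seen.add(cid)
        commit = history.get(cid)
        if commit is None:
            continue  # recorded in mergeinfo, but nothing was committed there
        if commit.merged:
            pending.extend(commit.merged)  # a merge: look through it
        else:
            found.add(cid)  # no mergeinfo change: an original logical change
    return found


history = {
    ("A", 10): Commit("A", 10),
    ("B", 12): Commit("B", 12),
    ("B", 13): Commit("B", 13, frozenset({("A", 10)})),
    ("C", 20): Commit("C", 20, frozenset({("B", 12), ("B", 13)})),
}
print(sorted(logical_changes(("C", 20), history)))  # prints [('A', 10), ('B', 12)]
```

Here C:20 is a merge of B:12 (an original change) and B:13 (itself a merge of A:10), so the scan reports A:10 and B:12 as the logical changes it brings in.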

The user-facing goals of merging are defined in terms of getting the right set 
of ''logical changes'' onto the target branch.  This is in contrast to the 1.6 
scheme which is defined in terms of getting the complete set of ''commits on 
the source branch'' onto the target branch.  The difference is that we will 
select a commit in the source branch only if it is a ''logical change'' or if 
it is a ''merge'' that brings in ''logical changes'' that we don’t have; and 
not if it is a ''merge'' that brings in ''logical changes'' that we do already 
have.

Each ''candidate revision'' to merge from the source branch is merged iff it 
is, or is a merge that brings in, ''logical changes'' that we don’t yet have on 
the target.

 * If the candidate revision is a ''logical change'', then we merge it iff we don’t have that ''logical change'' on the target branch (as determined by the target branch’s mergeinfo).
 * If the candidate revision is a ''merge'', then we merge it if all the 
''logical changes'' it brings in are ones we don’t already have, or we skip it 
if all the ''logical changes'' it brings in are ones we do already have.
 * If the candidate revision is a ''merge'' that brings in some new ''logical changes'' that we don’t have and some that we do have, then Subversion can at least bail out, telling the user which changes are present and which are to be merged.  At the moment we don’t anticipate being able to untangle the relevant parts of the physical edit from the source branch, nor to fetch the required ''logical changes'' from their origin or from some other branch.
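
The three cases above can be sketched as a single decision function.  This is only an illustration under stated assumptions: `decide` and its parameters are hypothetical names, each logical change is identified by a `(branch, revision)` pair, and the set of logical changes that a candidate brings in is assumed to have been worked out already.

```python
def decide(brought_in, on_target):
    """Classify one candidate revision from the source branch.

    brought_in: ids of the logical changes the candidate is or brings in
                (a single id if the candidate is itself a logical change).
    on_target:  ids of the logical changes already on the target branch.
    Returns (action, needed, present).
    """
    needed = brought_in - on_target
    present = brought_in & on_target
    if not needed:
        action = "skip"      # everything it brings in is already on the target
    elif not present:
        action = "merge"     # everything it brings in is new to the target
    else:
        action = "bail out"  # mixed: report both sets to the user
    return action, needed, present


on_target = {("A", 10)}
print(decide({("B", 12)}, on_target)[0])             # prints merge
print(decide({("A", 10)}, on_target)[0])             # prints skip
print(decide({("A", 10), ("A", 11)}, on_target)[0])  # prints bail out
```

In the mixed case the returned `needed` and `present` sets are what a diagnostic message could list for the user.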

The purpose of a ''reintegrate'' merge is to bring across all ''logical
changes'' from the source branch that are not already in the target branch.
That is exactly the same as for a ''catch-up'' merge, so the same algorithm can
be used.  Users might still want to specify the “--reintegrate” option because
of the additional checks that it performs before merging, but that would be
optional and for the user’s benefit, not for the system’s benefit.  A plain
automatic merge would still work in that direction even if the old
reintegration constraints are not met.

=== Migration from 1.6 ===
See above about recursive scanning of 1.6 mergeinfo.

Retro-fitting the principle of ''logical changes'' onto an existing 1.6 merge
history would seem to be a good fit, as it is already a common mantra and
practice to separate new logical changes from merges.  The consequences if
this principle has occasionally not been followed in the past would seem to be
predictable and relatively straightforward to recover from.  Where the history
has been altered by record-only merges or by direct editing or removal of
mergeinfo, however, this method of classifying old commits may be untenable or
may need augmenting with user input.  (Investigate?)

=== Other issues to explore / define ===
Reverse merges: we first need to define the basic semantics before we can
contemplate supporting them.

Subtree merges: semantics?

Merging into a mixed-rev WC: what special considerations apply? Does it help
to remember that a WC being merged into is commonly acting as a proto-revision?

Mergeinfo (MI) storage: semantics and format. Per branch? Whole history in one
place?

=== Rules (differences from 1.6) ===
 1. '''###? '''The ''Feature Branch'' relationship can be applied with cycles 
in the graph of relationships.
  * A ⇒ B, B ⇒ A
  * A ⇒ B, B ⇒ C, C ⇒ A

 1. '''###? '''The ''Feature Branch'' relationship can be applied with multiple 
paths in the graph of relationships.
  * A ⇒ B, B ⇒ C, A ⇒ C
  * A ⇒ B, A ⇒ C, B ⇒ D, C ⇒ D

=== Requirements ===
 * Editable merge history.  (Because this scheme increases reliance on the correctness of mergeinfo, and especially of mergeinfo changes, which are fragile in the current 1.6 scheme.)
 * Quick(ish)  traversal of mergeinfo history.  This suggests a new storage 
model in  which all the historic mergeinfo (of a given branch?) is in one place.

=== A Worked Example ===
== Multiple Commits in One Logical Change ==
We might want to allow multiple revisions to be recorded as being components of 
the same ''merge''.   This would increase the power of merge tracking in a 
functional sense,  but it is “advanced” functionality and would require user 
awareness and  tool support to make use of it.

Merge  A:10 to B, committing the result initially as B:13, then doing some  
more conflict resolution in B:14 and B:16.  Arrange somehow (by user  input, 
for example) for B:14 and B:16 also to be recorded as part of the  “same” 
merge: branch B revs 13, 14, 16 jointly comprise the merge of  A:10.  In a 
subsequent merge from B to C, assuming A:10 is already on C,  that would 
prevent B:13, B:14 and B:16 from being merged to C.

It may be worth designing in this ability, as it (at first sight) sounds like
something that could be unused and unimplemented at first and then implemented
later.  Until the merge algorithm pays attention to it ''and'' somebody
populates it, those follow-ups B:14 and B:16 will simply be merged to C and
will conflict (physically and/or semantically) just as happens today.
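
A rough sketch of how such grouping might be consulted, given that the exact semantics are still open.  The `revisions_to_skip` function and the shape of `merge_groups` are hypothetical, chosen only to illustrate the B:13, B:14, B:16 example.

```python
def revisions_to_skip(merge_groups, on_target):
    """Return source revisions that need not be merged to the target.

    merge_groups: maps a frozenset of source-branch revisions that jointly
                  make up one merge to the logical change ids it brought in.
    on_target:    ids of the logical changes already on the target branch.
    """
    skip = set()
    for revs, changes in merge_groups.items():
        if changes <= on_target:  # every change in this merge is already there
            skip |= revs
    return skip


# B:13, B:14 and B:16 jointly comprise the merge of A:10.
grouped = {frozenset({13, 14, 16}): {("A", 10)}}
# Without grouping, only B:13 is known to be a merge of A:10.
ungrouped = {frozenset({13}): {("A", 10)}}

on_c = {("A", 10)}  # A:10 is already on branch C
print(sorted(revisions_to_skip(grouped, on_c)))    # prints [13, 14, 16]
print(sorted(revisions_to_skip(ungrouped, on_c)))  # prints [13]
```

With the grouping unpopulated, B:14 and B:16 are not skipped and would be merged to C as ordinary changes.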

 . ###  What are the semantics exactly?  Does it matter whether A:10 is an  
original change or a merge?  What gets complex when the merge has  multiple 
source changes?

== Distinguish Operative and No-op Source Revs ==
At present the revs we record are ones that have been “considered” from the
source branch, regardless of whether they contained an original change, a
merge, or nothing at all.

 . ### My overall impression is this is not a useful avenue, but here’s the 
thought anyway.

The aim of a ''catch-up''  merge is to reach a state in which a single 
continuous revision range  (including all operative and no-op revs) is recorded 
as having been  merged from the immediate ''source branch''.   If there are 
gaps but all the gaps are no-op, the merge algorithm  searches those gaps and 
finds that there is nothing to do, and then  (potentially, and in practice 
actually) fills in those gaps in the  recorded mergeinfo.

If  we were to distinguish between operative and no-op revs, that would  help 
in displaying mergeinfo in a more user-friendly way.  ###  Specifics?

This info is already discoverable; it’s just not fast to obtain.

This distinction would come “for free” if we start recording ''logical 
changes'' rather than physical changes.  But then the question, “Are there any 
eligible changes to merge?” might be harder to answer.
