Hi, all.

I want to update you all on the "symmetric merge" [1] status and my plans, and 
invite your thoughts and any assistance you can give.

I'll be presenting this subject at Elego's SvnDay [2] and at WANdisco events in 
October, but the presentation will be aimed at users and so will concentrate on 
how the end result is better for the user and won't say much about the details 
I'm talking about below.


GRAND PLAN


Two main phases.


  Phase 1 (now):
    Implement in terms of "sync" and "reintegrate".  Accept their limitations; 
that is:
    - Any "simple" [3] merge will work fine (in either direction).
    - A non-"simple" merge can be performed only in the same direction as the 
previous merge.
    - Make Subversion use "symmetric merge" automatically for any merge request 
that we currently handle as a "sync" -- that is when:
      - it's a forward merge
      - no revision range is given (at least, no starting revision; an ending 
revision is acceptable)
      - "--reintegrate" is not specified
    - For testing purposes, we also make Subversion use the "symmetric merge" 
whenever the test suite requests a "reintegrate".  I don't see any reason to 
make it do that for users; indeed I think it would be bad to make this special 
option start doing things it didn't do before.
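
To make the conditions above concrete, here is a rough sketch of the decision as a predicate.  It is only an illustration: "MergeRequest", its fields and "uses_symmetric_merge" are names I made up for this example, not anything in the Subversion code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MergeRequest:
    """Hypothetical summary of a merge request (not a real Subversion type)."""
    forward: bool                    # True for a forward merge
    start_rev: Optional[int] = None  # starting revision, if one was given
    end_rev: Optional[int] = None    # ending revision, if one was given
    reintegrate: bool = False        # True if "--reintegrate" was specified


def uses_symmetric_merge(req: MergeRequest, for_test_suite: bool = False) -> bool:
    """Return True if the request would be routed to "symmetric merge"."""
    if req.reintegrate:
        # Only the test suite gets symmetric merge for a "reintegrate";
        # for users, "--reintegrate" keeps doing what it did before.
        return for_test_suite
    # A "sync": a forward merge with no starting revision.
    # An ending revision alone is acceptable.
    return req.forward and req.start_rev is None
```

For example, a forward merge with only an ending revision qualifies, while one with an explicit starting revision does not.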


  Phase 2 ("later"):
    Rewrite more of the merge code to alleviate limitations -- to be able to 
skip cherry-picks, support mixed-rev etc. when merging in either direction.
    - Make the implementation more symmetric.  This involves pretty deep 
changes in the merge code, so much so that I think this task would best be 
combined with a significant revision of the internal merge data structures 
(svn_mergeinfo_t and so on).  Maybe even combined with a revision of the way 
mergeinfo is stored.

Concentrate on getting phase 1 complete and releasable.  I *think* it is nearly 
done.  (See TESTING, below.)  The implementation mimics "sync" or "reintegrate" 
depending on where it finds the most recent base, according to the rule that 
"sync" should be used when merging again in the same direction as last time, 
and "reintegrate" when merging in the opposite direction.  It doesn't matter 
that this implementation has all the limitations of the current "reintegrate" 
merge when changing direction, because that's already as good as 1.7 for all 
1.7-supported cases AFAIK.  The benefit of it just Doing The Right Thing for 
simple merges, enabling repeated to-and-fro merging, seems huge.
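
The direction rule can be sketched in a few lines.  This is a simplified illustration only: the real code decides by locating the most recent base, whereas this sketch assumes the direction of the previous merge is already known.  "Direction" and "choose_mode" are hypothetical names.

```python
from typing import Tuple

# A merge direction as (source branch, target branch).  Hypothetical type.
Direction = Tuple[str, str]


def choose_mode(previous: Direction, current: Direction) -> str:
    """Pick which existing merge kind the symmetric merge should mimic."""
    if current == previous:
        # Merging again in the same direction as last time.
        return "sync"
    if current == (previous[1], previous[0]):
        # Merging in the opposite direction to last time.
        return "reintegrate"
    raise ValueError("previous and current merges involve different branches")
```

So after a trunk-to-branch merge, another trunk-to-branch merge would behave like a "sync", and a branch-to-trunk merge would behave like a "reintegrate".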

Phase 2 is much lower priority and much more a SMOP, with less impact on users 
(documentation etc.).  Phase 2 will bring flexibility that isn't of great 
importance to users AFAIK, since cherry-picking and subtree merging are most 
often used alone -- on a divergent branch (that's not going to be reintegrated) 
-- and rarely on a convergent branch (which is going to be reintegrated, so 
to-and-fro merging is likely).  And phase 1 already enables cherry-picks etc. 
to be accommodated to some extent.


CONCERNS

The main concerns a couple of months ago were that it wasn't handling subtrees 
and mixed-rev WCs and so on.  I believe now that it does (in the "sync" 
direction -- that is, whenever merging in the same direction as the previous 
merge).  There are a few tests failing (see below) but from a design and 
implementation point of view I am confident that it should support these cases 
and that these failures must be due to relatively minor issues.


TESTING

Current test suite:

The following tests fail when merge-cmd.c is patched to call "symmetric merge" 
for sync and reintegrate merges [4]:
FAIL:  merge_reintegrate_tests.py 10: merge --reintegrate with subtree mergeinfo
FAIL:  merge_tests.py 78: dont merge revs into a subtree that predate it
FAIL:  merge_tests.py 88: subtree merges dont cause spurious conflicts
FAIL:  merge_tests.py 89: target and subtrees need nonintersecting revs

Clearly there's something up with subtree merges, but, as I said above, I have 
reason to believe that it's not fundamentally broken or unsupported.

New tests:

I've started "merge_symmetric_tests.py".

Are any new tests required to ensure that existing scenarios, which may not be 
tested yet, aren't broken?

  - The "keep-alive dance".  For completeness, we should check how that will 
behave, as we can assume some people will have adopted practices that 
incorporate it.  I do NOT think we should continue to support that work-around: 
if it continues to behave as now, that's fine, and if its behaviour changes, I 
expect to be able to claim that as an intentional behaviour change for the 
better.  That is, to be plain, if anyone's relying on that, they may need to 
change their practice.

  - The tests I have at the moment are pretty small cases.  It would be good 
to create a test that exercises a series of much bigger merges.

  - Any other scenario?


PERFORMANCE

  - Performance (in terms of network traffic, in particular).  After a series 
of (same direction or to-and-fro) merges, is the cost of the base-finding 
algorithm proportional to time since the YCA (youngest common ancestor) of the 
branches (i.e., ever increasing), or only proportional to time since the last 
merge?

  - Any other performance concern?


Please let me know any thoughts you have.  And if you might be able to take on 
the investigation of one of the test failures, or writing a new test (whether 
pseudo-code or actual code), or checking the performance, that would be awesome.

- Julian


[1] <http://wiki.apache.org/subversion/SymmetricMerge>
[2] <http://www.elego.de/svnday2012>
[3] Define a "simple" merge as one that does not find any subtree merges, 
cherry picks, or a mixed-rev/switched/sparse WC.

[4] Attached patch, "use-symmetric-merge-1.patch", makes all (?) sync and reint 
merge requests use the 'symmetric' code.

Attachment: use-symmetric-merge-1.patch