On 09/12/2016 12:13 PM, Gábor STEFANIK wrote:
-----Original Message-----
From: Yuya Nishihara [mailto:you...@gmail.com] On Behalf Of Yuya
Nishihara
Sent: Sunday, September 11, 2016 6:29 PM
To: Gábor STEFANIK <gabor.stefa...@nng.com>
Cc: mercurial-devel@mercurial-scm.org
Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy
(issue4028)

On Fri, 26 Aug 2016 11:16:31 -0500, Gábor Stefanik wrote:
# HG changeset patch
# User Gábor Stefanik <gabor.stefa...@nng.com> # Date 1472225958 -7200
#      Fri Aug 26 17:39:18 2016 +0200
# Node ID f32aa28298164aa38830e83369c57c9553c6ff08
# Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b
graft: support grafting across move/copy (issue4028)

Sorry for late reply, and I don't fully understand how this patch handles
reverse copy tracking correctly yet. But weekend is over, so I just wrote
random comments.

+    # In certain scenarios (e.g. graft, update or rebase), ca can be
overridden
+    # We still need to know a real common ancestor in this case
+    # We can't just compute _c1.ancestor(_c2) and compare it to ca,
because
+    # there can be multiple common ancestors, e.g. in case of bidmerge.
+    cta = ca
+    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
+    _c1 = c1.p1() if c1.rev() is None else c1
+    _c2 = c2.p1() if c2.rev() is None else c2
+    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
+    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
+    graft = dirty_c1 or dirty_c2
+    if graft:
+        cta = _c1.ancestor(_c2)

Can we know if we're doing graft-type merge beforehand? I think
ca.descendant() isn't fast so it should be avoided for normal merges.

This has been argued repeatedly. Basically the only way you can know in advance that your 
merge is going to be graftlike is by  doing a few descendant() calls yourself. So, with 
the exception of the "hg merge" command (which is guaranteed to yield 
ungraftlike merges), all commands wishing to do a merge will have to run through this 
whole descendant game.

In the first few versions of the patches, I actually did graftlikeness 
detection in mergemod.graft(), but it was a nightmare to get it to work 
properly, and as it turns out, graft() isn't the only thing doing a graftlike 
merge. I was informed that calling descendant() once in a command is fine, it's 
only too slow for calling in a loop.

I don't think complicating the code, potentially missing some edge cases of unusual 
commands doing graftlike merges, and breaking extension compatibility is warranted for a 
few ms speedup of "hg merge".

Given how often you had to explain this, it is probably worth adding something about this in the inline comment about it.

     # find interesting file sets from manifests
+    if graft:
+        repo.ui.debug("  computing unmatched files in rotated DAG\n")
     addedinm1 = m1.filesnotin(ma)
     addedinm2 = m2.filesnotin(ma)
-    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
+    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
+    if not graft:
+        u1, u2 = _u1, _u2
+    else: # need to recompute this for directory move handling when
grafting
+        repo.ui.debug("  computing unmatched files in unrotated DAG\n")
+        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),
+                                                  m2.filesnotin(mta))
+
     bothnew = sorted(addedinm1 & addedinm2)

     for f in u1:
-        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)
+        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,
+                    fullcopy1, incomplete1, incompletediverge)

     for f in u2:
-        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)
+        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,
+                    fullcopy2, incomplete2, incompletediverge)

[snip]

+    # combine partial copy paths discovered in the previous step
+    copyfrom, copyto = incomplete1, incomplete2
+    if dirty_c1:
+        copyfrom, copyto = incomplete2, incomplete1
+    for f in copyfrom:
+        if f in copyto:
+            copy[copyto[f]] = copyfrom[f]
+            del copyto[f]
+    for f in incompletediverge:
+        assert f not in diverge
+        ic = incompletediverge[f]
+        if ic[0] in copyto:
+            diverge[f] = [copyto[ic[0]], ic[1]]

According to Matt's comment, we need two copy traces split at 'ca', but we
use ones split at 'cta' (and somewhat 'ca' is taken into account?), because it
wouldn't be easy to track copies backwards.

https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-
August/086915.html

I guess that are addressed these "incomplete" dicts and "if"s in
checkcopies(), but I wonder if there are unhandled cases such as non-linear
DAG to be rotated, which might include "ypoc". I'm not sure, though.

See the tests. Virtually every imaginable case is tested.

Matt actually wanted 3 copy traces, one between "cta" and "ca", one from one parent to "ca", and one from the other parent to 
"cta". The problem with this approach is that checkcopies can't just stop after going "behind" some cutoff revision, since it's operating in a 
low-level way in which "behind" doesn't really make sense. This is presumably for perf reasons. As a result, the checkcopies pass going from a parent to 
"ca" will actually go back to "cta", and find spurious copies. We could perhaps identify and remove those spurious copies by comparing the 
output to that of the ca->cta pass, but then we would need post-processing as complex as what this patch has to accomplish that, so we win nothing by going 
3-pass.

Okay this is the first time I read bout this 'behind' thing, I need to investigate that but that should probably extensibility documented as a comment.

--
Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Reply via email to