Re: [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges

2017-11-14 Thread Elijah Newren
On Tue, Nov 14, 2017 at 12:33 PM, Stefan Beller  wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:

>> +# Possible Resolutions:
>> +#   Previous git: y/{a,b,f},   z/{c,d},   x/e
>> +#   Expected: y/{a,b,e,f}, z/{c,d}
>> +#   Preferred:y/{a,b,e},   z/{c,d,f}
>
> it might be tricky in the future to know what "previous git" is;
> "Previous git" means without directory renames enabled;
>
> "expected" means we expect the algorithm presented in this series to produce
> this output, preferred is what we actually expect.

Yes, how about using:
  "Without dir rename detection:"
  "Currently expected:"
and
  "Optimal:"
?

>> +# Testcase 8b, Dual-directory rename, one into the others' way, with 
>> conflicting filenames
>> +#   Commit A. x/{a_1,b_1}, y/{a_2,b_2}
>> +#   Commit B. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
>> +#   Commit C. y/{a_1,b_1}, z/{a_2,b_2}
>> +#
>> +# Possible Resolutions:
>> +#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
>> +#   Scary:y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. 
>> e_2)
>> +#   Preferred:y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
>
> It may be common to have sub directories with the same path having different
> blobs, e.g. when having say multiple hardware configurations in different sub
> directories configured. Then renaming becomes a pain when they overlap.

Sure, agreed.  Although, the one nice thing about this particular
testcase is that despite showing suboptimal merge behavior, it's at
least the exact same suboptimal behavior as before when we didn't have
directory rename detection.

>> +# moves directories.  Implment directory rename detection suboptimally, and
>
> Implement

Thanks.

> ok, so add "Expected" as well? (repeating "Previous git", or so?)

Yeah, I should make that more explicit.

>> +# Testcase 8d, rename/delete...or not?
>> +#   (Related to testcase 5b; these may appear slightly inconsistent to 
>> users;
>> +#Also related to testcases 7d and 7e)
>
>> +#   Commit A: z/{b,c,d}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: z/{b,c,d,e}
>> +#   Expected: y/{b,c,e}
>
> Why this?
> * d is deleted in B and not found in the result
> * the rename detection also worked well in z->y  for adding e
>
> I do not see the confusion, yet.

Um...yaay?  If you don't see it as confusing, then maybe others don't?
 I was wondering if folks would expect a rename/delete conflict (x/d
either deleted or renamed to y/d via directory rename detection), and
be annoyed if the merge succeeded and didn't even give so much as a
warning about what happened to 'd'.

>> +#   In this case, I'm leaning towards: commit B was the one that deleted z/d
>> +#   and it did the rename of z to y, so the two "conflicts" (rename vs.
>> +#   delete) are both coming from commit B, which is non-sensical.  Conflicts
>> +#   during merging are supposed to be about opposite sides doing things
>> +#   differently.
>
>   "Sensical has not yet become an "official" word in the English language, 
> which
>   would be why you can't use it. Nonsense is a word, therefore nonsensical can
>   used to describe something of nonsense. However, sense has different 
> meanings
>   and doesn't have an adjective for something of sense"
>
> from https://english.stackexchange.com/questions/38582/antonym-of-nonsensical
> I don't mind it, the spell checker just made me go on a detour. Maybe 
> illogical?

Illogical works for me.

>> +# Testcase 8e, Both sides rename, one side adds to original directory
>> +#   Commit A: z/{b,c}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: w/{b,c}, z/d
>> +#
>> +# Possible Resolutions:
>> +#   Previous git: z/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c 
>> vs. w/c)
>> +#   Expected: y/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c 
>> vs. w/c)
>> +#   Preferred:??
>> +#
>> +# Notes: In commit B, directory z got renamed to y.  In commit C, directory 
>> z
>> +#did NOT get renamed; the directory is still present; instead it is
>> +#considered to have just renamed a subset of paths in directory z
>> +#elsewhere.  Therefore, the directory rename done in commit B to z/
>> +#applies to z/d and maps it to y/d.
>> +#
>> +#It's possible that users would get confused about this, but what
>> +#should we do instead?   Silently leaving at z/d seems just as bad 
>> or
>> +#maybe even worse.  Perhaps we could print a big warning about z/d
>> +#and how we're moving to y/d in this case, but when I started 
>> thinking
>> +#abouty the ramifications of doing that, I didn't know how to rule 
>> out
>> +#that opening other weird edge and corner cases so I just punted.
>
> s/about/abouty

I think you mean the other direction?  Thanks for catching, I'll fix that up.

> It sort of makes sense from a users POV.

I'm afraid I'm unsure what the antecedent of "It" is here.  (Are you
just saying that my rationale for what I listed as "Expected" makes
sense, or something else?)


Re: [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges

2017-11-14 Thread Stefan Beller
On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:
> Signed-off-by: Elijah Newren 
> ---
>  t/t6043-merge-rename-directories.sh | 371 
> 
>  1 file changed, 371 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh 
> b/t/t6043-merge-rename-directories.sh
> index 115d0d2622..bdfd943c88 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -1683,4 +1683,375 @@ test_expect_failure '7e-check: transitive rename in 
> rename/delete AND dirs in th
> test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
>  '
>
> +
> +###
> +# SECTION 8: Suboptimal merges
> +#
> +# As alluded to in the last section, the ruleset we have built up for
> +# detecting directory renames unfortunately has some special cases where it
> +# results in slightly suboptimal or non-intuitive behavior.  This section
> +# explores these cases.
> +#
> +# To be fair, we already had non-intuitive or suboptimal behavior for most
> +# of these cases in git before introducing implicit directory rename
> +# detection, but it'd be nice if there was a modified ruleset out there
> +# that handled these cases a bit better.
> +###
> +
> +# Testcase 8a, Dual-directory rename, one into the others' way
> +#   Commit A. x/{a,b},   y/{c,d}
> +#   Commit B. x/{a,b,e}, y/{c,d,f}
> +#   Commit C. y/{a,b},   z/{c,d}
> +#
> +# Possible Resolutions:
> +#   Previous git: y/{a,b,f},   z/{c,d},   x/e
> +#   Expected: y/{a,b,e,f}, z/{c,d}
> +#   Preferred:y/{a,b,e},   z/{c,d,f}

it might be tricky in the future to know what "previous git" is;
"Previous git" means without directory renames enabled;

"expected" means we expect the algorithm presented in this series to produce
this output, preferred is what we actually expect.


> +#
> +# Note: Both x and y got renamed and it'd be nice to detect both, and we do
> +# better with directory rename detection than git did previously, but the
> +# simple rule from section 5 prevents me from handling this as optimally as
> +# we potentially could.

which were:

   If a subset of to-be-renamed files have a file or directory in the way,
   "turn off" the directory rename for those specific sub-paths, falling
   back to old handling.  But, sadly, see testcases 8a and 8b.

The tricky part is y in this example as x,y "swapped" its content in C,
and moved 'old y content' to the new z/.

Makes sense, but I agree it might be painful to debug such a case
from a users point of view.

> +
> +# Testcase 8b, Dual-directory rename, one into the others' way, with 
> conflicting filenames
> +#   Commit A. x/{a_1,b_1}, y/{a_2,b_2}
> +#   Commit B. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
> +#   Commit C. y/{a_1,b_1}, z/{a_2,b_2}
> +#
> +# Possible Resolutions:
> +#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
> +#   Scary:y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. 
> e_2)
> +#   Preferred:y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}

It may be common to have sub directories with the same path having different
blobs, e.g. when having say multiple hardware configurations in different sub
directories configured. Then renaming becomes a pain when they overlap.

> +# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x 
> and
> +# y, both are named 'e'.  Without directory rename detection, neither file
> +# moves directories.  Implment directory rename detection suboptimally, and

Implement

> +# you get an add/add conflict, but both files were added in commit B, so this
> +# is an add/add conflict where one side of history added both files --
> +# something we can't represent in the index.  Obviously, we'd prefer the last
> +# resolution, but our previous rules are too coarse to allow it.  Using both
> +# the rules from section 4 and section 5 save us from the Scary resolution,
> +# making us fall back to pre-directory-rename-detection behavior for both
> +# e_1 and e_2.

ok, so add "Expected" as well? (repeating "Previous git", or so?)

> +
> +# Testcase 8c, rename+modify/delete
> +#   (Related to testcases 5b and 8d)
> +#   Commit A: z/{b,c,d}
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d_modified,e}
> +#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: x/d -> y/d or 
> deleted)
> +#
> +#   Note: This testcase doesn't present any concerns for me...until you
> +# compare it with testcases 5b and 8d.  See notes in 8d for more
> +# details.

Makes sense.

> +# Testcase 8d, rename/delete...or not?
> +#   (Related to testcase 5b; these may appear slightly inconsistent to users;
> +#Also related to testcases 7d and 7e)

> +#   Commit A: z/{b,c,d}
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d,e}
> +#   Expected: y/{b,c,e}

Why this?
* d is deleted in B and not found in the result
* the rename detection also worked well in z->y

[PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges

2017-11-10 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 371 
 1 file changed, 371 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 115d0d2622..bdfd943c88 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1683,4 +1683,375 @@ test_expect_failure '7e-check: transitive rename in 
rename/delete AND dirs in th
test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
 '
 
+
+###
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit A. x/{a,b},   y/{c,d}
+#   Commit B. x/{a,b,e}, y/{c,d,f}
+#   Commit C. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   Previous git: y/{a,b,f},   z/{c,d},   x/e
+#   Expected: y/{a,b,e,f}, z/{c,d}
+#   Preferred:y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did previously, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' 
'
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a >x/a &&
+   echo b >x/b &&
+   echo c >y/c &&
+   echo d >y/d &&
+   git add x y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e >x/e &&
+   echo f >y/f &&
+   git add x/e y/f &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv y z &&
+   git mv x y &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' 
'
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 6 -eq $(git ls-files -s | wc -l) &&
+   test 0 -eq $(git ls-files -u | wc -l) &&
+   test 0 -eq $(git ls-files -o | wc -l) &&
+
+   test $(git rev-parse HEAD:y/a) = $(git rev-parse A:x/a) &&
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:x/b) &&
+   test $(git rev-parse HEAD:y/e) = $(git rev-parse B:x/e) &&
+   test $(git rev-parse HEAD:y/f) = $(git rev-parse B:y/f) &&
+   test $(git rev-parse HEAD:z/c) = $(git rev-parse A:y/c) &&
+   test $(git rev-parse HEAD:z/d) = $(git rev-parse A:y/d)
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with 
conflicting filenames
+#   Commit A. x/{a_1,b_1}, y/{a_2,b_2}
+#   Commit B. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit C. y/{a_1,b_1}, z/{a_2,b_2}
+#
+# Possible Resolutions:
+#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Scary:y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
+#   Preferred:y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implment directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit B, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 save us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior for both
+# e_1 and e_2.
+
+test_expect_success '8b-setup: Dual-directory rename, one into the others way, 
with conflicting filenames' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a1 >x/a &&
+   echo b1 >x/b &&
+   echo a2 >y/a &&
+   echo b2 >y/b &&
+   git add x y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e1 >x/e &&
+   echo e2 >y/e &&
+   git add x