Re: [PATCH 04/30] directory rename detection: basic testcases

2017-11-13 Thread Junio C Hamano
Stefan Beller  writes:

> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:
>> Signed-off-by: Elijah Newren 
>> ...
>> +#  B
>> +#  o
>> +# / \
>> +#  A o   ?
>> +# \ /
>> +#  o
>> +#  C
>> + ...
>> +# Testcase 1a, Basic directory rename.
>> +#   Commit A: z/{b,c}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: z/{b,c,d,e/f}
>
> (minor thought:)
> After rereading the docs above this is clear; I wonder if instead of A, B, C
> a notation of Base, ours, theirs would be easier to understand?

I had a similar thought, but as long as everything in this file is
consistent, as we have that picture upfront, I am OK with it.  FWIW,
t1000 uses O (original--common ancestor) A and B, which was the
notation commonly used in our codebase since the early days when we
needed to call them with single letters.

>> +test_expect_success '1a-setup: Simple directory rename detection' '
>> +test_expect_failure '1a-check: Simple directory rename detection' '
>
> Thanks for splitting the setup and the check into two different test cases!
>
>
>> +   git checkout B^0 &&
>
> Any reason for ^0 ? (to make clear it is a branch?)

I think it is to make it clear that no matter what this test does
(or fails to do), the branch B is *not* affected by it because we'd
be playing on a detached head.

>> +test_expect_success '1b-setup: Merge a directory with another' '
>> +   git rm -rf . &&
>> +   git clean -fdqx &&
>> +   rm -rf .git &&
>> +   git init &&
>
> This is quite a strong statement to start a test with.

Yes.  If a test before this one did cd ../.. and forgot to come
back, we'd be in trouble.  If we want a fresh repository perhaps
test-create-repo inside the trash repository may be a less evil
option.


Re: [PATCH 04/30] directory rename detection: basic testcases

2017-11-13 Thread Elijah Newren
On Mon, Nov 13, 2017 at 5:21 PM, Stefan Beller  wrote:
> On Mon, Nov 13, 2017 at 4:57 PM, Elijah Newren  wrote:

> (slightly dreaming:)
> I wonder if we could teach our test suite to accept multiple test_done
> or restart_tests or such to resurrect the clean slate.

I'm dreaming now too; I would like that a lot more, although the
separate test_create_repo for each case isn't too bad and should be a
pretty mechanical switch.

 +   test 3 -eq $(git ls-files -s | wc -l) &&
>>>
>>> git ls-files >out &&
>>> test_line_count = 3 out &&
>>>

> I am not saying it was a cargo-culting reaction, but rather relaying
> a well discussed style issue to you. ;)

Ah, gotcha.

>> If you feel the return code of ls-files is important, perhaps I could
>> just have a separate
>>git ls-files -s >/dev/null &&
>> call before the others?
>
> I would prefer to not add any further calls; also (speaking generically)
> this would bring in potential racing issues (what if the second ls-files
> behaves differently than the first?)

Makes sense.

>> I'm not following.  The "old" and "new" in the filenames were just
>> there so a human reading the testcase could easily tell which
>> filenames were related and involved in renames.  There is no rename
>> associated with d, so why would I have called it "old" or "new"?
>
> because a user may be impressed by gits pattern matching in the
> rename detection:
>
>  A: z/{oldb,oldc}
>  B: z/{newb,newc}
>  C: z/{oldb, oldc, oldd}
>
> Obviously A->B is z/{old->new}-* (not a directory rename,
> but just patterns), so one might be tempted to expect newd
> as the expectation. But that is nonsense(!?)

Ah, now I see where you were going.  Thanks for explaining.

>> I think 2 is insanity.
>
> or the place where hooks/custom code shines. :)

:)


Re: [PATCH 04/30] directory rename detection: basic testcases

2017-11-13 Thread Stefan Beller
On Mon, Nov 13, 2017 at 4:57 PM, Elijah Newren  wrote:
> On Mon, Nov 13, 2017 at 2:04 PM, Stefan Beller  wrote:
>> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:
>> (minor thought:)
>> After rereading the docs above this is clear; I wonder if instead of A, B, C
>> a notation of Base, ours, theirs would be easier to understand?
>
> Sure, that'd be an easy change.
>
>>> +test_expect_success '1a-setup: Simple directory rename detection' '
>>> +test_expect_failure '1a-check: Simple directory rename detection' '
>>
>> Thanks for splitting the setup and the check into two different test cases!
>>
>>
>>> +   git checkout B^0 &&
>>
>> Any reason for ^0 ? (to make clear it is a branch?)
>
> Partially force-of-habit (did the same with t6036 and t6042), but it
> seemed to have the nice feature of making debugging easier through
> improved reproducability.  In particular, if I had checked out B
> rather than B^0 in the testcase and a merge succeeded when I didn't
> expect it, then attempting to run the same commands gives me a
> different starting point for the merge.  By instead explicitly
> checking out B^0, then even if the merge succeeds, someone who again
> runs checkout B^0 will have the same starting point.

Thanks for enlightening me. Makes sense.

>
>>> +test_expect_success '1b-setup: Merge a directory with another' '
>>> +   git rm -rf . &&
>>> +   git clean -fdqx &&
>>> +   rm -rf .git &&
>>> +   git init &&
>>
>> This is quite a strong statement to start a test with.
>
> It was actually copy-paste from t6036, and achieved two purposes:
>   1) Even if one test fails, subsequent ones continue running.  (Had
> lots of problems with this in t6036 years ago and just ended up with
> those four steps)
>   2) Someone who runs into a failing testcase has a _much_ easier time
> figuring out what is going on with the testcase.  I find it takes a
> fair amount of time to figure out what's going on with other tests in
> git's testsuite because of the presence of so many files completely
> unrelated to the given test, which have simply accumulated from
> previous tests.  For many tests, that may be fine, but in particular
> for t6036, t6042, and now t6043, since these are mostly about corner
> cases that are hard enough to reason about, I didn't want the extra
> distractions.

I agree with these reasons to strongly want a clean slate.

>> Nowadays we rather do
>>
>> test_when_finished "some command" &&
>>
>> than polluting the follow up tests. But as you split up the previous test
>> into 2 tests, it is not clear if this would bring any good.
>
> test_when_finished looks pretty cool; I didn't know about it.  Thanks
> for the pointer.  Not sure if we want to use it here (if we do, we'd
> only do so in the check tests rather than the setup ones).
>
>> Also these are four cleanup commands, I'd have expected fewer.
>> Maybe just "rm -rf ." ? Or as we make a new repo for each test case,
>>
>> test_create_repo 1a &&
>> cd 1a
>>
>> in the first setup, and here we do
>> test_create_repo 1b &&
>> cd 1b
>>
>> relying on test_done to cleanup everything afterwards?
>
> rm aborts for me with 'rm -rf .', but I could possibly make it 'rm -rf
> * .* && git init .'
>
> The test_create_repo might not be so bad as long as every test used it
> and put all files under it's own little repo.

That is my current favorite, I would think.


>  It does create some
> clutter, but it's at least somewhat managed.

But the clutter is outside the repository under test, which
may help with the situation.

> I'm still a bit partial
> to just totally cleaning it out, but if others would prefer, I can
> switch.

(slightly dreaming:)
I wonder if we could teach our test suite to accept multiple test_done
or restart_tests or such to resurrect the clean slate.

>
>>> +   test 3 -eq $(git ls-files -s | wc -l) &&
>>
>> git ls-files >out &&
>> test_line_count = 3 out &&
>>
>> maybe? Piping output of git commands somewhere is an
>> anti-pattern as we cannot examine the return code of ls-files in this case.
>
> Um...I guess that makes sense, if you assumed that I cared about the
> return code of ls-files.

As this is the test suite, we care about the return code of any git
command, for current git as well as future-gits.

>  But it seems to make the tests with multiple
> calls to ls-files in a row (of which there are many) considerably
> uglier, so I'd personally prefer to avoid that.  Also, why would I
> care about the return code of ls-files?

While I understand the rationale here and your examination of ls-files
seems to indicate that we are ok doing it here, a lot of (test) code
is taken for inspiration (copied, adapted) to other test cases.
And most of the time we actually care if the return code is correct
additionally to the actions performed, so I was shooting from the hip
here.

> Your suggestion made me curious, so I went looking.  As far 

Re: [PATCH 04/30] directory rename detection: basic testcases

2017-11-13 Thread Elijah Newren
On Mon, Nov 13, 2017 at 2:04 PM, Stefan Beller  wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:
> (minor thought:)
> After rereading the docs above this is clear; I wonder if instead of A, B, C
> a notation of Base, ours, theirs would be easier to understand?

Sure, that'd be an easy change.

>> +test_expect_success '1a-setup: Simple directory rename detection' '
>> +test_expect_failure '1a-check: Simple directory rename detection' '
>
> Thanks for splitting the setup and the check into two different test cases!
>
>
>> +   git checkout B^0 &&
>
> Any reason for ^0 ? (to make clear it is a branch?)

Partially force-of-habit (did the same with t6036 and t6042), but it
seemed to have the nice feature of making debugging easier through
improved reproducability.  In particular, if I had checked out B
rather than B^0 in the testcase and a merge succeeded when I didn't
expect it, then attempting to run the same commands gives me a
different starting point for the merge.  By instead explicitly
checking out B^0, then even if the merge succeeds, someone who again
runs checkout B^0 will have the same starting point.

>> +test_expect_success '1b-setup: Merge a directory with another' '
>> +   git rm -rf . &&
>> +   git clean -fdqx &&
>> +   rm -rf .git &&
>> +   git init &&
>
> This is quite a strong statement to start a test with.

It was actually copy-paste from t6036, and achieved two purposes:
  1) Even if one test fails, subsequent ones continue running.  (Had
lots of problems with this in t6036 years ago and just ended up with
those four steps)
  2) Someone who runs into a failing testcase has a _much_ easier time
figuring out what is going on with the testcase.  I find it takes a
fair amount of time to figure out what's going on with other tests in
git's testsuite because of the presence of so many files completely
unrelated to the given test, which have simply accumulated from
previous tests.  For many tests, that may be fine, but in particular
for t6036, t6042, and now t6043, since these are mostly about corner
cases that are hard enough to reason about, I didn't want the extra
distractions.

but...

> Nowadays we rather do
>
> test_when_finished "some command" &&
>
> than polluting the follow up tests. But as you split up the previous test
> into 2 tests, it is not clear if this would bring any good.

test_when_finished looks pretty cool; I didn't know about it.  Thanks
for the pointer.  Not sure if we want to use it here (if we do, we'd
only do so in the check tests rather than the setup ones).

> Also these are four cleanup commands, I'd have expected fewer.
> Maybe just "rm -rf ." ? Or as we make a new repo for each test case,
>
> test_create_repo 1a &&
> cd 1a
>
> in the first setup, and here we do
> test_create_repo 1b &&
> cd 1b
>
> relying on test_done to cleanup everything afterwards?

rm aborts for me with 'rm -rf .', but I could possibly make it 'rm -rf
* .* && git init .'

The test_create_repo might not be so bad as long as every test used it
and put all files under it's own little repo.  It does create some
clutter, but it's at least somewhat managed.  I'm still a bit partial
to just totally cleaning it out, but if others would prefer, I can
switch.

>> +   test 3 -eq $(git ls-files -s | wc -l) &&
>
> git ls-files >out &&
> test_line_count = 3 out &&
>
> maybe? Piping output of git commands somewhere is an
> anti-pattern as we cannot examine the return code of ls-files in this case.

Um...I guess that makes sense, if you assumed that I cared about the
return code of ls-files.  But it seems to make the tests with multiple
calls to ls-files in a row (of which there are many) considerably
uglier, so I'd personally prefer to avoid that.  Also, why would I
care about the return code of ls-files?

Your suggestion made me curious, so I went looking.  As far as I can
tell, the return code of ls-files is always 0 unless you both specify
both --error-unmatch and one or more paths, neither of which I did.
Am I missing something?

If you feel the return code of ls-files is important, perhaps I could
just have a separate
   git ls-files -s >/dev/null &&
call before the others?

>> +   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
>> +   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
>> +   test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&
>
> Speaking of these, there are quite a lot of invocations of rev-parse,
> though it looks clean; recently Junio had some good ideas regarding a
> similar test[1].
> So maybe
>
>   git rev-parse >actual \
> HEAD:y/b  HEAD:y/c HEAD:y/d &&
>   git rev-parse >expect \
> A:z/bA:z/cA:x/d &&
>   test_cmp expect actual
>
> Not sure if that is any nicer, but has fewer calls.

Sure, I can switch it over.

>> +   test_must_fail git rev-parse HEAD:x/d &&
>> +   test_must_fail git rev-parse HEAD:z/d &&
>> + 

Re: [PATCH 04/30] directory rename detection: basic testcases

2017-11-13 Thread Stefan Beller
On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren  wrote:
> Signed-off-by: Elijah Newren 
> ---
>  t/t6043-merge-rename-directories.sh | 391 
> 
>  1 file changed, 391 insertions(+)
>  create mode 100755 t/t6043-merge-rename-directories.sh
>
> diff --git a/t/t6043-merge-rename-directories.sh 
> b/t/t6043-merge-rename-directories.sh
> new file mode 100755
> index 00..b737b0a105
> --- /dev/null
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -0,0 +1,391 @@
> +#!/bin/sh
> +
> +test_description="recursive merge with directory renames"
> +# includes checking of many corner cases, with a similar methodology to:
> +#   t6042: corner cases with renames but not criss-cross merges
> +#   t6036: corner cases with both renames and criss-cross merges
> +#
> +# The setup for all of them, pictorially, is:
> +#
> +#  B
> +#  o
> +# / \
> +#  A o   ?
> +# \ /
> +#  o
> +#  C
> +#
> +# To help make it easier to follow the flow of tests, they have been
> +# divided into sections and each test will start with a quick explanation
> +# of what commits A, B, and C contain.
> +#
> +# Notation:
> +#z/{b,c}   means  files z/b and z/c both exist
> +#x/d_1 means  file x/d exists with content d1.  (Purpose of the
> +# underscore notation is to differentiate different
> +# files that might be renamed into each other's paths.)
> +
> +. ./test-lib.sh
> +
> +
> +###
> +# SECTION 1: Basic cases we should be able to handle
> +###
> +
> +# Testcase 1a, Basic directory rename.
> +#   Commit A: z/{b,c}
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d,e/f}

(minor thought:)
After rereading the docs above this is clear; I wonder if instead of A, B, C
a notation of Base, ours, theirs would be easier to understand?


> +test_expect_success '1a-setup: Simple directory rename detection' '
> +test_expect_failure '1a-check: Simple directory rename detection' '

Thanks for splitting the setup and the check into two different test cases!


> +   git checkout B^0 &&

Any reason for ^0 ? (to make clear it is a branch?)

> +# Testcase 1b, Merge a directory with another
> +#   Commit A: z/{b,c},   y/d
> +#   Commit B: z/{b,c,e}, y/d
> +#   Commit C: y/{b,c,d}
> +#   Expected: y/{b,c,d,e}
> +
> +test_expect_success '1b-setup: Merge a directory with another' '
> +   git rm -rf . &&
> +   git clean -fdqx &&
> +   rm -rf .git &&
> +   git init &&

This is quite a strong statement to start a test with.
Nowadays we rather do

test_when_finished "some command" &&

than polluting the follow up tests. But as you split up the previous test
into 2 tests, it is not clear if this would bring any good.

Also these are four cleanup commands, I'd have expected fewer.
Maybe just "rm -rf ." ? Or as we make a new repo for each test case,

test_create_repo 1a &&
cd 1a

in the first setup, and here we do
test_create_repo 1b &&
cd 1b

relying on test_done to cleanup everything afterwards?


> +# Testcase 1c, Transitive renaming
> +#   (Related to testcases 3a and 6d -- when should a transitive rename 
> apply?)
> +#   (Related to testcases 9c and 9d -- can transitivity repeat?)
> +#   Commit A: z/{b,c},   x/d
> +#   Commit B: y/{b,c},   x/d
> +#   Commit C: z/{b,c,d}

So B is "Rename z to y" and C is "Move x/d to z/d"

> +#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)

This is an excellent expectation for a clean case like this.
I have not reached 3, 9 yet, so I'll remember these questions.

> +test_expect_success '1c-setup: Transitive renaming' '
> +   git rm -rf . &&
> +   git clean -fdqx &&
> +   rm -rf .git &&
> +   git init &&
> +
> +   mkdir z &&
> +   echo b >z/b &&
> +   echo c >z/c &&
> +   mkdir x &&
> +   echo d >x/d &&
> +   git add z x &&
> +   test_tick &&
> +   git commit -m "A" &&
> +
> +   git branch A &&
> +   git branch B &&
> +   git branch C &&
> +
> +   git checkout B &&
> +   git mv z y &&
> +   test_tick &&
> +   git commit -m "B" &&
> +
> +   git checkout C &&
> +   git mv x/d z/d &&
> +   test_tick &&
> +   git commit -m "C"
> +'
> +
> +test_expect_failure '1c-check: Transitive renaming' '
> +   git checkout B^0 &&
> +
> +   git merge -s recursive C^0 &&
> +
> +   test 3 -eq $(git ls-files -s | wc -l) &&

git ls-files >out &&
test_line_count = 3 out &&

maybe? Piping output of git commands somewhere is an
anti-pattern as we cannot examine the return code of ls-files in this case.

> +   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
> +   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
> +   test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&

Speaking of 

[PATCH 04/30] directory rename detection: basic testcases

2017-11-10 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 391 
 1 file changed, 391 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 00..b737b0a105
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,391 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#  B
+#  o
+# / \
+#  A o   ?
+# \ /
+#  o
+#  C
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits A, B, and C contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Basic cases we should be able to handle
+###
+
+# Testcase 1a, Basic directory rename.
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   echo d >z/d &&
+   mkdir z/e &&
+   echo f >z/e/f &&
+   git add z/d z/e/f &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 4 -eq $(git ls-files -s | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:y/d) = $(git rev-parse C:z/d) &&
+   test "$(git hash-object y/d)" = $(git rev-parse C:z/d) &&
+   test $(git rev-parse HEAD:y/e/f) = $(git rev-parse C:z/e/f) &&
+   test_must_fail git rev-parse HEAD:z/d &&
+   test_must_fail git rev-parse HEAD:z/e/f &&
+   test ! -d z/d &&
+   test ! -d z/e/f
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit A: z/{b,c},   y/d
+#   Commit B: z/{b,c,e}, y/d
+#   Commit C: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+   git branch C &&
+
+   git checkout B &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "B" &&
+
+   git checkout C &&
+   git mv z/b y &&
+   git mv z/c y &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "C"
+'
+
+test_expect_failure '1b-check: Merge a directory with another' '
+   git checkout B^0 &&
+
+   git merge -s recursive C^0 &&
+
+   test 4 -eq $(git ls-files -s | wc -l) &&
+
+   test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+   test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+   test $(git rev-parse HEAD:y/d) = $(git rev-parse A:y/d) &&
+   test $(git rev-parse HEAD:y/e) = $(git rev-parse B:z/e) &&
+   test_must_fail git rev-parse HEAD:z/e
+'
+
+# Testcase 1c, Transitive renaming
+#   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
+#   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   Commit A: z/{b,c},   x/d
+#   Commit B: y/{b,c},   x/d
+#   Commit C: z/{b,c,d}
+#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)
+
+test_expect_success '1c-setup: Transitive renaming' '
+   git rm -rf . &&
+   git clean -fdqx &&
+   rm -rf .git &&
+   git init &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   echo d >x/d &&
+   git add z x &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git branch A &&
+   git branch B &&
+