Re: [PATCH] git-rebase.txt: update note about directory rename detection and am

2018-12-07 Thread Elijah Newren
On Fri, Dec 7, 2018 at 9:51 AM Johannes Sixt  wrote:
>
> From: Elijah Newren 
>
> In commit 6aba117d5cf7 ("am: avoid directory rename detection when
> calling recursive merge machinery", 2018-08-29), the git-rebase manpage
> probably should have also been updated to note the stronger
> incompatibility between git-am and directory rename detection.  Update
> it now.
>
> Signed-off-by: Elijah Newren 
> Signed-off-by: Johannes Sixt 
> ---
>   Documentation/git-rebase.txt | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
> index 41631df6e4..7bea21e8e3 100644
> --- a/Documentation/git-rebase.txt
> +++ b/Documentation/git-rebase.txt
> @@ -569,8 +569,9 @@ it to keep commits that started empty.
>   Directory rename detection
>   ~~~~~~
>
> -The merge and interactive backends work fine with
> -directory rename detection.  The am backend sometimes does not.
> +Directory rename heuristics are enabled in the merge and interactive
> +backends.  Due to the lack of accurate tree information, directory
> +rename detection is disabled in the am backend.
>
>   include::merge-strategies.txt[]

I was intending to send this out the past couple days, was just kinda
busy.  Thanks for handling it for me.


[PATCH] git-rebase.txt: update note about directory rename detection and am

2018-12-07 Thread Johannes Sixt

From: Elijah Newren 

In commit 6aba117d5cf7 ("am: avoid directory rename detection when
calling recursive merge machinery", 2018-08-29), the git-rebase manpage
probably should have also been updated to note the stronger
incompatibility between git-am and directory rename detection.  Update
it now.

Signed-off-by: Elijah Newren 
Signed-off-by: Johannes Sixt 
---
 Documentation/git-rebase.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 41631df6e4..7bea21e8e3 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -569,8 +569,9 @@ it to keep commits that started empty.
 Directory rename detection
 ~~

-The merge and interactive backends work fine with
-directory rename detection.  The am backend sometimes does not.
+Directory rename heuristics are enabled in the merge and interactive
+backends.  Due to the lack of accurate tree information, directory
+rename detection is disabled in the am backend.

 include::merge-strategies.txt[]

--
2.19.1.1133.g2dd3d172d2


Re: [PATCH 2/3] merge-recursive: add ability to turn off directory rename detection

2018-08-29 Thread Elijah Newren
On Wed, Aug 29, 2018 at 5:54 AM Johannes Schindelin
 wrote:
>
> Hi Elijah,
>
> On Wed, 29 Aug 2018, Elijah Newren wrote:
>
> > Signed-off-by: Elijah Newren 
> > ---
> >  merge-recursive.c | 18 +-
> >  merge-recursive.h |  1 +
> >  2 files changed, 14 insertions(+), 5 deletions(-)
> >
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > index f110e1c5ec..bf3cb03d3a 100644
> > --- a/merge-recursive.c
> > +++ b/merge-recursive.c
> > @@ -2843,12 +2843,19 @@ static int handle_renames(struct merge_options *o,
> >   head_pairs = get_diffpairs(o, common, head);
> >   merge_pairs = get_diffpairs(o, common, merge);
> >
> > - dir_re_head = get_directory_renames(head_pairs, head);
> > - dir_re_merge = get_directory_renames(merge_pairs, merge);
> > + if (o->detect_directory_renames) {
> > + dir_re_head = get_directory_renames(head_pairs, head);
> > + dir_re_merge = get_directory_renames(merge_pairs, merge);
> >
> > - handle_directory_level_conflicts(o,
> > -  dir_re_head, head,
> > -  dir_re_merge, merge);
> > + handle_directory_level_conflicts(o,
> > +  dir_re_head, head,
> > +  dir_re_merge, merge);
> > + } else {
> > + dir_re_head  = xmalloc(sizeof(*dir_re_head));
> > + dir_re_merge = xmalloc(sizeof(*dir_re_merge));
>
> This is not a suggestion to change anything, but a genuine question out of
> curiosity: would it make sense to put the `dir_re_head` and `dir_re_merge`
> structures into `struct merge_options` to avoid these extra `malloc()`s?
> Or would that cause issues with the recursive nature of the recursive
> merge?

That would work to avoid the extra `malloc()`s, and be inline with the
current usage of merge_options.  However, I'm not sure I like the
current usage of merge_options. That struct is supposed to be public
API, but it's got a lot of private internal-only use stuff (and
putting dir_re_head and dir_re_merge there would add more).  I'm
tempted to go the other way and eject some of the other internal-only
stuff from merge_options (or wrap it inside an opaque struct
merge_options_internal* internal field, or something like that).


Re: [PATCH 2/3] merge-recursive: add ability to turn off directory rename detection

2018-08-29 Thread Johannes Schindelin
Hi Elijah,

On Wed, 29 Aug 2018, Elijah Newren wrote:

> Signed-off-by: Elijah Newren 
> ---
>  merge-recursive.c | 18 +-
>  merge-recursive.h |  1 +
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/merge-recursive.c b/merge-recursive.c
> index f110e1c5ec..bf3cb03d3a 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -2843,12 +2843,19 @@ static int handle_renames(struct merge_options *o,
>   head_pairs = get_diffpairs(o, common, head);
>   merge_pairs = get_diffpairs(o, common, merge);
>  
> - dir_re_head = get_directory_renames(head_pairs, head);
> - dir_re_merge = get_directory_renames(merge_pairs, merge);
> + if (o->detect_directory_renames) {
> + dir_re_head = get_directory_renames(head_pairs, head);
> + dir_re_merge = get_directory_renames(merge_pairs, merge);
>  
> - handle_directory_level_conflicts(o,
> -  dir_re_head, head,
> -  dir_re_merge, merge);
> + handle_directory_level_conflicts(o,
> +  dir_re_head, head,
> +  dir_re_merge, merge);
> + } else {
> + dir_re_head  = xmalloc(sizeof(*dir_re_head));
> + dir_re_merge = xmalloc(sizeof(*dir_re_merge));

This is not a suggestion to change anything, but a genuine question out of
curiosity: would it make sense to put the `dir_re_head` and `dir_re_merge`
structures into `struct merge_options` to avoid these extra `malloc()`s?
Or would that cause issues with the recursive nature of the recursive
merge?

Ciao,
Dscho

> + dir_rename_init(dir_re_head);
> + dir_rename_init(dir_re_merge);
> + }
>  
>   ri->head_renames  = get_renames(o, head_pairs,
>   dir_re_merge, dir_re_head, head,
> @@ -3541,6 +3548,7 @@ void init_merge_options(struct merge_options *o)
>   o->renormalize = 0;
>   o->diff_detect_rename = -1;
>   o->merge_detect_rename = -1;
> + o->detect_directory_renames = 1;
>   merge_recursive_config(o);
>   merge_verbosity = getenv("GIT_MERGE_VERBOSITY");
>   if (merge_verbosity)
> diff --git a/merge-recursive.h b/merge-recursive.h
> index fa7bc6b683..e39ee5d78b 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -18,6 +18,7 @@ struct merge_options {
>   unsigned renormalize : 1;
>   long xdl_opts;
>   int verbosity;
> + int detect_directory_renames;
>   int diff_detect_rename;
>   int merge_detect_rename;
>   int diff_rename_limit;
> -- 
> 2.18.0.12.g97a29da30a
> 
> 


Re: [PATCH 3/3] am: avoid directory rename detection when calling recursive merge machinery

2018-08-29 Thread Johannes Schindelin
Hi Elijah,

On Wed, 29 Aug 2018, Elijah Newren wrote:

> Let's say you have the following three trees, where Base is from one commit
> behind either master or branch:
> 
>Base  : bar_v1, foo/{file1, file2, file3}
>branch: bar_v2, foo/{file1, file2},   goo/file3
>master: bar_v3, foo/{file1, file2, file3}
> 
> Using git-am (or am-based rebase) to apply the changes from branch onto
> master results in the following tree:
> 
>Result: bar_merged, goo/{file1, file2, file3}
> 
> This is not what users want; they did not rename foo/ -> goo/, they only
> renamed one file within that directory.  The reason this happens is am
> constructs fake trees (via build_fake_ancestor()) of the following form:
> 
>Base_bfa  : bar_v1, foo/file3
>branch_bfa: bar_v2, goo/file3
> 
> Combining these two trees with master's tree:
> 
>master: bar_v3, foo/{file1, file2, file3},
> 
> You can see that merge_recursive_generic() would see branch_bfa as renaming
> foo/ -> goo/, and master as just adding both foo/file1 and foo/file2.  As
> such, it ends up with goo/{file1, file2, file3}
> 
> The core problem is that am does not have access to the original trees; it
> can only construct trees using the blobs involved in the patch.  As such,
> it is not safe to perform directory rename detection within am -3.

I read through all three patches, and they look fine to me!

Ciao,
Dscho

> 
> Signed-off-by: Elijah Newren 
> ---
>  builtin/am.c| 1 +
>  t/t3401-rebase-and-am-rename.sh | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/am.c b/builtin/am.c
> index 2fc2d1e82c..1494a9be84 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1596,6 +1596,7 @@ static int fall_back_threeway(const struct am_state 
> *state, const char *index_pa
>   o.branch1 = "HEAD";
>   their_tree_name = xstrfmt("%.*s", linelen(state->msg), state->msg);
>   o.branch2 = their_tree_name;
> + o.detect_directory_renames = 0;
>  
>   if (state->quiet)
>   o.verbosity = 0;
> diff --git a/t/t3401-rebase-and-am-rename.sh b/t/t3401-rebase-and-am-rename.sh
> index a87df9e675..94bdfbd69c 100755
> --- a/t/t3401-rebase-and-am-rename.sh
> +++ b/t/t3401-rebase-and-am-rename.sh
> @@ -152,7 +152,7 @@ test_expect_success 'rebase --interactive: NO directory 
> rename' '
>   )
>  '
>  
> -test_expect_failure 'rebase (am): NO directory rename' '
> +test_expect_success 'rebase (am): NO directory rename' '
>   test_when_finished "git -C no-dir-rename rebase --abort" &&
>   (
>   cd no-dir-rename &&
> @@ -190,7 +190,7 @@ test_expect_success 'rebase --merge: NO directory rename' 
> '
>   )
>  '
>  
> -test_expect_failure 'am: NO directory rename' '
> +test_expect_success 'am: NO directory rename' '
>   test_when_finished "git -C no-dir-rename am --abort" &&
>   (
>   cd no-dir-rename &&
> -- 
> 2.18.0.12.g97a29da30a
> 
> 


[PATCH 3/3] am: avoid directory rename detection when calling recursive merge machinery

2018-08-29 Thread Elijah Newren
Let's say you have the following three trees, where Base is from one commit
behind either master or branch:

   Base  : bar_v1, foo/{file1, file2, file3}
   branch: bar_v2, foo/{file1, file2},   goo/file3
   master: bar_v3, foo/{file1, file2, file3}

Using git-am (or am-based rebase) to apply the changes from branch onto
master results in the following tree:

   Result: bar_merged, goo/{file1, file2, file3}

This is not what users want; they did not rename foo/ -> goo/, they only
renamed one file within that directory.  The reason this happens is am
constructs fake trees (via build_fake_ancestor()) of the following form:

   Base_bfa  : bar_v1, foo/file3
   branch_bfa: bar_v2, goo/file3

Combining these two trees with master's tree:

   master: bar_v3, foo/{file1, file2, file3},

You can see that merge_recursive_generic() would see branch_bfa as renaming
foo/ -> goo/, and master as just adding both foo/file1 and foo/file2.  As
such, it ends up with goo/{file1, file2, file3}

The core problem is that am does not have access to the original trees; it
can only construct trees using the blobs involved in the patch.  As such,
it is not safe to perform directory rename detection within am -3.

Signed-off-by: Elijah Newren 
---
 builtin/am.c| 1 +
 t/t3401-rebase-and-am-rename.sh | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 2fc2d1e82c..1494a9be84 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1596,6 +1596,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
o.branch1 = "HEAD";
their_tree_name = xstrfmt("%.*s", linelen(state->msg), state->msg);
o.branch2 = their_tree_name;
+   o.detect_directory_renames = 0;
 
if (state->quiet)
o.verbosity = 0;
diff --git a/t/t3401-rebase-and-am-rename.sh b/t/t3401-rebase-and-am-rename.sh
index a87df9e675..94bdfbd69c 100755
--- a/t/t3401-rebase-and-am-rename.sh
+++ b/t/t3401-rebase-and-am-rename.sh
@@ -152,7 +152,7 @@ test_expect_success 'rebase --interactive: NO directory 
rename' '
)
 '
 
-test_expect_failure 'rebase (am): NO directory rename' '
+test_expect_success 'rebase (am): NO directory rename' '
test_when_finished "git -C no-dir-rename rebase --abort" &&
(
cd no-dir-rename &&
@@ -190,7 +190,7 @@ test_expect_success 'rebase --merge: NO directory rename' '
)
 '
 
-test_expect_failure 'am: NO directory rename' '
+test_expect_success 'am: NO directory rename' '
test_when_finished "git -C no-dir-rename am --abort" &&
(
cd no-dir-rename &&
-- 
2.18.0.12.g97a29da30a



[PATCH 0/3] Turn off directory rename detection in am -3

2018-08-29 Thread Elijah Newren
On Tue, Aug 28, 2018 at 9:58 AM Junio C Hamano  wrote:
> Elijah Newren  writes:
>
> >   - Add a flag to turn off directory rename detection, and set the
> > flag for every call from am.c in order to avoid problems like this.
>
> I'd say this is the only practical solution, before you deprecate
> the "pipe format-patch output to am -3" style of "git rebase" (and
> optionally replace with something else).
>
> The whole point of "am -3" is to do _better_ than just "patch" with
> minimum amount of information available on the pre- and post- image
> blobs, without knowing the remainder of the tree that the patch did
> not touch.  It is not surprising that the heuristics that look at
> the unchanging part of the tree to infer renames that may or may not
> exist guesses incorrectly, either with false positive or negative.
> In the context of "rebase", we always have all the trees that are
> involved.  We should be able to do better than "am -3".

Here are patches to do so; they are built on the top of
en/rebase-consistency, since I wanted to re-use some test code and the
testfile introduced in that series, and to keep similar tests
together.

Elijah Newren (3):
  t3401: add another directory rename testcase for rebase and am
  merge-recursive: add ability to turn off directory rename detection
  am: avoid directory rename detection when calling recursive merge
machinery

 builtin/am.c|   1 +
 merge-recursive.c   |  18 --
 merge-recursive.h   |   1 +
 t/t3401-rebase-and-am-rename.sh | 110 +++-
 4 files changed, 124 insertions(+), 6 deletions(-)

-- 
2.18.0.12.g97a29da30a


[PATCH 2/3] merge-recursive: add ability to turn off directory rename detection

2018-08-29 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 18 +-
 merge-recursive.h |  1 +
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index f110e1c5ec..bf3cb03d3a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2843,12 +2843,19 @@ static int handle_renames(struct merge_options *o,
head_pairs = get_diffpairs(o, common, head);
merge_pairs = get_diffpairs(o, common, merge);
 
-   dir_re_head = get_directory_renames(head_pairs, head);
-   dir_re_merge = get_directory_renames(merge_pairs, merge);
+   if (o->detect_directory_renames) {
+   dir_re_head = get_directory_renames(head_pairs, head);
+   dir_re_merge = get_directory_renames(merge_pairs, merge);
 
-   handle_directory_level_conflicts(o,
-dir_re_head, head,
-dir_re_merge, merge);
+   handle_directory_level_conflicts(o,
+dir_re_head, head,
+dir_re_merge, merge);
+   } else {
+   dir_re_head  = xmalloc(sizeof(*dir_re_head));
+   dir_re_merge = xmalloc(sizeof(*dir_re_merge));
+   dir_rename_init(dir_re_head);
+   dir_rename_init(dir_re_merge);
+   }
 
ri->head_renames  = get_renames(o, head_pairs,
dir_re_merge, dir_re_head, head,
@@ -3541,6 +3548,7 @@ void init_merge_options(struct merge_options *o)
o->renormalize = 0;
o->diff_detect_rename = -1;
o->merge_detect_rename = -1;
+   o->detect_directory_renames = 1;
merge_recursive_config(o);
merge_verbosity = getenv("GIT_MERGE_VERBOSITY");
if (merge_verbosity)
diff --git a/merge-recursive.h b/merge-recursive.h
index fa7bc6b683..e39ee5d78b 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -18,6 +18,7 @@ struct merge_options {
unsigned renormalize : 1;
long xdl_opts;
int verbosity;
+   int detect_directory_renames;
int diff_detect_rename;
int merge_detect_rename;
int diff_rename_limit;
-- 
2.18.0.12.g97a29da30a



Re: [PATCH] RelNotes 2.18: clarify where directory rename detection applies

2018-06-13 Thread Junio C Hamano
Elijah Newren  writes:

> Also, since the directory rename detection from this cycle was
> specifically added in merge-recursive and not diffcore-rename, remove the
> 'in "diff" family" phrase from the note.

Thanks.


[PATCH] RelNotes 2.18: clarify where directory rename detection applies

2018-06-12 Thread Elijah Newren
Mention that this feature works with some commands (merge and cherry-pick,
implying that it also works with commands that build on these like rebase
-m and rebase -i).  Explicitly mentioning two commands hopefully implies
that it may not always work with other commands (am, and rebase without
flags that imply either -m or -i).

Also, since the directory rename detection from this cycle was
specifically added in merge-recursive and not diffcore-rename, remove the
'in "diff" family" phrase from the note.  (Folks have requested in the
past that `git diff` detect directory renames and somehow simplify its
output, so it may be helpful to avoid implying that diff has any new
capability here.)

Signed-off-by: Elijah Newren 
---
After thinking for a while about my RFC at
  
https://public-inbox.org/git/cabpp-bf4gbwvrha3d1vqxusnh3as9xvlqteuemfmldkpccy...@mail.gmail.com/
this commit seems like a simple fix.  We can discuss during the 2.19 and
2.20 cycles what to do with rebase, if anything.

Also, if the above commit message feels incomplete without an explanation
of why directory rename detection doesn't work with git-am, the following
could be included:


More details about the git-am limitation for the curious:

git-am tries to avoid a full three way merge, instead calling git-apply.
That prevents us from detecting renames at all, which may defeat the
directory rename detection.  There is a fallback, though; if the initial
git-apply fails and the user has specified the -3 option, git-am will
fall back to a three way merge.  However, git-am lacks the necessary
information to do a "real" three way merge.  Instead, it has to use
build_fake_ancestor() to get a merge base that is missing files whose
rename may have been important to detect for directory rename detection
to function.

am-based rebases work by first generating a bunch of patches (which no
longer record what the original commits were and thus don't have the
necessary info from which we can find a real merge-base), and then
calling `git am`.  This implies that am-based rebases will not always
successfully detect directory renames either.  merged-based rebases
(rebase -m) and cherry-pick-based rebases (rebase -i) are not affected
by this shortcoming.

 Documentation/RelNotes/2.18.0.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/RelNotes/2.18.0.txt 
b/Documentation/RelNotes/2.18.0.txt
index ed80e5485b..449e49e0eb 100644
--- a/Documentation/RelNotes/2.18.0.txt
+++ b/Documentation/RelNotes/2.18.0.txt
@@ -6,7 +6,7 @@ Updates since v2.17
 
 UI, Workflows & Features
 
- * Rename detection logic in "diff" family that is used in "merge" has
+ * Rename detection logic that is used in "merge" and "cherry-pick" has
learned to guess when all of x/a, x/b and x/c have moved to z/a,
z/b and z/c, it is likely that x/d added in the meantime would also
want to move to z/d by taking the hint that the entire directory
-- 
2.18.0.rc0.3.gda9bce4c68



Re: [PATCH v3] add status config and command line options for rename detection

2018-05-14 Thread Ben Peart



On 5/12/2018 4:04 AM, Eckhard Maaß wrote:

On Fri, May 11, 2018 at 12:56:39PM +, Ben Peart wrote:

After performing a merge that has conflicts git status will, by default,
attempt to detect renames which causes many objects to be examined.  In a
virtualized repo, those objects do not exist locally so the rename logic
triggers them to be fetched from the server. This results in the status call
taking hours to complete on very large repos vs seconds with this patch.


I see where your need comes from, but as you based this on my little
patch one can achieve this already with tweaking diff.renames itself. I
do wonder why there is a special need for the status command here


The rename detection feature is nice and we'd like to leave it on 
whenever possible.  The performance issues only occur when in the middle 
of a merge - normal status commands behave reasonably.  As a result, we 
don't want to just turn it off completely by setting diff.renames.


Until we come up with a more elegant solution, we currently turn it off 
completely for merge via the new merge settings and then intercept calls 
to status and if there is a MERGE_HEAD we turn it off for status just 
for that specific call.  I view this as a temporary solution so would 
not want to put that logic into git proper as it is quite specific to 
when running git on a virtualized repo.



if there is, I personally would like it more in a style that you could
take all the options provided by diff.*-configuration and prefix that
with status, eg status.diff.renames = true. What do you think? If you
really only need this for merges, maybe a more specialised option is
called for that only kicks in when there is a merge going on?

I would like that status behaves as similar as possible to
diff/show/log. Special options will pull away from that again - passing
-m to show or log will lead to the same performance issues, correct?
Could it be feasible to impose an overall time limit on the detection?



I agree that they should behave as similar as possible which is why all 
the new settings default to the diff setting when not explicitly set.  I 
believe this is a good model - if you don't do anything special you get 
the default/same behavior but if you know and need special behavior, you 
now have that option.



And after writing this I wonder what were your experience with just
tweaking renameLimit - setting it very low should have helped the
fetching from server part already, shouldn't it?


Add --no-renames command line option to status that enables overriding the
config setting from the command line. Add --find-renames[=] command line
option to status that enables detecting renames and optionally setting the
similarity index.


Would it be reasonable to extend this so that we just use the same
machinery for parsing command line options for the diffcore options and
pass this along? It seems to me that git status wants the same init as
diff/show/log has anyway. But I like the direction towards passing more
command line options to the git status command.



I agree that it is unfortunate that diff/merge/status all parse and deal 
with config settings differently.  I'd be happy to see someone tackle 
that and move the code to a single, coherent model but that is beyond 
the scope of this patch.



  static void wt_longstatus_print_unmerged_header(struct wt_status *s)
@@ -592,6 +595,9 @@ static void wt_status_collect_changes_worktree(struct 
wt_status *s)
}
rev.diffopt.format_callback = wt_status_collect_changed_cb;
rev.diffopt.format_callback_data = s;
+   rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : 
rev.diffopt.detect_rename;
+   rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : 
rev.diffopt.rename_limit;
+   rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : 
rev.diffopt.rename_score;
copy_pathspec(_data, >pathspec);
run_diff_files(, 0);
  }
@@ -625,6 +631,9 @@ static void wt_status_collect_changes_index(struct 
wt_status *s)
rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
rev.diffopt.format_callback = wt_status_collect_updated_cb;
rev.diffopt.format_callback_data = s;
+   rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : 
rev.diffopt.detect_rename;
+   rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : 
rev.diffopt.rename_limit;
+   rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : 
rev.diffopt.rename_score;
copy_pathspec(_data, >pathspec);
run_diff_index(, 1);
  }
@@ -982,6 +991,9 @@ static void wt_longstatus_print_verbose(struct wt_status *s)
setup_revisions(0, NULL, , );
  
  	rev.diffopt.output_format |= DIFF_FORMAT_PATCH;

+   rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : 
rev.diffopt.detect_rename;
+   

Re: [PATCH v3] add status config and command line options for rename detection

2018-05-12 Thread Eckhard Maaß
On Fri, May 11, 2018 at 12:56:39PM +, Ben Peart wrote:
> After performing a merge that has conflicts git status will, by default,
> attempt to detect renames which causes many objects to be examined.  In a
> virtualized repo, those objects do not exist locally so the rename logic
> triggers them to be fetched from the server. This results in the status call
> taking hours to complete on very large repos vs seconds with this patch.

I see where your need comes from, but as you based this on my little
patch one can achieve this already with tweaking diff.renames itself. I
do wonder why there is a special need for the status command here. And
if there is, I personally would like it more in a style that you could
take all the options provided by diff.*-configuration and prefix that
with status, eg status.diff.renames = true. What do you think? If you
really only need this for merges, maybe a more specialised option is
called for that only kicks in when there is a merge going on?

I would like that status behaves as similar as possible to
diff/show/log. Special options will pull away from that again - passing
-m to show or log will lead to the same performance issues, correct?
Could it be feasible to impose an overall time limit on the detection?

And after writing this I wonder what were your experience with just
tweaking renameLimit - setting it very low should have helped the
fetching from server part already, shouldn't it?

> Add --no-renames command line option to status that enables overriding the
> config setting from the command line. Add --find-renames[=] command line
> option to status that enables detecting renames and optionally setting the
> similarity index.

Would it be reasonable to extend this so that we just use the same
machinery for parsing command line options for the diffcore options and
pass this along? It seems to me that git status wants the same init as
diff/show/log has anyway. But I like the direction towards passing more
command line options to the git status command. 

>  static void wt_longstatus_print_unmerged_header(struct wt_status *s)
> @@ -592,6 +595,9 @@ static void wt_status_collect_changes_worktree(struct 
> wt_status *s)
>   }
>   rev.diffopt.format_callback = wt_status_collect_changed_cb;
>   rev.diffopt.format_callback_data = s;
> + rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : 
> rev.diffopt.detect_rename;
> + rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : 
> rev.diffopt.rename_limit;
> + rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : 
> rev.diffopt.rename_score;
>   copy_pathspec(_data, >pathspec);
>   run_diff_files(, 0);
>  }
> @@ -625,6 +631,9 @@ static void wt_status_collect_changes_index(struct 
> wt_status *s)
>   rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
>   rev.diffopt.format_callback = wt_status_collect_updated_cb;
>   rev.diffopt.format_callback_data = s;
> + rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : 
> rev.diffopt.detect_rename;
> + rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : 
> rev.diffopt.rename_limit;
> + rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : 
> rev.diffopt.rename_score;
>   copy_pathspec(_data, >pathspec);
>   run_diff_index(, 1);
>  }
> @@ -982,6 +991,9 @@ static void wt_longstatus_print_verbose(struct wt_status 
> *s)
>   setup_revisions(0, NULL, , );
>  
>   rev.diffopt.output_format |= DIFF_FORMAT_PATCH;
> + rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : 
> rev.diffopt.detect_rename;
> + rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : 
> rev.diffopt.rename_limit;
> + rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : 
> rev.diffopt.rename_score;
>   rev.diffopt.file = s->fp;
>   rev.diffopt.close_file = 0;
>   /*

Somehow I am inclined that those should be factored out to a common
method if the rest of the patch stays as it is.

Greetings,
Eckhard


[PATCH v4] add status config and command line options for rename detection

2018-05-11 Thread Ben Peart
After performing a merge that has conflicts git status will, by default,
attempt to detect renames which causes many objects to be examined.  In a
virtualized repo, those objects do not exist locally so the rename logic
triggers them to be fetched from the server. This results in the status call
taking hours to complete on very large repos vs seconds with this patch.

Add a new config status.renames setting to enable turning off rename
detection during status and commit.  This setting will default to the value
of diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time
spent finding out inexact renames during status and commit.  This setting
will default to the value of diff.renamelimit.

Add --no-renames command line option to status that enables overriding the
config setting from the command line. Add --find-renames[=] command line
option to status that enables detecting renames and optionally setting the
similarity index.

Reviewed-by: Elijah Newren <new...@gmail.com>
Original-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
Signed-off-by: Ben Peart <ben.pe...@microsoft.com>
---

Notes:
Base Ref: commit dc6b1d92ca9c0c538daa244e3910bb8b2a50d959 
(em/status-rename-config)
Web-Diff: https://github.com/benpeart/git/commit/95974d512b
Checkout: git fetch https://github.com/benpeart/git status-renames-v4 && 
git checkout 95974d512b

### Interdiff (v3..v4):

### Patches

 Documentation/config.txt |  12 
 Documentation/git-status.txt |  10 
 builtin/commit.c |  42 +
 diff.c   |   2 +-
 diff.h   |   1 +
 t/t7525-status-rename.sh | 113 +++
 wt-status.c  |  12 
 wt-status.h  |   4 +-
 8 files changed, 194 insertions(+), 2 deletions(-)
 create mode 100755 t/t7525-status-rename.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..4f1ead 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3119,6 +3119,18 @@ status.displayCommentPrefix::
behavior of linkgit:git-status[1] in Git 1.8.4 and previous.
Defaults to false.
 
+status.renameLimit::
+   The number of files to consider when performing rename detection
+   in linkgit:git-status[1] and linkgit:git-commit[1]. Defaults to
+   the value of diff.renameLimit.
+
+status.renames::
+   Whether and how Git detects renames in linkgit:git-status[1] and
+   linkgit:git-commit[1] .  If set to "false", rename detection is
+   disabled. If set to "true", basic rename detection is enabled.
+   If set to "copies" or "copy", Git will detect copies, as well.
+   Defaults to the value of diff.renames.
+
 status.showStash::
If set to true, linkgit:git-status[1] will display the number of
entries currently stashed away.
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index c16e27e63d..c4467ffb98 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -135,6 +135,16 @@ ignored, then the directory is not shown, but all contents 
are shown.
Display or do not display detailed ahead/behind counts for the
branch relative to its upstream branch.  Defaults to true.
 
+--renames::
+--no-renames::
+   Turn on/off rename detection regardless of user configuration.
+   See also linkgit:git-diff[1] `--no-renames`.
+
+--find-renames[=]::
+   Turn on rename detection, optionally setting the similarity
+   threshold.
+   See also linkgit:git-diff[1] `--find-renames`.
+
 ...::
See the 'pathspec' entry in linkgit:gitglossary[7].
 
diff --git a/builtin/commit.c b/builtin/commit.c
index 5240f11225..b50e33ef48 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -143,6 +143,16 @@ static int opt_parse_m(const struct option *opt, const 
char *arg, int unset)
return 0;
 }
 
+static int opt_parse_rename_score(const struct option *opt, const char *arg, 
int unset)
+{
+   const char **value = opt->value;
+   if (arg != NULL && *arg == '=')
+   arg = arg + 1;
+
+   *value = arg;
+   return 0;
+}
+
 static void determine_whence(struct wt_status *s)
 {
if (file_exists(git_path_merge_head()))
@@ -1259,11 +1269,31 @@ static int git_status_config(const char *k, const char 
*v, void *cb)
return error(_("Invalid untracked files mode '%s'"), v);
return 0;
}
+   if (!strcmp(k, "diff.renamelimit")) {
+   if (s->rename_limit == -1)
+   s->rename_limit = git_config_int(k, v);
+   return 0;
+   }
+   if (!strcmp(k, "status.renamelimit")) {
+   s->rename_limit = git_config_int(k, v);
+   return 0;
+   }
+   if

Re: [PATCH v3] add status config and command line options for rename detection

2018-05-11 Thread Elijah Newren
Hi Ben,

On Fri, May 11, 2018 at 5:56 AM, Ben Peart <ben.pe...@microsoft.com> wrote:
> After performing a merge that has conflicts git status will, by default,
> attempt to detect renames which causes many objects to be examined.  In a
> virtualized repo, those objects do not exist locally so the rename logic
> triggers them to be fetched from the server. This results in the status call
> taking hours to complete on very large repos vs seconds with this patch.
>
> Add a new config status.renames setting to enable turning off rename detection
> during status and commit.  This setting will default to the value of
> diff.renames.
>
> Add a new config status.renamelimit setting to to enable bounding the time
> spent finding out inexact renames during status and commit.  This setting will
> default to the value of diff.renamelimit.
>
> Add --no-renames command line option to status that enables overriding the
> config setting from the command line. Add --find-renames[=] command line
> option to status that enables detecting renames and optionally setting the
> similarity index.

Any chance I could get you to re-wrap this at a smaller column width?
Doesn't fit in my (80-char) terminal when I run `git log`; a couple
lines run over by a couple characters.  (Sorry for not noticing this
earlier)

> This patch depends on em/status-rename-config

I'd leave this line for the notes.  It's useful information now, but
won't be to someone looking at the commit a year from now, so it
probably doesn't belong in the commit message.


With those two changes:
  Reviewed-by: Elijah Newren <new...@gmail.com>


[PATCH v3] add status config and command line options for rename detection

2018-05-11 Thread Ben Peart
After performing a merge that has conflicts git status will, by default,
attempt to detect renames which causes many objects to be examined.  In a
virtualized repo, those objects do not exist locally so the rename logic
triggers them to be fetched from the server. This results in the status call
taking hours to complete on very large repos vs seconds with this patch.

Add a new config status.renames setting to enable turning off rename detection
during status and commit.  This setting will default to the value of
diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time
spent finding out inexact renames during status and commit.  This setting will
default to the value of diff.renamelimit.

Add --no-renames command line option to status that enables overriding the
config setting from the command line. Add --find-renames[=] command line
option to status that enables detecting renames and optionally setting the
similarity index.

This patch depends on em/status-rename-config

Original-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
Signed-off-by: Ben Peart <ben.pe...@microsoft.com>
---

Notes:
Base Ref: commit dc6b1d92ca9c0c538daa244e3910bb8b2a50d959 
(em/status-rename-config)
Web-Diff: https://github.com/benpeart/git/commit/5bac43610b
Checkout: git fetch https://github.com/benpeart/git status-renames-v3 && 
git checkout 5bac43610b

### Interdiff (v2..v3):

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 9c8eca05b1..4f1ead 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3120,14 +3120,16 @@ status.displayCommentPrefix::
Defaults to false.

 status.renameLimit::
-   The number of files to consider when performing rename detection;
-   if not specified, defaults to the value of diff.renameLimit.
+   The number of files to consider when performing rename detection
+   in linkgit:git-status[1] and linkgit:git-commit[1]. Defaults to
+   the value of diff.renameLimit.

 status.renames::
-   Whether and how Git detects renames.  If set to "false",
-   rename detection is disabled. If set to "true", basic rename
-   detection is enabled.  If set to "copies" or "copy", Git will
-   detect copies, as well.  Defaults to the value of diff.renames.
+   Whether and how Git detects renames in linkgit:git-status[1] and
+   linkgit:git-commit[1] .  If set to "false", rename detection is
+   disabled. If set to "true", basic rename detection is enabled.
+   If set to "copies" or "copy", Git will detect copies, as well.
+   Defaults to the value of diff.renames.

 status.showStash::
If set to true, linkgit:git-status[1] will display the number of
diff --git a/builtin/commit.c b/builtin/commit.c
index db886277f4..b50e33ef48 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1373,6 +1373,7 @@ int cmd_status(int argc, const char **argv, const 
char *prefix)
if (no_renames != -1)
s.detect_rename = !no_renames;
if ((intptr_t)rename_score_arg != -1) {
+   if (s.detect_rename < DIFF_DETECT_RENAME)
s.detect_rename = DIFF_DETECT_RENAME;
if (rename_score_arg)
s.rename_score = parse_rename_score(_score_arg);
diff --git a/t/t7525-status-rename.sh b/t/t7525-status-rename.sh
old mode 100644
new mode 100755
index 311df8038a..ef8b1b3078
--- a/t/t7525-status-rename.sh
+++ b/t/t7525-status-rename.sh
@@ -10,14 +10,13 @@ test_expect_success 'setup' '
git commit -m"Adding original file." &&
mv original renamed &&
echo 2 >> renamed &&
-   git add .
-'
-
-cat >.gitignore <<\EOF
+   git add . &&
+   cat >.gitignore <<-\EOF
.gitignore
expect*
actual*
EOF
+'

 test_expect_success 'status no-options' '
git status >actual &&
@@ -63,7 +62,18 @@ test_expect_success 'status status.renames=true' '
test_i18ngrep "renamed:" actual
 '

-test_expect_success 'status config overriden' '
+test_expect_success 'commit honors status.renames=false' '
+   git -c status.renames=false commit --dry-run >actual &&
+   test_i18ngrep "deleted:" actual &&
+   test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'commit honors status.renames=true' '
+   git -c status.renames=true commit --dry-run >actual &&
+   test_i18ngrep "renamed:" actual
+'
+
+test_expect_success 'status config overridden' '
git -c status.renames=true status --no-renames >actual &&
 

Re: [PATCH v2] add status config and command line options for rename detection

2018-05-11 Thread Ben Peart



On 5/10/2018 6:31 PM, Elijah Newren wrote:

Hi Ben,

On Thu, May 10, 2018 at 12:09 PM, Ben Peart  wrote:

On 5/10/2018 12:19 PM, Elijah Newren wrote:

On Thu, May 10, 2018 at 7:16 AM, Ben Peart 
wrote:



Given the example perf impact is arbitrary (the actual example that
triggered this patch took status from 2+ hours to seconds) and can't be
replicated using the current performance tools in git, I'm just going drop
the specific numbers.  I believe the patch is worth while just to give users
the flexibility to control these behaviors.


Your parenthetical statement of timing going from hours to seconds I
think would be great; I don't think we need precise numbers.


+   if ((intptr_t)rename_score_arg != -1) {
+   s.detect_rename = DIFF_DETECT_RENAME;



I'd still prefer this was a
  if (s.detect_rename < DIFF_DETECT_RENAME)
  s.detect_rename = DIFF_DETECT_RENAME;

If a user specifies they are willing to pay for copy detection, but
then just passes --find-renames=40% because they want to find more
renames, it seems odd to disable copy detection to me.



I agree and will change it. It is unfortunate this will behave differently
than it does with merge.  Fixing the merge behavior to match is outside the
scope of this patch.


I agree that changing merge is outside the scope of this patch, but
I'm curious what change you have in mind for it to "make it match".
In particular, merge-recursive.c already has (or will shortly have)
+   if (opts.detect_rename > DIFF_DETECT_RENAME)
+   opts.detect_rename = DIFF_DETECT_RENAME;
from your commit 85b460305ce7 ("merge: add merge.renames config
setting", 2018-05-02), 


This is a good point that I missed.  With that recent change to merge, 
it no longer matters that the settings parsing code caps detect_rename 
at DIFF_DETECT_RENAME because it will cap it later anyway so there is no 
need to change the merge option behavior.



The one place copy detection does make sense inside a merge is for the
diffstat shown at the end (from builtin/merge.c), but it currently
isn't controlled by any configuration setting at all.  When it is
hooked up, it'd probably store the value separately from
merge-recursive's internal o->{diff,merge}_detect_rename anyway,
because builtin/merge.c's diffstat should be controlled by the
relevant confiig settings and flags (merge.renames, diff.renames,
-Xfind-renames, etc.) regardless of which merge strategy (recursive,
resolve, octopus, ours, ort) is employed.  And when that is hooked up,
I agree with you that it should look like what you've done with
status.renames here.  In fact, if you'd like to take a crack at it, I
think you'd do a great job.  :-)  If not, it's on my list of things to
do.



Thanks but I'll leave that to you. :)  I have a large backlog of patches 
I would like to see pushed through the mailing list into master.  We've 
been sitting on this one for over a year.  If the current rate is any 
indication, it will take man years to get caught up.



Testcases look good.  It'd be nice to also add a few testcases where
copy detection is turned on -- in particular, I'd like to see one with
--find-renames=$DIFFERENT_THAN_DEFAULT being passed when
merge.renames=copies.



OK.  I also added tests to verify the settings correctly impact commit.


Nice!



Re: [PATCH v2] add status config and command line options for rename detection

2018-05-11 Thread Junio C Hamano
Ben Peart  writes:

>  Documentation/config.txt | 10 
>  Documentation/git-status.txt | 10 
>  builtin/commit.c | 41 
>  diff.c   |  2 +-
>  diff.h   |  1 +
>  t/t7525-status-rename.sh | 90 
>  wt-status.c  | 12 +
>  wt-status.h  |  4 +-
>  8 files changed, 168 insertions(+), 2 deletions(-)
>  create mode 100644 t/t7525-status-rename.sh

I'll mark the new script as executable (otherwise the test will not
even start).



Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Junio C Hamano
Elijah Newren  writes:

>> Note: I removed the --no-breaks command line option from the original patch 
>> as
>> it will no longer be needed once the default has been changed [1] to turn it 
>> off.
>>
>> [1] 
>> https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/
>
> I'd just drop these lines from the commit message, and instead mention
> that your patch depends on em/status-rename-config.
>
>> Original-Patch-by: Alejandro Pauly 
>> Signed-off-by: Ben Peart 
>> ---

Other things seem to have been resolved between you two already, so
I'll only comment on a minor tangent here.

>> Notes:
>> Base Ref: master
>
> This patch does not apply to master; it has conflicts.
>
>> Web-Diff: https://github.com/benpeart/git/commit/823212725b

As Git is distributed, unlike tags that are meant to be global among
project participants by convention, a branch name can never be used
as a trustable base among developers.  Your 'master' branch may
point at a different commit from mine, and my 'master' branch today
may point at a different commit from mine yesterday.

I've seen patches that used a similar note below the three-dash line
that named an exact commit object name.  That is a lot more reliable
way to convey the information necessary to consturct the exact state
the contributor worked on.

> This web diff shows em/status-rename-config as the parent commit, not
> master.  Since your commit message mentions you want the change to
> break detection provided by that series, just listing it as the
> explicit base seems like the right way to go.

Thanks for digging.  That would work well, too.


Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Elijah Newren
Hi Ben,

On Thu, May 10, 2018 at 12:09 PM, Ben Peart  wrote:
> On 5/10/2018 12:19 PM, Elijah Newren wrote:
>> On Thu, May 10, 2018 at 7:16 AM, Ben Peart 
>> wrote:

> Given the example perf impact is arbitrary (the actual example that
> triggered this patch took status from 2+ hours to seconds) and can't be
> replicated using the current performance tools in git, I'm just going drop
> the specific numbers.  I believe the patch is worth while just to give users
> the flexibility to control these behaviors.

Your parenthetical statement of timing going from hours to seconds I
think would be great; I don't think we need precise numbers.

>>> +   if ((intptr_t)rename_score_arg != -1) {
>>> +   s.detect_rename = DIFF_DETECT_RENAME;
>>
>>
>> I'd still prefer this was a
>>  if (s.detect_rename < DIFF_DETECT_RENAME)
>>  s.detect_rename = DIFF_DETECT_RENAME;
>>
>> If a user specifies they are willing to pay for copy detection, but
>> then just passes --find-renames=40% because they want to find more
>> renames, it seems odd to disable copy detection to me.
>>
>
> I agree and will change it. It is unfortunate this will behave differently
> than it does with merge.  Fixing the merge behavior to match is outside the
> scope of this patch.

I agree that changing merge is outside the scope of this patch, but
I'm curious what change you have in mind for it to "make it match".
In particular, merge-recursive.c already has (or will shortly have)
+   if (opts.detect_rename > DIFF_DETECT_RENAME)
+   opts.detect_rename = DIFF_DETECT_RENAME;
from your commit 85b460305ce7 ("merge: add merge.renames config
setting", 2018-05-02), so I'm not sure why we'd want to carefully
propagate a larger value for o->{diff,merge}_detect_rename prior to
this point.  If it's just "future proofing" because you suspect that
copy information could be useful to the merging algorithm and we'll
eventually get rid of these two lines of code, then I could get behind
such a change, though color me skeptical that copy information would
ever turn out to be useful in that context.

The one place copy detection does make sense inside a merge is for the
diffstat shown at the end (from builtin/merge.c), but it currently
isn't controlled by any configuration setting at all.  When it is
hooked up, it'd probably store the value separately from
merge-recursive's internal o->{diff,merge}_detect_rename anyway,
because builtin/merge.c's diffstat should be controlled by the
relevant confiig settings and flags (merge.renames, diff.renames,
-Xfind-renames, etc.) regardless of which merge strategy (recursive,
resolve, octopus, ours, ort) is employed.  And when that is hooked up,
I agree with you that it should look like what you've done with
status.renames here.  In fact, if you'd like to take a crack at it, I
think you'd do a great job.  :-)  If not, it's on my list of things to
do.

>> Testcases look good.  It'd be nice to also add a few testcases where
>> copy detection is turned on -- in particular, I'd like to see one with
>> --find-renames=$DIFFERENT_THAN_DEFAULT being passed when
>> merge.renames=copies.
>>
>
> OK.  I also added tests to verify the settings correctly impact commit.

Nice!


Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Ben Peart



On 5/10/2018 12:19 PM, Elijah Newren wrote:

Hi Ben,

On Thu, May 10, 2018 at 7:16 AM, Ben Peart <ben.pe...@microsoft.com> wrote:

After performing a merge that has conflicts, git status will by default attempt
to detect renames which causes many objects to be examined.  In a virtualized
repo, those objects do not exist locally so the rename logic triggers them to be
fetched from the server. This results in the status call taking hours to
complete on very large repos.  Even in a small repo (the GVFS repo) turning off
break and rename detection has a significant impact:


It'd be nice if you could show that impact by comparing 'git status'
to 'git status --no-renames', for some repo.  Showing only the latter
gives us no way to assess the impact.



Given the example perf impact is arbitrary (the actual example that 
triggered this patch took status from 2+ hours to seconds) and can't be 
replicated using the current performance tools in git, I'm just going 
drop the specific numbers.  I believe the patch is worth while just to 
give users the flexibility to control these behaviors.



git status --no-renames:
31 secs., 105 loose object downloads

git status --no-breaks
7 secs., 17 loose object downloads

git status --no-breaks --no-renames
1 sec., 1 loose object download


This patch doesn't add a --no-breaks option and it doesn't exist
previously, so adding it to the commit message serves to confuse
rather than help.  I'd just drop the last two of these (and redo the
timing for --no-renames assuming you are built on
em/status-rename-config).



OK


Add a new config status.renames setting to enable turning off rename detection
during status.  This setting will default to the value of diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time spent
finding out inexact renames during status.  This setting will default to the
value of diff.renamelimit.


It may be worth mentioning that these config settings also affect 'git
commit' (and it does, in my testing, which I think is a good thing).



I agree this is a good thing as the other status settings behave the 
same way.  I'll update the documentation to reflect this as well.



Add status --no-renames command line option that enables overriding the config
setting from the command line. Add --find-renames[=] to enable detecting
renames and optionally setting the similarity index from the command line.


The command line options are specific to 'git status'.  I don't really
have a strong opinion on whether they should also be added to
git-commit; I suspect users would be more likely to use the config
options in order to set it once and forget about it and that users
would be more likely to want to override their config setting for
status than for commit.


Note: I removed the --no-breaks command line option from the original patch as
it will no longer be needed once the default has been changed [1] to turn it 
off.

[1] 
https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/


I'd just drop these lines from the commit message, and instead mention
that your patch depends on em/status-rename-config.



OK


+   if ((intptr_t)rename_score_arg != -1) {
+   s.detect_rename = DIFF_DETECT_RENAME;


I'd still prefer this was a
 if (s.detect_rename < DIFF_DETECT_RENAME)
 s.detect_rename = DIFF_DETECT_RENAME;

If a user specifies they are willing to pay for copy detection, but
then just passes --find-renames=40% because they want to find more
renames, it seems odd to disable copy detection to me.



I agree and will change it. It is unfortunate this will behave 
differently than it does with merge.  Fixing the merge behavior to match 
is outside the scope of this patch.



+++ b/t/t7525-status-rename.sh


Testcases look good.  It'd be nice to also add a few testcases where
copy detection is turned on -- in particular, I'd like to see one with
--find-renames=$DIFFERENT_THAN_DEFAULT being passed when
merge.renames=copies.



OK.  I also added tests to verify the settings correctly impact commit.




+test_expect_success 'setup' '
+   echo 1 >original &&
+   git add . &&
+   git commit -m"Adding original file." &&
+   mv original renamed &&
+   echo 2 >> renamed &&
+   git add .
+'




+cat >.gitignore <<\EOF
+.gitignore
+expect*
+actual*
+EOF


Can this just be included in the setup?



OK



Everything else in the patch looked good to me.



Re: [PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Elijah Newren
Hi Ben,

On Thu, May 10, 2018 at 7:16 AM, Ben Peart <ben.pe...@microsoft.com> wrote:
> After performing a merge that has conflicts, git status will by default 
> attempt
> to detect renames which causes many objects to be examined.  In a virtualized
> repo, those objects do not exist locally so the rename logic triggers them to 
> be
> fetched from the server. This results in the status call taking hours to
> complete on very large repos.  Even in a small repo (the GVFS repo) turning 
> off
> break and rename detection has a significant impact:

It'd be nice if you could show that impact by comparing 'git status'
to 'git status --no-renames', for some repo.  Showing only the latter
gives us no way to assess the impact.

> git status --no-renames:
> 31 secs., 105 loose object downloads
>
> git status --no-breaks
> 7 secs., 17 loose object downloads
>
> git status --no-breaks --no-renames
> 1 sec., 1 loose object download

This patch doesn't add a --no-breaks option and it doesn't exist
previously, so adding it to the commit message serves to confuse
rather than help.  I'd just drop the last two of these (and redo the
timing for --no-renames assuming you are built on
em/status-rename-config).

> Add a new config status.renames setting to enable turning off rename detection
> during status.  This setting will default to the value of diff.renames.
>
> Add a new config status.renamelimit setting to to enable bounding the time 
> spent
> finding out inexact renames during status.  This setting will default to the
> value of diff.renamelimit.

It may be worth mentioning that these config settings also affect 'git
commit' (and it does, in my testing, which I think is a good thing).

> Add status --no-renames command line option that enables overriding the config
> setting from the command line. Add --find-renames[=] to enable detecting
> renames and optionally setting the similarity index from the command line.

The command line options are specific to 'git status'.  I don't really
have a strong opinion on whether they should also be added to
git-commit; I suspect users would be more likely to use the config
options in order to set it once and forget about it and that users
would be more likely to want to override their config setting for
status than for commit.

> Note: I removed the --no-breaks command line option from the original patch as
> it will no longer be needed once the default has been changed [1] to turn it 
> off.
>
> [1] 
> https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/

I'd just drop these lines from the commit message, and instead mention
that your patch depends on em/status-rename-config.

> Original-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
> Signed-off-by: Ben Peart <ben.pe...@microsoft.com>
> ---
>
> Notes:
> Base Ref: master

This patch does not apply to master; it has conflicts.

> Web-Diff: https://github.com/benpeart/git/commit/823212725b

This web diff shows em/status-rename-config as the parent commit, not
master.  Since your commit message mentions you want the change to
break detection provided by that series, just listing it as the
explicit base seems like the right way to go.

> ### Interdiff (v1..v2):

Thanks.

> +   if ((intptr_t)rename_score_arg != -1) {
> +   s.detect_rename = DIFF_DETECT_RENAME;

I'd still prefer this was a
if (s.detect_rename < DIFF_DETECT_RENAME)
s.detect_rename = DIFF_DETECT_RENAME;

If a user specifies they are willing to pay for copy detection, but
then just passes --find-renames=40% because they want to find more
renames, it seems odd to disable copy detection to me.

> +++ b/t/t7525-status-rename.sh

Testcases look good.  It'd be nice to also add a few testcases where
copy detection is turned on -- in particular, I'd like to see one with
--find-renames=$DIFFERENT_THAN_DEFAULT being passed when
merge.renames=copies.


> +test_expect_success 'setup' '
> +   echo 1 >original &&
> +   git add . &&
> +   git commit -m"Adding original file." &&
> +   mv original renamed &&
> +   echo 2 >> renamed &&
> +   git add .
> +'


> +cat >.gitignore <<\EOF
> +.gitignore
> +expect*
> +actual*
> +EOF

Can this just be included in the setup?


Everything else in the patch looked good to me.


[PATCH v2] add status config and command line options for rename detection

2018-05-10 Thread Ben Peart
After performing a merge that has conflicts, git status will by default attempt
to detect renames which causes many objects to be examined.  In a virtualized
repo, those objects do not exist locally so the rename logic triggers them to be
fetched from the server. This results in the status call taking hours to
complete on very large repos.  Even in a small repo (the GVFS repo) turning off
break and rename detection has a significant impact:

git status --no-renames:
31 secs., 105 loose object downloads

git status --no-breaks
7 secs., 17 loose object downloads

git status --no-breaks --no-renames
1 sec., 1 loose object download

Add a new config status.renames setting to enable turning off rename detection
during status.  This setting will default to the value of diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time spent
finding out inexact renames during status.  This setting will default to the
value of diff.renamelimit.

Add status --no-renames command line option that enables overriding the config
setting from the command line. Add --find-renames[=] to enable detecting
renames and optionally setting the similarity index from the command line.

Note: I removed the --no-breaks command line option from the original patch as
it will no longer be needed once the default has been changed [1] to turn it 
off.

[1] 
https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/

Original-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
Signed-off-by: Ben Peart <ben.pe...@microsoft.com>
---

Notes:
Base Ref: master
Web-Diff: https://github.com/benpeart/git/commit/823212725b
Checkout: git fetch https://github.com/benpeart/git status-renames-v2 && 
git checkout 823212725b

### Interdiff (v1..v2):

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b79b83c587..9c8eca05b1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3126,7 +3126,8 @@ status.renameLimit::
 status.renames::
Whether and how Git detects renames.  If set to "false",
rename detection is disabled. If set to "true", basic rename
-   detection is enabled.  Defaults to the value of diff.renames.
+   detection is enabled.  If set to "copies" or "copy", Git will
+   detect copies, as well.  Defaults to the value of diff.renames.

 status.showStash::
If set to true, linkgit:git-status[1] will display the number of
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index c16e27e63d..c4467ffb98 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -135,6 +135,16 @@ ignored, then the directory is not shown, but all 
contents are shown.
Display or do not display detailed ahead/behind counts for the
branch relative to its upstream branch.  Defaults to true.
    
    +--renames::
+--no-renames::
+   Turn on/off rename detection regardless of user configuration.
+   See also linkgit:git-diff[1] `--no-renames`.
    +
+--find-renames[=]::
+   Turn on rename detection, optionally setting the similarity
+   threshold.
+   See also linkgit:git-diff[1] `--find-renames`.
+
 ...::
See the 'pathspec' entry in linkgit:gitglossary[7].

diff --git a/builtin/commit.c b/builtin/commit.c
index a545096ddd..db886277f4 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -109,10 +109,6 @@ static int have_option_m;
 static struct strbuf message = STRBUF_INIT;

 static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
-static int diff_detect_rename = -1;
-static int status_detect_rename = -1;
-static int diff_rename_limit = -1;
-static int status_rename_limit = -1;

 static int opt_parse_porcelain(const struct option *opt, const char *arg, 
int unset)
 {
@@ -1274,19 +1270,21 @@ static int git_status_config(const char *k, const 
char *v, void *cb)
return 0;
}
if (!strcmp(k, "diff.renamelimit")) {
-   diff_rename_limit = git_config_int(k, v);
+   if (s->rename_limit == -1)
+   s->rename_limit = git_config_int(k, v);
return 0;
}
if (!strcmp(k, "status.renamelimit")) {
-   status_rename_limit = git_config_int(k, v);
+   s->rename_limit = git_config_int(k, v);
return 0;
}
if (!strcmp(k, "diff.renames")) {
-   diff_detect_rename = git_config_rename(k, v);
+   if (s->detect_rename == -1)
+   s->detect_rename = git_config_rename(k, v);
return 0;
}
if (!strcmp(k, "status.renames")) {
-   status_detect_rename = git_config_rename(k,

Re: [PATCH v1] add status config and command line options for rename detection

2018-05-09 Thread Ben Peart



On 5/9/2018 12:56 PM, Elijah Newren wrote:

Hi Ben,

Overall I think this is good, but I have lots of nit-picky things to
bring up.  :-)




Thank you for the review.  I appreciate the extra set of eyes on these 
changes.  Especially when dealing with the merge logic and settings 
which I am unfamiliar with.




I suspect that status.renames should mention "copy", just as
diff.renames does.  (We didn't mention it in merge.renames, because
merge isn't an operation for which copy detection can be useful -- at
least not until the diffstat at the end of the merge is controlled by
merge.renames as well...)



I wasn't supporting copy (as you noticed later in the patch) but will 
update the patch to do so and update the documentation appropriately.



Also, do these two config settings only affect 'git status', or does
it also affect the status shown when composing a commit message
(assuming the user hasn't turned commit.status off)?  Does it affect
`git commit --dry-run` too?



The config settings only affect 'git status'


--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -109,6 +109,10 @@ static int have_option_m;
  static struct strbuf message = STRBUF_INIT;

  static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
+static int diff_detect_rename = -1;
+static int status_detect_rename = -1;
+static int diff_rename_limit = -1;
+static int status_rename_limit = -1;


Could we replace these four variables with just two: detect_rename and
rename_limit?  Keeping these separate invites people to write code
using only one of the settings rather than the appropriate inherited
mixture of them, resulting in a weird bug.  More on this below...



This model was inherited from the diff code and replicated to the merge 
code.  However, it would be nice to get rid of these 4 static variables. 
 See below for a proposal on how to do that...



@@ -1259,11 +1273,29 @@ static int git_status_config(const char *k, const char 
*v, void *cb)
 return error(_("Invalid untracked files mode '%s'"), 
v);
 return 0;
 }
+   if (!strcmp(k, "diff.renamelimit")) {
+   diff_rename_limit = git_config_int(k, v);
+   return 0;
+   }
+   if (!strcmp(k, "status.renamelimit")) {
+   status_rename_limit = git_config_int(k, v);
+   return 0;
+   }


Here, since you're already checking diff.renamelimit first, you can
just set rename_limit in both blocks and not need both
diff_rename_limit and status_rename_limit.  (Similar can be said for
diff.renames/status.renames.)


It really doesn't work that way - the call back is called for every 
config setting and there is no specific order they are called with. 
Typically, you just test for and save off any that you care about like 
I"m doing here.


I can update the logic here so that as I save off the settings that it 
will also enforce the priority model (ie the diff setting can't override 
the status setting) and then compute the final value once I have the 
command line arguments as they override either config setting (if present).


On the plus side, this change passes the red/green test but it now 
splits the priority logic into two places and doesn't match with how 
diff and merge handle this same issue.







@@ -1297,6 +1329,10 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
   N_("ignore changes to submodules, optional when: all, dirty, 
untracked. (Default: all)"),
   PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
 OPT_COLUMN(0, "column", , N_("list untracked files in 
columns")),
+   OPT_BOOL(0, "no-renames", _renames, N_("do not detect 
renames")),
+   { OPTION_CALLBACK, 'M', "find-renames", _score_arg,
+ N_("n"), N_("detect renames, optionally set similarity 
index"),
+ PARSE_OPT_OPTARG, opt_parse_rename_score },


Should probably also document these options in
Documentation/git-status.txt (and maybe Documentation/git-commit.txt
as well).


Good point, will do.



Not sure if we want to add a flag for copy detection (similar to
git-diff's -C/--find-copies and --find-copies-harder), or just leave
that for when someone finds a need.  If left out, might want to just
mention that it was considered and intentionally omitted for now in
the commit message.



I tend to only implement the features I know are actually needed so 
intentionally omitted this (along with many other potential diff options).



+   if ((intptr_t)rename_score_arg != -1) {


I don't understand why rename_score_arg is a (char*) and then you need
to cast to intptr_t, but that may just be because I haven't done much
of anything with option parsing.  A quick look at the docs isn't
making it clear to me, though; could you enlighten me?



Yes, it is related to making parse_options() do what we need.  -1 means 
the command line option wasn't passed so use the default.  NULL 

Re: [PATCH v1] add status config and command line options for rename detection

2018-05-09 Thread Ben Peart



On 5/9/2018 11:59 AM, Duy Nguyen wrote:

On Wed, May 9, 2018 at 4:42 PM, Ben Peart <ben.pe...@microsoft.com> wrote:

Add a new config status.renames setting to enable turning off rename detection
during status.  This setting will default to the value of diff.renames.


Please add the reason you need this config key in the commit message.
My guess (probably correct) is on super large repo (how large?),
rename detection is just too slow (how long?) that it practically
makes git-status unusable.



Yes, the reasons for this change are the same as for the patch that 
added these same flags for merge and have to do with the poor 
performance of rename detection with large repos.  I'll update the 
commit message to be more descriptive (see below) and correct some 
spelling errors.



add status config and command line options for rename detection

After performing a merge that has conflicts, git status will by default 
attempt to detect renames which causes many objects to be examined.  In 
a virtualized repo, those objects do not exist locally so the rename 
logic triggers them to be fetched from the server. This results in the 
status call taking hours to complete on very large repos.  Even in a 
small repo (the GVFS repo) turning off break and rename detection has a 
significant impact:


git status --no-renames:
31 secs., 105 loose object downloads

git status --no-breaks
7 secs., 17 loose object downloads

git status --no-breaks --no-renames
1 sec., 1 loose object download

Add a new config status.renames setting to enable turning off rename 
detection during status.  This setting will default to the value of 
diff.renames.


Add a new config status.renamelimit setting to to enable bounding the 
time spent finding out inexact renames during status.  This setting will 
default to the value of diff.renamelimit.


Add status --no-renames command line option that enables overriding the 
config setting from the command line. Add --find-renames[=] to enable 
detecting renames and optionally setting the similarity index from the 
command line.


Note: I removed the --no-breaks command line option from the original 
patch as it will no longer be needed once the default has been changed 
[1] to turn it off.


[1] 
https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.ma...@gmail.com/


Original-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
Signed-off-by: Ben Peart <ben.pe...@microsoft.com>



This information could be helpful when we optimize rename detection to
be more efficient.



Add a new config status.renamelimit setting to to enable bounding the time spent
finding out inexact renames during status.  This setting will default to the
value of diff.renamelimit.

Add status --no-renames command line option that enables overriding the config
setting from the command line. Add --find-renames[=] to enable detecting
renames and optionaly setting the similarity index from the command line.

Origional-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
Signed-off-by: Ben Peart <ben.pe...@microsoft.com>


Re: [PATCH v1] add status config and command line options for rename detection

2018-05-09 Thread Elijah Newren
Hi Ben,

Overall I think this is good, but I have lots of nit-picky things to
bring up.  :-)


On Wed, May 9, 2018 at 7:42 AM, Ben Peart <ben.pe...@microsoft.com> wrote:
> Add status --no-renames command line option that enables overriding the config
> setting from the command line. Add --find-renames[=] to enable detecting
> renames and optionaly setting the similarity index from the command line.

s/optionaly/optionally/

> Notes:
> Base Ref:

No base ref?  ;-)

> +status.renameLimit::
> +   The number of files to consider when performing rename detection;
> +   if not specified, defaults to the value of diff.renameLimit.
> +
> +status.renames::
> +   Whether and how Git detects renames.  If set to "false",
> +   rename detection is disabled. If set to "true", basic rename
> +   detection is enabled.  Defaults to the value of diff.renames.

I suspect that status.renames should mention "copy", just as
diff.renames does.  (We didn't mention it in merge.renames, because
merge isn't an operation for which copy detection can be useful -- at
least not until the diffstat at the end of the merge is controlled by
merge.renames as well...)

Also, do these two config settings only affect 'git status', or does
it also affect the status shown when composing a commit message
(assuming the user hasn't turned commit.status off)?  Does it affect
`git commit --dry-run` too?

> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -109,6 +109,10 @@ static int have_option_m;
>  static struct strbuf message = STRBUF_INIT;
>
>  static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
> +static int diff_detect_rename = -1;
> +static int status_detect_rename = -1;
> +static int diff_rename_limit = -1;
> +static int status_rename_limit = -1;

Could we replace these four variables with just two: detect_rename and
rename_limit?  Keeping these separate invites people to write code
using only one of the settings rather than the appropriate inherited
mixture of them, resulting in a weird bug.  More on this below...

> @@ -1259,11 +1273,29 @@ static int git_status_config(const char *k, const 
> char *v, void *cb)
> return error(_("Invalid untracked files mode '%s'"), 
> v);
> return 0;
> }
> +   if (!strcmp(k, "diff.renamelimit")) {
> +   diff_rename_limit = git_config_int(k, v);
> +   return 0;
> +   }
> +   if (!strcmp(k, "status.renamelimit")) {
> +   status_rename_limit = git_config_int(k, v);
> +   return 0;
> +   }

Here, since you're already checking diff.renamelimit first, you can
just set rename_limit in both blocks and not need both
diff_rename_limit and status_rename_limit.  (Similar can be said for
diff.renames/status.renames.)



> @@ -1297,6 +1329,10 @@ int cmd_status(int argc, const char **argv, const char 
> *prefix)
>   N_("ignore changes to submodules, optional when: all, 
> dirty, untracked. (Default: all)"),
>   PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
> OPT_COLUMN(0, "column", , N_("list untracked files 
> in columns")),
> +   OPT_BOOL(0, "no-renames", _renames, N_("do not detect 
> renames")),
> +   { OPTION_CALLBACK, 'M', "find-renames", _score_arg,
> + N_("n"), N_("detect renames, optionally set similarity 
> index"),
> + PARSE_OPT_OPTARG, opt_parse_rename_score },

Should probably also document these options in
Documentation/git-status.txt (and maybe Documentation/git-commit.txt
as well).

Not sure if we want to add a flag for copy detection (similar to
git-diff's -C/--find-copies and --find-copies-harder), or just leave
that for when someone finds a need.  If left out, might want to just
mention that it was considered and intentionally omitted for now in
the commit message.

> @@ -1336,6 +1372,27 @@ int cmd_status(int argc, const char **argv, const char 
> *prefix)
> s.ignore_submodule_arg = ignore_submodule_arg;
> s.status_format = status_format;
> s.verbose = verbose;
> +   s.detect_rename = no_renames >= 0 ? !no_renames :
> + status_detect_rename >= 0 ? 
> status_detect_rename :
> + diff_detect_rename >= 0 ? 
> diff_detect_rename :

Combining the four vars into two as mentioned above would allow
combining the last two lines above into one.

> +   if ((intptr_t)rename_score_arg != -1) {

I don't understand why rename_score_arg is a (char*) and then you need
to cast to intptr_t, but that may just be becau

Re: [PATCH v1] add status config and command line options for rename detection

2018-05-09 Thread Duy Nguyen
On Wed, May 9, 2018 at 4:42 PM, Ben Peart <ben.pe...@microsoft.com> wrote:
> Add a new config status.renames setting to enable turning off rename detection
> during status.  This setting will default to the value of diff.renames.

Please add the reason you need this config key in the commit message.
My guess (probably correct) is on super large repo (how large?),
rename detection is just too slow (how long?) that it practically
makes git-status unusable.

This information could be helpful when we optimize rename detection to
be more efficient.

>
> Add a new config status.renamelimit setting to to enable bounding the time 
> spent
> finding out inexact renames during status.  This setting will default to the
> value of diff.renamelimit.
>
> Add status --no-renames command line option that enables overriding the config
> setting from the command line. Add --find-renames[=] to enable detecting
> renames and optionaly setting the similarity index from the command line.
>
> Origional-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
> Signed-off-by: Ben Peart <ben.pe...@microsoft.com>
-- 
Duy


[PATCH v1] add status config and command line options for rename detection

2018-05-09 Thread Ben Peart
Add a new config status.renames setting to enable turning off rename detection
during status.  This setting will default to the value of diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time spent
finding out inexact renames during status.  This setting will default to the
value of diff.renamelimit.

Add status --no-renames command line option that enables overriding the config
setting from the command line. Add --find-renames[=] to enable detecting
renames and optionaly setting the similarity index from the command line.

Origional-Patch-by: Alejandro Pauly <alpa...@microsoft.com>
Signed-off-by: Ben Peart <ben.pe...@microsoft.com>
---

Notes:
Base Ref:
Web-Diff: https://github.com/benpeart/git/commit/aa977d2964
Checkout: git fetch https://github.com/benpeart/git status-renames-v1 && 
git checkout aa977d2964

 Documentation/config.txt |  9 
 builtin/commit.c | 57 +
 diff.c   |  2 +-
 diff.h   |  1 +
 t/t7525-status-rename.sh | 90 
 wt-status.c  | 12 ++
 wt-status.h  |  4 +-
 7 files changed, 173 insertions(+), 2 deletions(-)
 create mode 100644 t/t7525-status-rename.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..b79b83c587 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3119,6 +3119,15 @@ status.displayCommentPrefix::
behavior of linkgit:git-status[1] in Git 1.8.4 and previous.
Defaults to false.
 
+status.renameLimit::
+   The number of files to consider when performing rename detection;
+   if not specified, defaults to the value of diff.renameLimit.
+
+status.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled.  Defaults to the value of diff.renames.
+
 status.showStash::
If set to true, linkgit:git-status[1] will display the number of
entries currently stashed away.
diff --git a/builtin/commit.c b/builtin/commit.c
index 5240f11225..a545096ddd 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -109,6 +109,10 @@ static int have_option_m;
 static struct strbuf message = STRBUF_INIT;
 
 static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
+static int diff_detect_rename = -1;
+static int status_detect_rename = -1;
+static int diff_rename_limit = -1;
+static int status_rename_limit = -1;
 
 static int opt_parse_porcelain(const struct option *opt, const char *arg, int 
unset)
 {
@@ -143,6 +147,16 @@ static int opt_parse_m(const struct option *opt, const 
char *arg, int unset)
return 0;
 }
 
+static int opt_parse_rename_score(const struct option *opt, const char *arg, 
int unset)
+{
+   const char **value = opt->value;
+   if (arg != NULL && *arg == '=')
+   arg = arg + 1;
+
+   *value = arg;
+   return 0;
+}
+
 static void determine_whence(struct wt_status *s)
 {
if (file_exists(git_path_merge_head()))
@@ -1259,11 +1273,29 @@ static int git_status_config(const char *k, const char 
*v, void *cb)
return error(_("Invalid untracked files mode '%s'"), v);
return 0;
}
+   if (!strcmp(k, "diff.renamelimit")) {
+   diff_rename_limit = git_config_int(k, v);
+   return 0;
+   }
+   if (!strcmp(k, "status.renamelimit")) {
+   status_rename_limit = git_config_int(k, v);
+   return 0;
+   }
+   if (!strcmp(k, "diff.renames")) {
+   diff_detect_rename = git_config_rename(k, v);
+   return 0;
+   }
+   if (!strcmp(k, "status.renames")) {
+   status_detect_rename = git_config_rename(k, v);
+   return 0;
+   }
return git_diff_ui_config(k, v, NULL);
 }
 
 int cmd_status(int argc, const char **argv, const char *prefix)
 {
+   static int no_renames = -1;
+   static const char *rename_score_arg = (const char *)-1;
static struct wt_status s;
int fd;
struct object_id oid;
@@ -1297,6 +1329,10 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
  N_("ignore changes to submodules, optional when: all, dirty, 
untracked. (Default: all)"),
  PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
OPT_COLUMN(0, "column", , N_("list untracked files in 
columns")),
+   OPT_BOOL(0, "no-renames", _renames, N_("do not detect 
renames")),
+   { OPTION_CALLBACK, 'M', "find-renames", _score_arg,
+ N_("n"), N_("detect renames, optionally set similarity 
index"

[PATCH v4 3/3] merge: pass aggressive when rename detection is turned off

2018-05-02 Thread Ben Peart
Set aggressive flag in git_merge_trees() when rename detection is turned off.
This allows read_tree() to auto resolve more cases that would have otherwise
been handled by the rename detection.

Reviewed-by: Johannes Schindelin <johannes.schinde...@gmx.de>
Signed-off-by: Ben Peart <benpe...@microsoft.com>
---
 merge-recursive.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 372ffbbacc..cea054cfd4 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -355,6 +355,7 @@ static int git_merge_trees(struct merge_options *o,
o->unpack_opts.fn = threeway_merge;
o->unpack_opts.src_index = _index;
o->unpack_opts.dst_index = _index;
+   o->unpack_opts.aggressive = !merge_detect_rename(o);
setup_unpack_trees_porcelain(>unpack_opts, "merge");
 
init_tree_desc_from_tree(t+0, common);
-- 
2.17.0.windows.1



Re: [PATCH v3 3/3] merge: pass aggressive when rename detection is turned off

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart <ben.pe...@microsoft.com> wrote:
> Set aggressive flag in git_merge_trees() when rename detection is turned off.
> This allows read_tree() to auto resolve more cases that would have otherwise
> been handled by the rename detection.
>
> Reviewed-by: Johannes Schindelin <johannes.schinde...@gmx.de>
> Signed-off-by: Ben Peart <benpe...@microsoft.com>
> ---
>  merge-recursive.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 2637d34d87..6cc4404144 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -276,6 +276,7 @@ static void init_tree_desc_from_tree(struct tree_desc 
> *desc, struct tree *tree)
>  }
>
>  static int git_merge_trees(int index_only,
> +  int aggressive,
>struct tree *common,
>struct tree *head,
>struct tree *merge)
> @@ -294,6 +295,7 @@ static int git_merge_trees(int index_only,
> opts.fn = threeway_merge;
> opts.src_index = _index;
> opts.dst_index = _index;
> +   opts.aggressive = aggressive;
> setup_unpack_trees_porcelain(, "merge");
>
> init_tree_desc_from_tree(t+0, common);
> @@ -1993,7 +1995,7 @@ int merge_trees(struct merge_options *o,
> return 1;
> }
>
> -   code = git_merge_trees(o->call_depth, common, head, merge);
> +   code = git_merge_trees(o->call_depth, !merge_detect_rename(o), 
> common, head, merge);
>
> if (code != 0) {
> if (show(o, 4) || o->call_depth)
> --
> 2.17.0.windows.1

Patch looks fine but as a heads up -- since merge_options is a
parameter in git_merge_trees after the
en/rename-directory-detection-reboot lands, we'll be able to switch
this patch to set opts.aggressive directly instead of needing to pass
it in as a parameter.


[PATCH v3 3/3] merge: pass aggressive when rename detection is turned off

2018-04-26 Thread Ben Peart
Set aggressive flag in git_merge_trees() when rename detection is turned off.
This allows read_tree() to auto resolve more cases that would have otherwise
been handled by the rename detection.

Reviewed-by: Johannes Schindelin <johannes.schinde...@gmx.de>
Signed-off-by: Ben Peart <benpe...@microsoft.com>
---
 merge-recursive.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2637d34d87..6cc4404144 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -276,6 +276,7 @@ static void init_tree_desc_from_tree(struct tree_desc 
*desc, struct tree *tree)
 }
 
 static int git_merge_trees(int index_only,
+  int aggressive,
   struct tree *common,
   struct tree *head,
   struct tree *merge)
@@ -294,6 +295,7 @@ static int git_merge_trees(int index_only,
opts.fn = threeway_merge;
opts.src_index = _index;
opts.dst_index = _index;
+   opts.aggressive = aggressive;
setup_unpack_trees_porcelain(, "merge");
 
init_tree_desc_from_tree(t+0, common);
@@ -1993,7 +1995,7 @@ int merge_trees(struct merge_options *o,
return 1;
}
 
-   code = git_merge_trees(o->call_depth, common, head, merge);
+   code = git_merge_trees(o->call_depth, !merge_detect_rename(o), common, 
head, merge);
 
if (code != 0) {
if (show(o, 4) || o->call_depth)
-- 
2.17.0.windows.1



Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Elijah Newren
On Mon, Apr 23, 2018 at 4:46 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Elijah Newren <new...@gmail.com> writes:
>
>> Out of 53288 merge commits with exactly two parents in linux.git:
>>   - 48491 merged identically
>>   - 4737 merged the same other than a few different "Auto-merging
>> " output lines (as expected due to patch 35/36)
>>   - 53 merged the same other than different "Checking out files: ..."
>> output (I just did a plain merge; no flags like --no-progress)
>>   - the remaining 7 commits had non-trivial merge differences, all
>> attributable to directory rename detection kicking in
>>
>> So, it looks good to me.  If anyone has suggestions for other testing
>> to do, let me know.
>
> There must have been some merges that stopped due to conflicts among
> those 50k, and I am interested to hear how they were different.  Or
> are they included in the above numbers (e.g. among 48491 there were
> ones that stopped with conflicts, but the results these conflictted
> merge left in the working tree and the index were identical)?

They are included in the categories listed above.  What my comparison
did was for each of the 53288 commits:

1) Do the merge, capture stdout and stderr, and the exit status
2) Record output of 'git ls-files -s'
3) Record output of 'git status | grep -v detached'
4) Record contents of every untracked file (could be created e.g. due
to D/F conflicts)
5) Record contents of 'git diff -M --staged'
6) Record contents of 'git diff -M'
(all of this stuff in 1-6 is recorded into a single text file with
some nice headers to split the sections up).

7) Repeat steps 1-6 with the new version of git, but recording into a
different filename
8) Compare the two text files to see what was different between the
two merges, if anything.
(If they are different, save the files somewhere for me to look at later.)


Then after each merge, there's a bunch of cleanup to make sure things
are in a pristine state for the next merge.


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Junio C Hamano
Elijah Newren <new...@gmail.com> writes:

> Out of 53288 merge commits with exactly two parents in linux.git:
>   - 48491 merged identically
>   - 4737 merged the same other than a few different "Auto-merging
> " output lines (as expected due to patch 35/36)
>   - 53 merged the same other than different "Checking out files: ..."
> output (I just did a plain merge; no flags like --no-progress)
>   - the remaining 7 commits had non-trivial merge differences, all
> attributable to directory rename detection kicking in
>
> So, it looks good to me.  If anyone has suggestions for other testing
> to do, let me know.

There must have been some merges that stopped due to conflicts among
those 50k, and I am interested to hear how they were different.  Or
are they included in the above numbers (e.g. among 48491 there were
ones that stopped with conflicts, but the results these conflictted
merge left in the working tree and the index were identical)?


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Elijah Newren
Hi Junio,

On Thu, Apr 19, 2018 at 8:05 PM, Junio C Hamano <gits...@pobox.com> wrote:
>> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:
>>> This series is a reboot of the directory rename detection series that was
>>> merged to master and then reverted due to the final patch having a buggy
>>> can-skip-update check, as noted at
>>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>>> This series based on top of master.
>
> The series as-is is fine, I think, from the maintainer's point of
> view.  Thanks.

Sorry to be a pest, but now I'm unsure how I should handle the next
round.  I've got:
- two minor fixup commits that can be trivially squashed in (not yet
sent), affecting just the final few patches
- a "year" vs "years" typo in commit message of patch 32 (which is now
in pu as commit 3daa9b3eb6dd)
- an (independent-ish) unpack_trees fix (Message-ID:
20180421193736.12722-1-new...@gmail.com), possibly to be supplemented
by another fix/improvement suggested by Duy

Should I...
- send out a reroll of everything, and include the unpack_trees
fix(es) in the series?
- just resend patches 32-36 with the fixes, and renumber the patches
to include the unpack_trees stuff in the middle?
- just send the two fixup commits, ignore the minor typo, and keep the
unpack_trees fix(es) as a separate topic that we'll just want to
advance first?
- something else?

Thanks,
Elijah


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Elijah Newren
On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:
> Additional testing:
>
>   * I've re-merged all ~13k merge commits in git.git with both
> git-2.17.0 and this version of git, comparing the results to each
> other in detail.  (Including stdout & stderr, as well as the output
> of subsequent commands like `git status`, `git ls-files -s`, `git
> diff -M`, `git diff -M --staged`).  The only differences were in 23
> merges of either git-gui or gitk which involved directory renames
> (e.g. git-2.17.0's merge would result in files like 'lib/tools.tcl'
> or 'po/ru.po' instead of the expected 'git-gui/lib/tools.tcl' or
> 'gitk-git/po/ru.po')
>
>   * I'm trying to do the same with linux.git, but it looks like that will
> take nearly a week to complete...

Results after restarting[1] and throwing some big hardware at it to
get faster completion:

Out of 53288 merge commits with exactly two parents in linux.git:
  - 48491 merged identically
  - 4737 merged the same other than a few different "Auto-merging
" output lines (as expected due to patch 35/36)
  - 53 merged the same other than different "Checking out files: ..."
output (I just did a plain merge; no flags like --no-progress)
  - the remaining 7 commits had non-trivial merge differences, all
attributable to directory rename detection kicking in

So, it looks good to me.  If anyone has suggestions for other testing
to do, let me know.

[1] Restarted so it could include my unpack_trees fix (from
message-id20180421193736.12722-1-new...@gmail.com) plus a couple minor
fixup commits (fixing some testcase nits and a comment typo).


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Junio C Hamano
Elijah Newren <new...@gmail.com> writes:

> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

The series as-is is fine, I think, from the maintainer's point of
view.  Thanks.


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Elijah Newren
On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren <new...@gmail.com> wrote:
> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

Sorry, user error; there are no conflicts with my series.

(I accidentally included Junio's interim round of my own series and
while trying to spot problems I saw commits from these other series
touching relevant files in what looked like nearby areas.  Directly
merging with these other two series or even merging all of pu before
en/rename-directory-detection-reboot followed by individually merging
later series has no conflicts with any of my changes.)


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Derrick Stolee

On 4/19/2018 2:41 PM, Stefan Beller wrote:

On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren <new...@gmail.com> wrote:

On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:

This series is a reboot of the directory rename detection series that was
merged to master and then reverted due to the final patch having a buggy
can-skip-update check, as noted at
   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
This series based on top of master.

...and merges cleanly to next but apparently has some minor conflicts
with both ds/lazy-load-trees and ps/test-chmtime-get from pu.

What's the preferred way to resolve this?  Rebase and resubmit my
series on pu, or something else?

If you were to base it off of pu, this series would depend on all other
series that pu contains. This is bad for the progress of this series.
(If it were to be merged to next, all other series would automatically
merge to next as well)

If the conflicts are minor, then Junio resolves them; if you want to be
nice, pick your merge point as

 git checkout origin/master
 git merge ds/lazy-load-trees
 git merge ps/test-chmtime-get
 git tag my-anchor

and put the series on top of that anchor.

If you do this, you'd want to be reasonably sure that
those two series are not in too much flux.


I believe ds/lazy-load-trees is queued for 'next'. I'm not surprised 
that there are some conflicts here. Any reference to the 'tree' member 
of a commit should be replaced with 'get_commit_tree(c)', or 
'get_commit_tree_oid(c)' if you only care about the tree's object id.


I think Stefan's suggestion is the best approach to get the right 
conflicts out of the way.


Thanks,
-Stolee


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Stefan Beller
On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren <new...@gmail.com> wrote:
> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

If you were to base it off of pu, this series would depend on all other
series that pu contains. This is bad for the progress of this series.
(If it were to be merged to next, all other series would automatically
merge to next as well)

If the conflicts are minor, then Junio resolves them; if you want to be
nice, pick your merge point as

git checkout origin/master
git merge ds/lazy-load-trees
git merge ps/test-chmtime-get
git tag my-anchor

and put the series on top of that anchor.

If you do this, you'd want to be reasonably sure that
those two series are not in too much flux.

Thanks,
Stefan


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Elijah Newren
On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <new...@gmail.com> wrote:
> This series is a reboot of the directory rename detection series that was
> merged to master and then reverted due to the final patch having a buggy
> can-skip-update check, as noted at
>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
> This series based on top of master.

...and merges cleanly to next but apparently has some minor conflicts
with both ds/lazy-load-trees and ps/test-chmtime-get from pu.

What's the preferred way to resolve this?  Rebase and resubmit my
series on pu, or something else?


[PATCH v10 05/36] directory rename detection: files/directories in the way of some renames

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 330 
 1 file changed, 330 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 713ad2b75e..b469c807c2 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -850,4 +850,334 @@ test_expect_success '4a-check: Directory split, with 
original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###
 
+
+###
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit B: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+# become y/f, we did not apply this for z/e_1 because that would
+# give us an add/add conflict for y/e_1 vs y/e_2.  This problem with
+# this add/add, is that both versions of y/e are from the same side
+# of history, giving us no way to represent this conflict in the
+# index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to 
original and target' '
+   test_create_repo 5a &&
+   (
+   cd 5a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e1 >z/e &&
+   echo f >z/f &&
+   echo e2 >y/e &&
+   git add z/e z/f y/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to 
original and target' '
+   (
+   cd 5a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*implicit dir rename" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
+   git rev-parse >expect \
+O:z/b  O:z/c  O:y/d  A:y/e  A:z/e  A:z/f &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: y/{b,c,d_2}
+#   Commit B: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit B were to be involved in dir rename detection, as
+# we normaly would since z/ is being renamed to y/, then this would be
+# a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+# conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+# representable in the index, so the existence of y/d_3 needs to
+# cause us to bail on directory rename detection for that path, falling
+# back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add 
conflict' '
+   test_create_repo 5b &&
+   (
+   cd 5b &&

[PATCH v10 03/36] directory rename detection: testcases to avoid taking detection too far

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 153 
 1 file changed, 153 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b22a9052b3..8049ed5fc9 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -582,4 +582,157 @@ test_expect_success '2b-check: Directory split into two 
on one side, with equal
 #   messages are handled correctly.
 ###
 
+
+###
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: z/{b,c,d} (no change)
+#   Commit B: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on 
other side' '
+   test_create_repo 3a &&
+   (
+   cd 3a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   git commit --allow-empty -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on 
other side' '
+   (
+   cd 3a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/bO:z/cO:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}, x/d
+#   Commit B: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that since z/d is already involved as
+# a source in a file rename on the same side of history, that we don't
+# get it involved in directory rename detection.  If it were, we might
+# end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+# rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on 
current side' '
+   test_create_repo 3b &&
+   (
+   cd 3b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   git mv z/d w/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3b-check: Av

[PATCH v10 11/36] directory rename detection: tests for handling overwriting dirty files

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 458 
 1 file changed, 458 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index a6cd38336c..8ea9ec49bc 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3246,4 +3246,462 @@ test_expect_failure '10e-check: Does git complain about 
untracked file that is n
)
 '
 
+###
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit O: z/{a,b_v1},
+#   Commit A: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit B: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+# z/a, staged version of z/c has sha1sum matching B:z/b_v2,
+# z/c~HEAD with contents of B:z/b_v2,
+# z/c with uncommitted mods on top of A:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple 
rename' '
+   test_create_repo 11a &&
+   (
+   cd 11a &&
+
+   mkdir z &&
+   echo a >z/a &&
+   test_seq 1 10 >z/b &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z/b z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo 11 >>z/b &&
+   git add z/b &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple 
rename' '
+   (
+   cd 11a &&
+
+   git checkout A^0 &&
+   echo stuff >>z/c &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+   test_seq 1 10 >expected &&
+   echo stuff >>expected &&
+   test_cmp expected z/c &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   :0:z/a :2:z/c &&
+   git rev-parse >expect \
+O:z/a  B:z/b &&
+   test_cmp expect actual &&
+
+   git hash-object z/c~HEAD >actual &&
+   git rev-parse B:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit O: z/a, x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,   and z/c_v1 has uncommitted mods
+#   Commit B: y/a, x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+# ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory 
rename' '
+   test_create_repo 11b &&
+   (
+   cd 11b &&
+
+   mkdir z x &&
+   echo a >z/a &&
+   echo b >x/b &&
+   test_seq 1 10 >x/c &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+  

[PATCH v10 01/36] directory rename detection: basic testcases

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 442 
 1 file changed, 442 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 00..d045f0e31e
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,442 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#  A
+#  o
+# / \
+#  O o   ?
+# \ /
+#  o
+#  B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Basic cases we should be able to handle
+###
+
+# Testcase 1a, Basic directory rename.
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+   test_create_repo 1a &&
+   (
+   cd 1a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   mkdir z/e &&
+   echo f >z/e/f &&
+   git add z/d z/e/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+   (
+   cd 1a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
+   git rev-parse >expect \
+   O:z/bO:z/cB:z/dB:z/e/f &&
+   test_cmp expect actual &&
+
+   git hash-object y/d >actual &&
+   git rev-parse B:z/d >expect &&
+   test_cmp expect actual &&
+
+   test_must_fail git rev-parse HEAD:z/d &&
+   test_must_fail git rev-parse HEAD:z/e/f &&
+   test_path_is_missing z/d &&
+   test_path_is_missing z/e/f
+   )
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e}, y/d
+#   Commit B: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+   test_create_repo 1b &&
+   (
+   cd 1b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y &&
+   git mv z/c y &&
+   rmdir z &&
+   test

[PATCH v10 02/36] directory rename detection: directory splitting testcases

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
Signed-off-by: Junio C Hamano 
---
 t/t6043-merge-rename-directories.sh | 143 
 1 file changed, 143 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index d045f0e31e..b22a9052b3 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -439,4 +439,147 @@ test_expect_failure '1f-check: Split a directory into two 
other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###
 
+
+###
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2a &&
+   (
+   cd 2a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*directory rename split" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:z/d &&
+   git rev-parse >expect \
+O:z/b  O:z/c  B:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2b &&
+   (
+   cd 2b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir x &&
+   echo d >x/d &&
+   git add x/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2b &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 >out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:x/d &&
+   git rev-parse >expect \
+O:z/b  O:z/c  B:x/d &&
+   test_cmp expect actual &&
+   test_i18ngrep ! "CONFLICT.*directory rename split" out
+   )
+'
+

[PATCH v10 09/36] directory rename detection: miscellaneous testcases to complete coverage

2018-04-19 Thread Elijah Newren
I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
Signed-off-by: Junio C Hamano 
---
 t/t6043-merge-rename-directories.sh | 565 +++-
 1 file changed, 564 insertions(+), 1 deletion(-)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index e211e8ca31..cbbb949014 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -305,6 +305,7 @@ test_expect_failure '1d-check: Directory renames cause a 
rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit O: z/{oldb,oldc}
 #   Commit A: y/{newb,newc}
 #   Commit B: z/{oldb,oldc,d}
@@ -593,7 +594,7 @@ test_expect_success '2b-check: Directory split into two on 
one side, with equal
 ###
 
 # Testcase 3a, Avoid implicit rename if involved as source on other side
-#   (Related to testcases 1c and 1f)
+#   (Related to testcases 1c, 1f, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: z/{b,c,d} (no change)
 #   Commit B: y/{b,c}, x/d
@@ -2316,4 +2317,566 @@ test_expect_failure '8e-check: Both sides rename, one 
side adds to original dire
)
 '
 
+###
+# SECTION 9: Other testcases
+#
+# This section consists of miscellaneous testcases I thought of during
+# the implementation which round out the testing.
+###
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d/{e,f,g}}
+#   Commit A: y/{b,c}, x/w/{e,f,g}
+#   Commit B: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+# is split into multiple other directories, we determine by the weight
+# of which one had the most paths going to it.  A naive implementation
+# of that could take the new file in commit B at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed 
directory' '
+   test_create_repo 9a &&
+   (
+   cd 9a &&
+
+   mkdir -p z/d &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo e >z/d/e &&
+   echo f >z/d/f &&
+   echo g >z/d/g &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir x &&
+   git mv z/d x/w &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo h >z/d/h &&
+   echo i >z/i &&
+   git add z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed 
directory' '
+   (
+   cd 9a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/i &&
+   git rev-parse >expect \
+   O:z/bO:z/cB:z/i &&
+   test_cmp expect actual &&
+
+   git rev-parse >actual \
+   HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
+   git rev-parse >expect \
+   O:z/d/eO:z/d/fO:z/d/gB:z/d/h &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c},   x/d_1
+#   Commit A: y/{b,c},   x/d_2
+#   Commit B: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+   test_create_repo 9b &&
+   (
+   cd 9b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   test_seq 1 10 >x/d &&
+   git add z x &&
+   test_tick &&

[PATCH v10 04/36] directory rename detection: partially renamed directory testcase/discussion

2018-04-19 Thread Elijah Newren
Add a long note about why we are not considering "partial directory
renames" for the current directory rename detection implementation.

Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 115 
 1 file changed, 115 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 8049ed5fc9..713ad2b75e 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -735,4 +735,119 @@ test_expect_success '3b-check: Avoid implicit rename if 
involved as source on cu
 #   of a rename on either side of a merge.
 ###
 
+
+###
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commmit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, that some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# Note that the wording of the rule ("We don't do directory rename
+# detection if the directory still exists on both sides of the merge.")
+# also excludes "renaming" of a directory into a subdirectory of itself
+# (e.g. /some/dir/* -> /some/dir/subdir/*).  It may be possible to carve
+# out an exception for "renaming"-beneath-itself cases without opening
+# weird edge/corner cases for other partial directory renames, but for now
+# we are keeping the rule simple.
+#
+# This section contains a test for a partially-renamed-directory case.
+###
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, z/e
+#   Commit B: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still 
present' '
+   test_create_repo 4a &&
+   (
+   cd 4a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   echo e >z/e &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo f >z/f &&
+   git add z/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '4a-check: Directory split, with original directory still 
present' '
+   (
+   cd 4a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 5 out &&
+  

[PATCH v10 08/36] directory rename detection: testcases exploring possibly suboptimal merges

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 404 
 1 file changed, 404 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 6db1439675..e211e8ca31 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1912,4 +1912,408 @@ test_expect_failure '7e-check: transitive rename in 
rename/delete AND dirs in th
)
 '
 
+###
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit O. x/{a,b},   y/{c,d}
+#   Commit A. x/{a,b,e}, y/{c,d,f}
+#   Commit B. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: y/{a,b,f},   z/{c,d},   x/e
+#   Currently expected:   y/{a,b,e,f}, z/{c,d}
+#   Optimal:  y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did without, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' 
'
+   test_create_repo 8a &&
+   (
+   cd 8a &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a >x/a &&
+   echo b >x/b &&
+   echo c >y/c &&
+   echo d >y/d &&
+   git add x y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >x/e &&
+   echo f >y/f &&
+   git add x/e y/f &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv y z &&
+   git mv x y &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' 
'
+   (
+   cd 8a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
+   git rev-parse >expect \
+   O:x/a    O:x/b    A:x/eA:y/fO:y/cO:y/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with 
conflicting filenames
+#   Commit O. x/{a_1,b_1}, y/{a_2,b_2}
+#   Commit A. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit B. y/{a_1,b_1}, z/{a_2,b_2}
+#
+#   w/o dir-rename detection: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Currently expected:   
+#   Scary:    y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, 
e_1 vs. e_2)
+#   Optimal:  y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit A, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and sectio

[PATCH v10 06/36] directory rename detection: testcases checking which side did the rename

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 336 
 1 file changed, 336 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b469c807c2..9ae245a522 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1180,4 +1180,340 @@ test_expect_failure '5d-check: Directory/file/file 
conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###
 
+
+###
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###
+
+# Testcase 6a, Tricky rename/delete
+#   Commit O: z/{b,c,d}
+#   Commit A: z/b
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+# them under y/ doesn't accidentally catch z/d and make it look like
+# it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+   test_create_repo 6a &&
+   (
+   cd 6a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git rm z/d &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+   (
+   cd 6a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :3:y/c &&
+       git rev-parse >expect \
+O:z/b  O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+# but B did that rename and still decided to put the file into z/,
+# so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+   test_create_repo 6b &&
+   (
+   cd 6b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z y &&
+   mkdir z &&
+   echo d >z/d &&
+ 

[PATCH v10 10/36] directory rename detection: tests for handling overwriting untracked files

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
Signed-off-by: Junio C Hamano 
---
 t/t6043-merge-rename-directories.sh | 367 
 1 file changed, 367 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index cbbb949014..a6cd38336c 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2879,4 +2879,371 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
 #   side of history for any implicit directory renames.
 ###
 
+###
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit O: z/{b,c_1}
+#   Commit A: z/b + untracked z/c + untracked z/d
+#   Commit B: z/{b,d_1}
+#   Expected: Aborted Merge +
+#   ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' 
'
+   test_create_repo 10a &&
+   (
+   cd 10a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' 
'
+   (
+   cd 10a &&
+
+   git checkout A^0 &&
+   echo very >z/c &&
+   echo important >z/d &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "The following untracked working tree files would 
be overwritten by merge" err &&
+
+   git ls-files -s >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   echo very >expect &&
+   test_cmp expect z/c &&
+
+   echo important >expect &&
+   test_cmp expect z/d &&
+
+   git rev-parse HEAD:z/b >actual &&
+   git rev-parse O:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit O: z/{b,c_1}
+#   Commit A: y/b + untracked y/{c,d,e}
+#   Commit B: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+# z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#   ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+   test_create_repo 10b &&
+   (
+   cd 10b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git mv z/ y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+   (
+   cd 10b &&
+
+   git checkout A^0 &&
+   echo very >y/c &&
+   echo important >y/d &&
+   echo contents >y/e &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "CONFLICT (rename/delete).*Version B\^0 of y/d 
left in tree at y/d~B\^0" out &&
+   test_i18ngrep "Error: Refusing to lose untracked file at y/e; 
writing to y/e~B\^0 instead" out &&
+
+   git ls-files -s >out 

[PATCH v10 27/36] directory rename detection: new testcases showcasing a pair of bugs

2018-04-19 Thread Elijah Newren
Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.

Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug.  Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.

Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
Signed-off-by: Junio C Hamano <gits...@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 296 
 1 file changed, 296 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 33e2649824..5b84591445 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -159,6 +159,7 @@ test_expect_success '1b-check: Merge a directory with 
another' '
 # Testcase 1c, Transitive renaming
 #   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
 #   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   (Related to testcase 12b -- joint-transitivity?)
 #   Commit O: z/{b,c},   x/d
 #   Commit A: y/{b,c},   x/d
 #   Commit B: z/{b,c,d}
@@ -2871,6 +2872,68 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
)
 '
 
+# Testcase 9h, Avoid implicit rename if involved as source on other side
+#   (Extremely closely related to testcase 3a)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: z/{b,c,d_2}
+#   Commit B: y/{b,c}, x/d_1
+#   Expected: y/{b,c}, x/d_2
+#   NOTE: If we applied the z/ -> y/ rename to z/d, then we'd end up with
+# a rename/rename(1to2) conflict (z/d -> y/d vs. x/d)
+test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
+   test_create_repo 9h &&
+   (
+   cd 9h &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   printf "1\n2\n3\n4\n5\n6\n7\n8\nd\n" >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   echo more >>z/d &&
+   git add z/d &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+   (
+   cd 9h &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/bO:z/cA:z/d &&
+   test_cmp expect actual
+   )
+'
+
 ###
 # Rules suggested by section 9:
 #
@@ -3704,4 +3767,237 @@ test_expect_success '11f-check: Avoid deleting 
not-uptodate with dir rename/rena
)
 '
 
+###
+# SECTION 12: Everything else
+#
+# Tests suggested by others.  Tests added after implementation completed
+# and submitted.  Grab bag.
+###
+
+# Testcase 12a, Moving one directory hierarchy into another
+#   (Related to testcase 9a)
+#   Commit O: node1/{leaf1,leaf2}, node2/{leaf3,leaf4}
+#   Commit A: node1/{leaf1,leaf2,node2/{leaf3,leaf4}}
+#   Commit B: node1/{leaf1,leaf2,leaf5}, node2/{leaf3,leaf4,leaf6}
+#   Expected: node1/{leaf1,leaf2,leaf5,node2/{leaf3,leaf4,leaf6}}
+
+test_expect_success '12a-setup: Moving one directory hierarchy into another' '
+   test_create_repo 12a &&
+   (
+   cd 12a &&
+
+   mkdir -p node1 node2 &&
+   echo leaf1 >node1/leaf1 &&
+   echo leaf2 >node1/leaf2 &&
+   echo leaf3 >node2/leaf3 &&
+   echo leaf4 >node2/leaf4 &am

[PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Elijah Newren
This series is a reboot of the directory rename detection series that was
merged to master and then reverted due to the final patch having a buggy
can-skip-update check, as noted at
  https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
This series based on top of master.

This updated series fixes the problem found with the previous series, and
also fixes Linus' issue with unnecessary rebuilds noted at
  
https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fxjrkvu1m_rhbg54nooa...@mail.gmail.com/

For the original details about design considerations surrounding
directory rename detection, see
  https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/

Patches 1--28 are identical to what was previously merged to master,
modulo trivial compilation fixes due to the fact that I've rebased on
master which now includes commit 916bc35b29af ("tree-walk: convert tree
entry functions to object_id", 2018-03-12).  As such, I've retained the
Reviewed-by and Signed-off-by tags for these first 28 patches.  (The
final patch of the original series, patch 29, has been rewritten and
replaced in this series.)

The remaining eight patches are new; a brief summary:

  merge-recursive: improve add_cacheinfo error handling
  merge-recursive: move more is_dirty handling to merge_content
  merge-recursive: avoid triggering add_cacheinfo error with dirty mod

When Junio was bit by the previous series, the code reached a
detected error state that should not ever be hit in production.
That was bad enough, but the problem compounded because the code
simply printed a vague not-very-scary-sounding error, and returned
an error code that the caller ignored (which not only proceeded to
then handle other paths which might print messages causing the error
to scroll off the screen, but could result in a "clean" merge).  Fix
issues with the error handling...and then deal with the breakage of
one particular test that was triggering this codepath.

  t6046: testcases checking whether updates can be skipped in a merge

Add a fairly comprehensive set of tests for the skipability of
working tree updates.

  merge-recursive: fix was_tracked() to quit lying with some renamed
paths
  merge-recursive: fix remainder of was_dirty() to use original index

Instead of using the current index as a (rather imperfect) proxy for
the state of the index just before the merge, keep a copy of the
original index around so we can get correct answers to whether
certain paths were tracked or dirty before the merge.

  merge-recursive: make "Auto-merging" comment show for other merges
  merge-recursive: fix check for skipability of working tree updates

Fix and simplify the skipability check.  Due to some tests being
picky about output, the first of these two patches exists to avoid
triggering the "Auto-merging $FILE" message too often with the
simplified logic; in the process, it fixes a pair of existing issues
with when those messages are shown, making it more accurate in
general.

Additional testing:

  * I've re-merged all ~13k merge commits in git.git with both
git-2.17.0 and this version of git, comparing the results to each
other in detail.  (Including stdout & stderr, as well as the output
of subsequent commands like `git status`, `git ls-files -s`, `git
diff -M`, `git diff -M --staged`).  The only differences were in 23
merges of either git-gui or gitk which involved directory renames
(e.g. git-2.17.0's merge would result in files like 'lib/tools.tcl'
or 'po/ru.po' instead of the expected 'git-gui/lib/tools.tcl' or
'gitk-git/po/ru.po')

  * I'm trying to do the same with linux.git, but it looks like that will
take nearly a week to complete...

My biggest question:

  * Is there any other testing others would like to see, in order to avoid
a repeat of the pain from my previous series and allow us to safely
merge this newer one?

Elijah Newren (36):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
far
  directory rename detection: partially renamed directory
testcase/discussion
  directory rename detection: files/directories in the way of some
renames
  directory rename detection: testcases checking which side did the
rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
merges
  directory rename detection: miscellaneous testcases to complete
coverage
  directory rename detection: tests for handling overwriting untracked
files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename log

[PATCH v10 07/36] directory rename detection: more involved edge/corner testcases

2018-04-19 Thread Elijah Newren
Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
Signed-off-by: Junio C Hamano 
---
 t/t6043-merge-rename-directories.sh | 396 
 1 file changed, 396 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 9ae245a522..6db1439675 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1516,4 +1516,400 @@ test_expect_success '6e-check: Add/add from one side' '
 #   side of history is the one doing the renaming.
 ###
 
+
+###
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, testcases that are difficult to understand from this
+# section is due to difficulty in the testcase rather than the directory
+# renaming (similar to how t6042 and t6036 have difficult resolutions due
+# to the problem setup itself being complex).  And I don't think the
+# error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   test_create_repo 7a &&
+   (
+   cd 7a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   mkdir x &&
+   git mv z/b w/ &&
+   git mv z/c x/ &&
+   echo d > z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   (
+   cd 7a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 6 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
+   git rev-parse >expect \
+O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+   test_cmp expect actual &&
+
+   git hash-object >actual \
+   y/b   w/b   y/c   x/c &&
+   git rev-parse >expect \
+   O:z/b O:z/b O:z/c O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit O: z/{b,c}, x/d_1, w/d_2
+#   Commit A: y/{b,c,d_2}, x/d_1
+#   Commit B: z/{b,c,d_1},w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive 
rename' '
+   test_create_repo 7b &&
+   (
+   cd 7b &&
+
+   mkdir z &&
+   mkdir x &&
+   mkdir w &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d1 > x/d &&
+   

[RFC PATCH v9 0/30] Add directory rename detection to git

2018-04-13 Thread Elijah Newren
This series builds on commit febb3a86098f ("merge-recursive: avoid
spurious rename/rename conflict from dir renames", 2018-02-14), also known
as patch 29/30 of en/rename-directory-detection.  That patch series has
been reverted from master[1], due to a bug with patch 30/30, so does not
apply to current master.  But I didn't want to resend the whole series for
an RFC.

These four patches replace that patch and:
  - fixes Linus' rewriting-of-unchanged-files bug[1]
  - fixes the problems that broke Junio's merges after my series[2]
  - fixes the problem the original patch 30/30 was intended to solve
  - adds lots of testcases to make sure this doesn't regress.

Linus' alternative of stupid-brute-force[3], would also work here,
though I feel the first three patches are useful even if we take some
form of his patch.  Long term, the most correct solution would involve
a rewrite to merge-recursive that would simplify this code[4], though
I think the changes in this series brings this part of the code closer
to that end state.

The big questions here are:
  1) The last time my rename-directory-detection series was merged into
 master it bit Junio badly.  I'm planning to redo all merges of
 git.git and linux.git and comparing v2.17.0 to what I get after
 my changes.  What else should I test?
  2) What do folks thing about stupid-brute-force vs. the explanation in
 my final patch?


[1] 
https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fxjrkvu1m_rhbg54nooa...@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
[3] 
https://public-inbox.org/git/CA+55aFwi9pTAJT_qtv=vHLgu=B1fdXBoD96i8Y5xnbS=zrf...@mail.gmail.com/
[4] https://public-inbox.org/git/xmqqd147kpdm@gitster.mtv.corp.google.com/


Elijah Newren (4):
  merge-recursive: improve output precision around skipping updates
  t6046: testcases checking whether updates can be skipped in a merge
  merge-recursive: Fix was_tracked() to quit lying with some renamed
paths
  merge-recursive: fix check for skipability of working tree updates

 merge-recursive.c  | 109 +---
 merge-recursive.h  |   1 +
 t/t6022-merge-rename.sh|   2 +-
 t/t6043-merge-rename-directories.sh|   2 +-
 t/t6046-merge-skip-unneeded-updates.sh | 497 +
 5 files changed, 575 insertions(+), 36 deletions(-)
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

-- 
2.16.0.35.g6dd7ede834



Re: [PATCH v8 00/29] Add directory rename detection to git

2018-02-14 Thread Stefan Beller
On Wed, Feb 14, 2018 at 10:51 AM, Elijah Newren <new...@gmail.com> wrote:
> This patchset introduces directory rename detection to merge-recursive.  See
>   https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/
> for the first series (including design considerations, etc.)  This series
> continues to depend on en/merge-recursive-fixes in next, at least
> contextually.  For the curious, follow-up series and comments can also be
> found at
>   https://public-inbox.org/git/20171120220209.15111-1-new...@gmail.com/
>   https://public-inbox.org/git/20171121080059.32304-1-new...@gmail.com/
>   https://public-inbox.org/git/20171129014237.32570-1-new...@gmail.com/
>   https://public-inbox.org/git/20171228041352.27880-1-new...@gmail.com/
>   https://public-inbox.org/git/20180105202711.24311-1-new...@gmail.com/
>   https://public-inbox.org/git/20180130232533.25846-1-new...@gmail.com/
>
> Also, as a reminder, this series fixes a few bugs somewhat as a side effect:
>   * a bug causing dirty files involved in a rename to be overwritten
>   * a few memory leaks
>
> Changes since v7 (full tbdiff follows below):
>   * Added Stefan's Reviewed-by.
>   * Squashed commits introducing new hash structs and associated functions
> into the commit that used them to avoid unused function
> warnings/errors.
>   * Added or clarified a number of comments where things were unclear
>   * Minor stuff:
> * Style (and typo) fixes for commit message and comments
> * Avoiding casting with hash initialization function
> * s/malloc/xmalloc/
> * struct assignment
> * s/20/GIT_MAX_RAWSZ/

Even the interdiff has Stefan's Reviewed-by.

Thanks for being persistent,
Stefan


[PATCH v8 10/29] directory rename detection: tests for handling overwriting untracked files

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 367 
 1 file changed, 367 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index cbbb949014..a6cd38336c 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2879,4 +2879,371 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
 #   side of history for any implicit directory renames.
 ###
 
+###
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit O: z/{b,c_1}
+#   Commit A: z/b + untracked z/c + untracked z/d
+#   Commit B: z/{b,d_1}
+#   Expected: Aborted Merge +
+#   ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' 
'
+   test_create_repo 10a &&
+   (
+   cd 10a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' 
'
+   (
+   cd 10a &&
+
+   git checkout A^0 &&
+   echo very >z/c &&
+   echo important >z/d &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "The following untracked working tree files would 
be overwritten by merge" err &&
+
+   git ls-files -s >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   echo very >expect &&
+   test_cmp expect z/c &&
+
+   echo important >expect &&
+   test_cmp expect z/d &&
+
+   git rev-parse HEAD:z/b >actual &&
+   git rev-parse O:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit O: z/{b,c_1}
+#   Commit A: y/b + untracked y/{c,d,e}
+#   Commit B: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+# z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#   ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+   test_create_repo 10b &&
+   (
+   cd 10b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git mv z/ y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+   (
+   cd 10b &&
+
+   git checkout A^0 &&
+   echo very >y/c &&
+   echo important >y/d &&
+   echo contents >y/e &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "CONFLICT (rename/delete).*Version B\^0 of y/d 
left in tree at y/d~B\^0" out &&
+   test_i18ngrep "Error: Refusing to lose untracked file at y/e; 
writing to y/e~B\^0 instead" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out 

[PATCH v8 08/29] directory rename detection: testcases exploring possibly suboptimal merges

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 404 
 1 file changed, 404 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 6db1439675..e211e8ca31 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1912,4 +1912,408 @@ test_expect_failure '7e-check: transitive rename in 
rename/delete AND dirs in th
)
 '
 
+###
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit O. x/{a,b},   y/{c,d}
+#   Commit A. x/{a,b,e}, y/{c,d,f}
+#   Commit B. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: y/{a,b,f},   z/{c,d},   x/e
+#   Currently expected:   y/{a,b,e,f}, z/{c,d}
+#   Optimal:  y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did without, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' 
'
+   test_create_repo 8a &&
+   (
+   cd 8a &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a >x/a &&
+   echo b >x/b &&
+   echo c >y/c &&
+   echo d >y/d &&
+   git add x y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >x/e &&
+   echo f >y/f &&
+   git add x/e y/f &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv y z &&
+   git mv x y &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' 
'
+   (
+   cd 8a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
+   git rev-parse >expect \
+   O:x/a    O:x/bA:x/eA:y/fO:y/cO:y/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with 
conflicting filenames
+#   Commit O. x/{a_1,b_1}, y/{a_2,b_2}
+#   Commit A. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit B. y/{a_1,b_1}, z/{a_2,b_2}
+#
+#   w/o dir-rename detection: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Currently expected:   
+#   Scary:    y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, 
e_1 vs. e_2)
+#   Optimal:  y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit A, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 save us from the Scary resolutio

[PATCH v8 27/29] directory rename detection: new testcases showcasing a pair of bugs

2018-02-14 Thread Elijah Newren
Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.

Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug.  Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.

Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 296 
 1 file changed, 296 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 33e2649824..5b84591445 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -159,6 +159,7 @@ test_expect_success '1b-check: Merge a directory with 
another' '
 # Testcase 1c, Transitive renaming
 #   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
 #   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   (Related to testcase 12b -- joint-transitivity?)
 #   Commit O: z/{b,c},   x/d
 #   Commit A: y/{b,c},   x/d
 #   Commit B: z/{b,c,d}
@@ -2871,6 +2872,68 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
)
 '
 
+# Testcase 9h, Avoid implicit rename if involved as source on other side
+#   (Extremely closely related to testcase 3a)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: z/{b,c,d_2}
+#   Commit B: y/{b,c}, x/d_1
+#   Expected: y/{b,c}, x/d_2
+#   NOTE: If we applied the z/ -> y/ rename to z/d, then we'd end up with
+# a rename/rename(1to2) conflict (z/d -> y/d vs. x/d)
+test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
+   test_create_repo 9h &&
+   (
+   cd 9h &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   printf "1\n2\n3\n4\n5\n6\n7\n8\nd\n" >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   echo more >>z/d &&
+   git add z/d &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+   (
+   cd 9h &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/bO:z/cA:z/d &&
+   test_cmp expect actual
+   )
+'
+
 ###
 # Rules suggested by section 9:
 #
@@ -3704,4 +3767,237 @@ test_expect_success '11f-check: Avoid deleting 
not-uptodate with dir rename/rena
)
 '
 
+###
+# SECTION 12: Everything else
+#
+# Tests suggested by others.  Tests added after implementation completed
+# and submitted.  Grab bag.
+###
+
+# Testcase 12a, Moving one directory hierarchy into another
+#   (Related to testcase 9a)
+#   Commit O: node1/{leaf1,leaf2}, node2/{leaf3,leaf4}
+#   Commit A: node1/{leaf1,leaf2,node2/{leaf3,leaf4}}
+#   Commit B: node1/{leaf1,leaf2,leaf5}, node2/{leaf3,leaf4,leaf6}
+#   Expected: node1/{leaf1,leaf2,leaf5,node2/{leaf3,leaf4,leaf6}}
+
+test_expect_success '12a-setup: Moving one directory hierarchy into another' '
+   test_create_repo 12a &&
+   (
+   cd 12a &&
+
+   mkdir -p node1 node2 &&
+   echo leaf1 >node1/leaf1 &&
+   echo leaf2 >node1/leaf2 &&
+   echo leaf3 >node2/leaf3 &&
+   echo leaf4 >node2/leaf4 &&
+   git add node1 node2 &

[PATCH v8 07/29] directory rename detection: more involved edge/corner testcases

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 396 
 1 file changed, 396 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 9ae245a522..6db1439675 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1516,4 +1516,400 @@ test_expect_success '6e-check: Add/add from one side' '
 #   side of history is the one doing the renaming.
 ###
 
+
+###
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, testcases that are difficult to understand from this
+# section is due to difficulty in the testcase rather than the directory
+# renaming (similar to how t6042 and t6036 have difficult resolutions due
+# to the problem setup itself being complex).  And I don't think the
+# error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   test_create_repo 7a &&
+   (
+   cd 7a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   mkdir x &&
+   git mv z/b w/ &&
+   git mv z/c x/ &&
+   echo d > z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   (
+   cd 7a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 6 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
+   git rev-parse >expect \
+O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+   test_cmp expect actual &&
+
+   git hash-object >actual \
+   y/b   w/b   y/c   x/c &&
+   git rev-parse >expect \
+   O:z/b O:z/b O:z/c O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit O: z/{b,c}, x/d_1, w/d_2
+#   Commit A: y/{b,c,d_2}, x/d_1
+#   Commit B: z/{b,c,d_1},w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive 
rename' '
+   test_create_repo 7b &&
+   (
+   cd 7b &&
+
+   mkdir z &&
+   mkdir x &&
+   mkdir w &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d1 > x/d &&
+   echo d2 > w/d &&
+   git add z x 

[PATCH v8 03/29] directory rename detection: testcases to avoid taking detection too far

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 153 
 1 file changed, 153 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b22a9052b3..8049ed5fc9 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -582,4 +582,157 @@ test_expect_success '2b-check: Directory split into two 
on one side, with equal
 #   messages are handled correctly.
 ###
 
+
+###
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: z/{b,c,d} (no change)
+#   Commit B: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on 
other side' '
+   test_create_repo 3a &&
+   (
+   cd 3a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   git commit --allow-empty -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on 
other side' '
+   (
+   cd 3a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/bO:z/cO:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}, x/d
+#   Commit B: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that since z/d is already involved as
+# a source in a file rename on the same side of history, that we don't
+# get it involved in directory rename detection.  If it were, we might
+# end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+# rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on 
current side' '
+   test_create_repo 3b &&
+   (
+   cd 3b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   git mv z/d w/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on 
current side' '
+   (
+ 

[PATCH v8 06/29] directory rename detection: testcases checking which side did the rename

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 336 
 1 file changed, 336 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b469c807c2..9ae245a522 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1180,4 +1180,340 @@ test_expect_failure '5d-check: Directory/file/file 
conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###
 
+
+###
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###
+
+# Testcase 6a, Tricky rename/delete
+#   Commit O: z/{b,c,d}
+#   Commit A: z/b
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+# them under y/ doesn't accidentally catch z/d and make it look like
+# it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+   test_create_repo 6a &&
+   (
+   cd 6a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git rm z/d &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+   (
+   cd 6a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :3:y/c &&
+   git rev-parse >expect \
+O:z/b  O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+# but B did that rename and still decided to put the file into z/,
+# so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+   test_create_repo 6b &&
+   (
+   cd 6b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z y &&
+   mkdir z &&
+   echo d >z/d &&
+   git add z/d &&
+   

[PATCH v8 09/29] directory rename detection: miscellaneous testcases to complete coverage

2018-02-14 Thread Elijah Newren
I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 565 +++-
 1 file changed, 564 insertions(+), 1 deletion(-)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index e211e8ca31..cbbb949014 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -305,6 +305,7 @@ test_expect_failure '1d-check: Directory renames cause a 
rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit O: z/{oldb,oldc}
 #   Commit A: y/{newb,newc}
 #   Commit B: z/{oldb,oldc,d}
@@ -593,7 +594,7 @@ test_expect_success '2b-check: Directory split into two on 
one side, with equal
 ###
 
 # Testcase 3a, Avoid implicit rename if involved as source on other side
-#   (Related to testcases 1c and 1f)
+#   (Related to testcases 1c, 1f, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: z/{b,c,d} (no change)
 #   Commit B: y/{b,c}, x/d
@@ -2316,4 +2317,566 @@ test_expect_failure '8e-check: Both sides rename, one 
side adds to original dire
)
 '
 
+###
+# SECTION 9: Other testcases
+#
+# This section consists of miscellaneous testcases I thought of during
+# the implementation which round out the testing.
+###
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d/{e,f,g}}
+#   Commit A: y/{b,c}, x/w/{e,f,g}
+#   Commit B: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+# is split into multiple other directories, we determine by the weight
+# of which one had the most paths going to it.  A naive implementation
+# of that could take the new file in commit B at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed 
directory' '
+   test_create_repo 9a &&
+   (
+   cd 9a &&
+
+   mkdir -p z/d &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo e >z/d/e &&
+   echo f >z/d/f &&
+   echo g >z/d/g &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir x &&
+   git mv z/d x/w &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo h >z/d/h &&
+   echo i >z/i &&
+   git add z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed 
directory' '
+   (
+   cd 9a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/i &&
+   git rev-parse >expect \
+   O:z/bO:z/cB:z/i &&
+   test_cmp expect actual &&
+
+   git rev-parse >actual \
+   HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
+   git rev-parse >expect \
+   O:z/d/eO:z/d/fO:z/d/gB:z/d/h &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c},   x/d_1
+#   Commit A: y/{b,c},   x/d_2
+#   Commit B: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+   test_create_repo 9b &&
+   (
+   cd 9b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   test_seq 1 10 >x/d &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+  

[PATCH v8 04/29] directory rename detection: partially renamed directory testcase/discussion

2018-02-14 Thread Elijah Newren
Add a long note about why we are not considering "partial directory
renames" for the current directory rename detection implementation.

Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 115 
 1 file changed, 115 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 8049ed5fc9..713ad2b75e 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -735,4 +735,119 @@ test_expect_success '3b-check: Avoid implicit rename if 
involved as source on cu
 #   of a rename on either side of a merge.
 ###
 
+
+###
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commmit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, that some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# Note that the wording of the rule ("We don't do directory rename
+# detection if the directory still exists on both sides of the merge.")
+# also excludes "renaming" of a directory into a subdirectory of itself
+# (e.g. /some/dir/* -> /some/dir/subdir/*).  It may be possible to carve
+# out an exception for "renaming"-beneath-itself cases without opening
+# weird edge/corner cases for other partial directory renames, but for now
+# we are keeping the rule simple.
+#
+# This section contains a test for a partially-renamed-directory case.
+###
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, z/e
+#   Commit B: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still 
present' '
+   test_create_repo 4a &&
+   (
+   cd 4a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   echo e >z/e &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo f >z/f &&
+   git add z/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '4a-check: Directory split, with original directory still 
present' '
+   (
+   cd 4a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 5 out &&
+   git ls-files -u >out &&
+  

[PATCH v8 11/29] directory rename detection: tests for handling overwriting dirty files

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 458 
 1 file changed, 458 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index a6cd38336c..8ea9ec49bc 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3246,4 +3246,462 @@ test_expect_failure '10e-check: Does git complain about 
untracked file that is n
)
 '
 
+###
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit O: z/{a,b_v1},
+#   Commit A: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit B: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+# z/a, staged version of z/c has sha1sum matching B:z/b_v2,
+# z/c~HEAD with contents of B:z/b_v2,
+# z/c with uncommitted mods on top of A:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple 
rename' '
+   test_create_repo 11a &&
+   (
+   cd 11a &&
+
+   mkdir z &&
+   echo a >z/a &&
+   test_seq 1 10 >z/b &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z/b z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo 11 >>z/b &&
+   git add z/b &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple 
rename' '
+   (
+   cd 11a &&
+
+   git checkout A^0 &&
+   echo stuff >>z/c &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+   test_seq 1 10 >expected &&
+   echo stuff >>expected &&
+   test_cmp expected z/c &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   :0:z/a :2:z/c &&
+   git rev-parse >expect \
+O:z/a  B:z/b &&
+   test_cmp expect actual &&
+
+   git hash-object z/c~HEAD >actual &&
+   git rev-parse B:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit O: z/a, x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,   and z/c_v1 has uncommitted mods
+#   Commit B: y/a, x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+# ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory 
rename' '
+   test_create_repo 11b &&
+   (
+   cd 11b &&
+
+   mkdir z x &&
+   echo a >z/a &&
+   echo b >x/b &&
+   test_seq 1 10 >x/c &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git

[PATCH v8 01/29] directory rename detection: basic testcases

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 442 
 1 file changed, 442 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 00..d045f0e31e
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,442 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#  A
+#  o
+# / \
+#  O o   ?
+# \ /
+#  o
+#  B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Basic cases we should be able to handle
+###
+
+# Testcase 1a, Basic directory rename.
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+   test_create_repo 1a &&
+   (
+   cd 1a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   mkdir z/e &&
+       echo f >z/e/f &&
+   git add z/d z/e/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+   (
+   cd 1a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
+   git rev-parse >expect \
+   O:z/bO:z/cB:z/dB:z/e/f &&
+   test_cmp expect actual &&
+
+   git hash-object y/d >actual &&
+   git rev-parse B:z/d >expect &&
+   test_cmp expect actual &&
+
+   test_must_fail git rev-parse HEAD:z/d &&
+   test_must_fail git rev-parse HEAD:z/e/f &&
+   test_path_is_missing z/d &&
+   test_path_is_missing z/e/f
+   )
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e}, y/d
+#   Commit B: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+   test_create_repo 1b &&
+   (
+   cd 1b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y &&
+   git mv z/c y &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B

[PATCH v8 05/29] directory rename detection: files/directories in the way of some renames

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller <sbel...@google.com>
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 330 
 1 file changed, 330 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 713ad2b75e..b469c807c2 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -850,4 +850,334 @@ test_expect_success '4a-check: Directory split, with 
original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###
 
+
+###
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit B: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+# become y/f, we did not apply this for z/e_1 because that would
+# give us an add/add conflict for y/e_1 vs y/e_2.  This problem with
+# this add/add, is that both versions of y/e are from the same side
+# of history, giving us no way to represent this conflict in the
+# index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to 
original and target' '
+   test_create_repo 5a &&
+   (
+   cd 5a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e1 >z/e &&
+   echo f >z/f &&
+   echo e2 >y/e &&
+   git add z/e z/f y/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to 
original and target' '
+   (
+   cd 5a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*implicit dir rename" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
+   git rev-parse >expect \
+O:z/b  O:z/c  O:y/d  A:y/e  A:z/e  A:z/f &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: y/{b,c,d_2}
+#   Commit B: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit B were to be involved in dir rename detection, as
+# we normaly would since z/ is being renamed to y/, then this would be
+# a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+# conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+# representable in the index, so the existence of y/d_3 needs to
+# cause us to bail on directory rename detection for that path, falling
+# back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add 
conflict' '
+   test_create_repo 5b &&
+   (
+   cd 5b &&
+
+   mkdir z &&

[PATCH v8 00/29] Add directory rename detection to git

2018-02-14 Thread Elijah Newren
This patchset introduces directory rename detection to merge-recursive.  See
  https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/
for the first series (including design considerations, etc.)  This series
continues to depend on en/merge-recursive-fixes in next, at least
contextually.  For the curious, follow-up series and comments can also be
found at
  https://public-inbox.org/git/20171120220209.15111-1-new...@gmail.com/
  https://public-inbox.org/git/20171121080059.32304-1-new...@gmail.com/
  https://public-inbox.org/git/20171129014237.32570-1-new...@gmail.com/
  https://public-inbox.org/git/20171228041352.27880-1-new...@gmail.com/
  https://public-inbox.org/git/20180105202711.24311-1-new...@gmail.com/
  https://public-inbox.org/git/20180130232533.25846-1-new...@gmail.com/

Also, as a reminder, this series fixes a few bugs somewhat as a side effect:
  * a bug causing dirty files involved in a rename to be overwritten
  * a few memory leaks

Changes since v7 (full tbdiff follows below):
  * Added Stefan's Reviewed-by.
  * Squashed commits introducing new hash structs and associated functions
into the commit that used them to avoid unused function
warnings/errors.
  * Added or clarified a number of comments where things were unclear
  * Minor stuff:
* Style (and typo) fixes for commit message and comments
* Avoiding casting with hash initialization function
* s/malloc/xmalloc/
* struct assignment
* s/20/GIT_MAX_RAWSZ/

Elijah Newren (29):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
far
  directory rename detection: partially renamed directory
testcase/discussion
  directory rename detection: files/directories in the way of some
renames
  directory rename detection: testcases checking which side did the
rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
merges
  directory rename detection: miscellaneous testcases to complete
coverage
  directory rename detection: tests for handling overwriting untracked
files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: fix leaks of allocated renames and diff_filepairs
  merge-recursive: make !o->detect_rename codepath more obvious
  merge-recursive: split out code for determining diff_filepairs
  merge-recursive: make a helper function for cleanup for handle_renames
  merge-recursive: add get_directory_renames()
  merge-recursive: check for directory level conflicts
  merge-recursive: add computation of collisions due to dir rename &
merging
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: when comparing files, don't include trees
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: avoid clobbering untracked files with directory
renames
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: fix remaining directory rename + dirty overwrite
cases
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: avoid spurious rename/rename conflict from dir
renames
  merge-recursive: ensure we write updates for directory-renamed file

 merge-recursive.c   | 1243 ++-
 merge-recursive.h   |   27 +
 strbuf.c|   16 +
 strbuf.h|   16 +
 t/t3501-revert-cherry-pick.sh   |2 +-
 t/t6043-merge-rename-directories.sh | 3998 +++
 t/t7607-merge-overwrite.sh  |2 +-
 unpack-trees.c  |4 +-
 unpack-trees.h  |4 +
 9 files changed, 5197 insertions(+), 115 deletions(-)
 create mode 100755 t/t6043-merge-rename-directories.sh

Full tbdiff (the biggest code changes come from commit squashing):

 1: 5ba69c9c7b !  1: 9f1d894d89 directory rename detection: basic testcases
@@ -2,6 +2,7 @@
 
 directory rename detection: basic testcases
 
+Reviewed-by: Stefan Beller <sbel...@google.com>
 Signed-off-by: Elijah Newren <new...@gmail.com>
 
 diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
 2: e1d23f7f95 !  2: 36a4b05757 directory rename detection: directory splitting 
testcases
@@ -2,6 +2,7 @@
 
     directory rename detection: directory splitting testcases
 
+Reviewed-by: Stefan Beller <sbel...@google.com>
 Signed-off-by: Elijah Newren <new...@gmail.com>
 
 diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
 3: b10cb4

[PATCH v8 02/29] directory rename detection: directory splitting testcases

2018-02-14 Thread Elijah Newren
Reviewed-by: Stefan Beller 
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 143 
 1 file changed, 143 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index d045f0e31e..b22a9052b3 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -439,4 +439,147 @@ test_expect_failure '1f-check: Split a directory into two 
other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###
 
+
+###
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2a &&
+   (
+   cd 2a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*directory rename split" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:z/d &&
+   git rev-parse >expect \
+O:z/b  O:z/c  B:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2b &&
+   (
+   cd 2b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir x &&
+   echo d >x/d &&
+   git add x/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2b &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 >out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:x/d &&
+   git rev-parse >expect \
+O:z/b  O:z/c  B:x/d &&
+   test_cmp expect actual &&
+   test_i18ngrep ! "CONFLICT.*directory rename split" out
+   )
+'
+

[PATCH v7 09/31] directory rename detection: miscellaneous testcases to complete coverage

2018-01-30 Thread Elijah Newren
I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 565 +++-
 1 file changed, 564 insertions(+), 1 deletion(-)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index e2db5d0ac1..b730256653 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -305,6 +305,7 @@ test_expect_failure '1d-check: Directory renames cause a 
rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit O: z/{oldb,oldc}
 #   Commit A: y/{newb,newc}
 #   Commit B: z/{oldb,oldc,d}
@@ -593,7 +594,7 @@ test_expect_success '2b-check: Directory split into two on 
one side, with equal
 ###
 
 # Testcase 3a, Avoid implicit rename if involved as source on other side
-#   (Related to testcases 1c and 1f)
+#   (Related to testcases 1c, 1f, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: z/{b,c,d} (no change)
 #   Commit B: y/{b,c}, x/d
@@ -2308,4 +2309,566 @@ test_expect_failure '8e-check: Both sides rename, one 
side adds to original dire
)
 '
 
+###
+# SECTION 9: Other testcases
+#
+# This section consists of miscellaneous testcases I thought of during
+# the implementation which round out the testing.
+###
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d/{e,f,g}}
+#   Commit A: y/{b,c}, x/w/{e,f,g}
+#   Commit B: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+# is split into multiple other directories, we determine by the weight
+# of which one had the most paths going to it.  A naive implementation
+# of that could take the new file in commit B at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed 
directory' '
+   test_create_repo 9a &&
+   (
+   cd 9a &&
+
+   mkdir -p z/d &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo e >z/d/e &&
+   echo f >z/d/f &&
+   echo g >z/d/g &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir x &&
+   git mv z/d x/w &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo h >z/d/h &&
+   echo i >z/i &&
+   git add z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed 
directory' '
+   (
+   cd 9a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/i &&
+   git rev-parse >expect \
+   O:z/bO:z/cB:z/i &&
+   test_cmp expect actual &&
+
+   git rev-parse >actual \
+   HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
+   git rev-parse >expect \
+   O:z/d/eO:z/d/fO:z/d/gB:z/d/h &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c},   x/d_1
+#   Commit A: y/{b,c},   x/d_2
+#   Commit B: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+   test_create_repo 9b &&
+   (
+   cd 9b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   test_seq 1 10 >x/d &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   

[PATCH v7 07/31] directory rename detection: more involved edge/corner testcases

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 396 
 1 file changed, 396 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index fbeb8f4316..68bd86f555 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1508,4 +1508,400 @@ test_expect_success '6e-check: Add/add from one side' '
 #   side of history is the one doing the renaming.
 ###
 
+
+###
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, testcases that are difficult to understand from this
+# section is due to difficulty in the testcase rather than the directory
+# renaming (similar to how t6042 and t6036 have difficult resolutions due
+# to the problem setup itself being complex).  And I don't think the
+# error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   test_create_repo 7a &&
+   (
+   cd 7a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   mkdir x &&
+   git mv z/b w/ &&
+   git mv z/c x/ &&
+   echo d > z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   (
+   cd 7a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 6 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
+   git rev-parse >expect \
+O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+   test_cmp expect actual &&
+
+   git hash-object >actual \
+   y/b   w/b   y/c   x/c &&
+   git rev-parse >expect \
+   O:z/b O:z/b O:z/c O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit O: z/{b,c}, x/d_1, w/d_2
+#   Commit A: y/{b,c,d_2}, x/d_1
+#   Commit B: z/{b,c,d_1},w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive 
rename' '
+   test_create_repo 7b &&
+   (
+   cd 7b &&
+
+   mkdir z &&
+   mkdir x &&
+   mkdir w &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d1 > x/d &&
+   echo d2 > w/d &&
+   git add z x w &&
+   test_tick &&
+ 

[PATCH v7 08/31] directory rename detection: testcases exploring possibly suboptimal merges

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 404 
 1 file changed, 404 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 68bd86f555..e2db5d0ac1 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1904,4 +1904,408 @@ test_expect_failure '7e-check: transitive rename in 
rename/delete AND dirs in th
)
 '
 
+###
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit O. x/{a,b},   y/{c,d}
+#   Commit A. x/{a,b,e}, y/{c,d,f}
+#   Commit B. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: y/{a,b,f},   z/{c,d},   x/e
+#   Currently expected:   y/{a,b,e,f}, z/{c,d}
+#   Optimal:  y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did without, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' 
'
+   test_create_repo 8a &&
+   (
+   cd 8a &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a >x/a &&
+   echo b >x/b &&
+   echo c >y/c &&
+   echo d >y/d &&
+   git add x y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >x/e &&
+   echo f >y/f &&
+   git add x/e y/f &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv y z &&
+   git mv x y &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' 
'
+   (
+   cd 8a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
+   git rev-parse >expect \
+   O:x/aO:x/bA:x/eA:y/fO:y/cO:y/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with 
conflicting filenames
+#   Commit O. x/{a_1,b_1}, y/{a_2,b_2}
+#   Commit A. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit B. y/{a_1,b_1}, z/{a_2,b_2}
+#
+#   w/o dir-rename detection: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Currently expected:   
+#   Scary:    y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, 
e_1 vs. e_2)
+#   Optimal:  y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit A, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 save us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior f

[PATCH v7 03/31] directory rename detection: testcases to avoid taking detection too far

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 153 
 1 file changed, 153 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b22a9052b3..8049ed5fc9 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -582,4 +582,157 @@ test_expect_success '2b-check: Directory split into two 
on one side, with equal
 #   messages are handled correctly.
 ###
 
+
+###
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: z/{b,c,d} (no change)
+#   Commit B: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on 
other side' '
+   test_create_repo 3a &&
+   (
+   cd 3a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   git commit --allow-empty -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on 
other side' '
+   (
+   cd 3a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/bO:z/cO:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}, x/d
+#   Commit B: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that since z/d is already involved as
+# a source in a file rename on the same side of history, that we don't
+# get it involved in directory rename detection.  If it were, we might
+# end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+# rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on 
current side' '
+   test_create_repo 3b &&
+   (
+   cd 3b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   git mv z/d w/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on 
current side' '
+   (
+   cd 3b &&
+
+   git 

[PATCH v7 29/31] directory rename detection: new testcases showcasing a pair of bugs

2018-01-30 Thread Elijah Newren
Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.

Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug.  Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.

Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 296 
 1 file changed, 296 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index a34c57d986..3d292f0c5f 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -159,6 +159,7 @@ test_expect_success '1b-check: Merge a directory with 
another' '
 # Testcase 1c, Transitive renaming
 #   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
 #   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   (Related to testcase 12b -- joint-transitivity?)
 #   Commit O: z/{b,c},   x/d
 #   Commit A: y/{b,c},   x/d
 #   Commit B: z/{b,c,d}
@@ -2863,6 +2864,68 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
)
 '
 
+# Testcase 9h, Avoid implicit rename if involved as source on other side
+#   (Extremely closely related to testcase 3a)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: z/{b,c,d_2}
+#   Commit B: y/{b,c}, x/d_1
+#   Expected: y/{b,c}, x/d_2
+#   NOTE: If we applied the z/ -> y/ rename to z/d, then we'd end up with
+# a rename/rename(1to2) conflict (z/d -> y/d vs. x/d)
+test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
+   test_create_repo 9h &&
+   (
+   cd 9h &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   printf "1\n2\n3\n4\n5\n6\n7\n8\nd\n" >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   echo more >>z/d &&
+   git add z/d &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+   (
+   cd 9h &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/bO:z/cA:z/d &&
+   test_cmp expect actual
+   )
+'
+
 ###
 # Rules suggested by section 9:
 #
@@ -3696,4 +3759,237 @@ test_expect_success '11f-check: Avoid deleting 
not-uptodate with dir rename/rena
)
 '
 
+###
+# SECTION 12: Everything else
+#
+# Tests suggested by others.  Tests added after implementation completed
+# and submitted.  Grab bag.
+###
+
+# Testcase 12a, Moving one directory hierarchy into another
+#   (Related to testcase 9a)
+#   Commit O: node1/{leaf1,leaf2}, node2/{leaf3,leaf4}
+#   Commit A: node1/{leaf1,leaf2,node2/{leaf3,leaf4}}
+#   Commit B: node1/{leaf1,leaf2,leaf5}, node2/{leaf3,leaf4,leaf6}
+#   Expected: node1/{leaf1,leaf2,leaf5,node2/{leaf3,leaf4,leaf6}}
+
+test_expect_success '12a-setup: Moving one directory hierarchy into another' '
+   test_create_repo 12a &&
+   (
+   cd 12a &&
+
+   mkdir -p node1 node2 &&
+   echo leaf1 >node1/leaf1 &&
+   echo leaf2 >node1/leaf2 &&
+   echo leaf3 >node2/leaf3 &&
+   echo leaf4 >node2/leaf4 &&
+   git add node1 node2 &&
+   test_tick &&
+   

[PATCH v7 01/31] directory rename detection: basic testcases

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 442 
 1 file changed, 442 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 00..d045f0e31e
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,442 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#  A
+#  o
+# / \
+#  O o   ?
+# \ /
+#  o
+#  B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Basic cases we should be able to handle
+###
+
+# Testcase 1a, Basic directory rename.
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+   test_create_repo 1a &&
+   (
+   cd 1a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   mkdir z/e &&
+       echo f >z/e/f &&
+   git add z/d z/e/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+   (
+   cd 1a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
+   git rev-parse >expect \
+   O:z/bO:z/cB:z/dB:z/e/f &&
+   test_cmp expect actual &&
+
+   git hash-object y/d >actual &&
+   git rev-parse B:z/d >expect &&
+   test_cmp expect actual &&
+
+   test_must_fail git rev-parse HEAD:z/d &&
+   test_must_fail git rev-parse HEAD:z/e/f &&
+   test_path_is_missing z/d &&
+   test_path_is_missing z/e/f
+   )
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e}, y/d
+#   Commit B: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+   test_create_repo 1b &&
+   (
+   cd 1b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y &&
+   git mv z/c y &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1b-check: Mer

[PATCH v7 02/31] directory rename detection: directory splitting testcases

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 143 
 1 file changed, 143 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index d045f0e31e..b22a9052b3 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -439,4 +439,147 @@ test_expect_failure '1f-check: Split a directory into two 
other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###
 
+
+###
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2a &&
+   (
+   cd 2a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*directory rename split" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:z/d &&
+   git rev-parse >expect \
+O:z/b  O:z/c  B:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2b &&
+   (
+   cd 2b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir x &&
+   echo d >x/d &&
+   git add x/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2b &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 >out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:x/d &&
+   git rev-parse >expect \
+O:z/b  O:z/c  B:x/d &&
+   test_cmp expect actual &&
+   test_i18ngrep ! "CONFLICT.*directory rename split" out
+   )
+'
+
+###
+# Rules suggested by section 2:
+#
+#   

[PATCH v7 04/31] directory rename detection: partially renamed directory testcase/discussion

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 107 
 1 file changed, 107 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 8049ed5fc9..f0213f2bbd 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -735,4 +735,111 @@ test_expect_success '3b-check: Avoid implicit rename if 
involved as source on cu
 #   of a rename on either side of a merge.
 ###
 
+
+###
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commmit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, that some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# This section contains a test for this partially-renamed-directory case.
+###
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, z/e
+#   Commit B: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still 
present' '
+   test_create_repo 4a &&
+   (
+   cd 4a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   echo e >z/e &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo f >z/f &&
+   git add z/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '4a-check: Directory split, with original directory still 
present' '
+   (
+   cd 4a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 5 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+       HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/e HEAD:z/f &&
+   git rev-parse >expect \
+   O:z/bO:z/cO:z/dO:z/eB:z/f &&
+   test_cmp expect actual
+   )
+'
+
+#######
+# Rules suggested by section 4:
+#
+#   Directory-rename-detection should be turned off for any directories (as
+#   a source for renames) that exist on both sides of the merge.  (The "as
+#   a source for renames" clarification is due to cases like 1c where
+#   the target directory exists on both sides and we do want the rename
+#   detection.)  But, sadly, see testcase 8b.
+###
+
 test_done
-- 
2.16.1.106.gf69932adfe



[PATCH v7 00/31] Add directory rename detection to git

2018-01-30 Thread Elijah Newren
This patchset introduces directory rename detection to merge-recursive.  See
  https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/
for the first series (including design considerations, etc.), and follow-up
series can be found at
  https://public-inbox.org/git/20171120220209.15111-1-new...@gmail.com/
  https://public-inbox.org/git/20171121080059.32304-1-new...@gmail.com/
  https://public-inbox.org/git/20171129014237.32570-1-new...@gmail.com/
  https://public-inbox.org/git/20171228041352.27880-1-new...@gmail.com/
  https://public-inbox.org/git/20180105202711.24311-1-new...@gmail.com/

Changes since v6 (full tbdiff follows below):
  * Fix missing file argument in a testcase, pointed out by SZEDER G??bor.
  * Add whitespace in some testcases in separate git-rev-parse invocations
to make it easier to visually see what is being compared.
  * Rebased on latest origin/master (not that there were any conflicts)

Note: This series continues to depend upon en/merge-recursive-fixes, at
  least contextually.

Elijah Newren (31):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
far
  directory rename detection: partially renamed directory
testcase/discussion
  directory rename detection: files/directories in the way of some
renames
  directory rename detection: testcases checking which side did the
rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
merges
  directory rename detection: miscellaneous testcases to complete
coverage
  directory rename detection: tests for handling overwriting untracked
files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: fix leaks of allocated renames and diff_filepairs
  merge-recursive: make !o->detect_rename codepath more obvious
  merge-recursive: split out code for determining diff_filepairs
  merge-recursive: add a new hashmap for storing directory renames
  merge-recursive: make a helper function for cleanup for handle_renames
  merge-recursive: add get_directory_renames()
  merge-recursive: check for directory level conflicts
  merge-recursive: add a new hashmap for storing file collisions
  merge-recursive: add computation of collisions due to dir rename &
merging
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: when comparing files, don't include trees
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: avoid clobbering untracked files with directory
renames
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: fix remaining directory rename + dirty overwrite
cases
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: avoid spurious rename/rename conflict from dir
renames
  merge-recursive: ensure we write updates for directory-renamed file

 merge-recursive.c   | 1212 ++-
 merge-recursive.h   |   17 +
 strbuf.c|   16 +
 strbuf.h|   16 +
 t/t3501-revert-cherry-pick.sh   |2 +-
 t/t6043-merge-rename-directories.sh | 3990 +++
 t/t7607-merge-overwrite.sh  |2 +-
 unpack-trees.c  |4 +-
 unpack-trees.h  |4 +
 9 files changed, 5148 insertions(+), 115 deletions(-)
 create mode 100755 t/t6043-merge-rename-directories.sh

 1: 2fd0812b7a !  1: 5ba69c9c7b directory rename detection: basic testcases
@@ -94,7 +94,7 @@
 +  git rev-parse >actual \
 +  HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
 +  git rev-parse >expect \
-+  O:z/b O:z/c B:z/d B:z/e/f &&
++  O:z/bO:z/cB:z/dB:z/e/f &&
 +  test_cmp expect actual &&
 +
 +  git hash-object y/d >actual &&
@@ -161,7 +161,7 @@
 +  git rev-parse >actual \
 +  HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e &&
 +  git rev-parse >expect \
-+  O:z/b O:z/c O:y/d A:z/e &&
++  O:z/bO:z/cO:y/dA:z/e &&
 +  test_cmp expect actual &&
 +  test_must_fail git rev-parse HEAD:z/e
 +  )
@@ -219,7 +219,7 @@
 +  git rev-parse >actual \
 +  HEAD:y/b HEAD:y/c HEAD:y/d &&
 +  git rev-parse >expect \
-+  O:z/b O:z/c O:x/d &&
++  O:z/b 

[PATCH v7 11/31] directory rename detection: tests for handling overwriting dirty files

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 458 
 1 file changed, 458 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index aa9af49edc..fbac664408 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3238,4 +3238,462 @@ test_expect_failure '10e-check: Does git complain about 
untracked file that is n
)
 '
 
+###
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit O: z/{a,b_v1},
+#   Commit A: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit B: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+# z/a, staged version of z/c has sha1sum matching B:z/b_v2,
+# z/c~HEAD with contents of B:z/b_v2,
+# z/c with uncommitted mods on top of A:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple 
rename' '
+   test_create_repo 11a &&
+   (
+   cd 11a &&
+
+   mkdir z &&
+   echo a >z/a &&
+   test_seq 1 10 >z/b &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z/b z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo 11 >>z/b &&
+   git add z/b &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple 
rename' '
+   (
+   cd 11a &&
+
+   git checkout A^0 &&
+   echo stuff >>z/c &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+   test_seq 1 10 >expected &&
+   echo stuff >>expected &&
+   test_cmp expected z/c &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   :0:z/a :2:z/c &&
+   git rev-parse >expect \
+O:z/a  B:z/b &&
+   test_cmp expect actual &&
+
+   git hash-object z/c~HEAD >actual &&
+   git rev-parse B:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit O: z/a, x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,   and z/c_v1 has uncommitted mods
+#   Commit B: y/a, x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+# ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory 
rename' '
+   test_create_repo 11b &&
+   (
+   cd 11b &&
+
+   mkdir z x &&
+   echo a >z/a &&
+   echo b >x/b &&
+   test_seq 1 10 >x/c &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+ 

[PATCH v7 10/31] directory rename detection: tests for handling overwriting untracked files

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 367 
 1 file changed, 367 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index b730256653..aa9af49edc 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2871,4 +2871,371 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
 #   side of history for any implicit directory renames.
 ###
 
+###
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit O: z/{b,c_1}
+#   Commit A: z/b + untracked z/c + untracked z/d
+#   Commit B: z/{b,d_1}
+#   Expected: Aborted Merge +
+#   ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' 
'
+   test_create_repo 10a &&
+   (
+   cd 10a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' 
'
+   (
+   cd 10a &&
+
+   git checkout A^0 &&
+   echo very >z/c &&
+   echo important >z/d &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "The following untracked working tree files would 
be overwritten by merge" err &&
+
+   git ls-files -s >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   echo very >expect &&
+   test_cmp expect z/c &&
+
+   echo important >expect &&
+   test_cmp expect z/d &&
+
+   git rev-parse HEAD:z/b >actual &&
+   git rev-parse O:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit O: z/{b,c_1}
+#   Commit A: y/b + untracked y/{c,d,e}
+#   Commit B: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+# z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#   ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+   test_create_repo 10b &&
+   (
+   cd 10b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git mv z/ y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+   (
+   cd 10b &&
+
+   git checkout A^0 &&
+   echo very >y/c &&
+   echo important >y/d &&
+   echo contents >y/e &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "CONFLICT (rename/delete).*Version B\^0 of y/d 
left in tree at y/d~B\^0" out &&
+   test_i18ngrep "Error: Refusing to lose untracked file at y/e; 
writing to y/e~B\^0 instead" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+

[PATCH v7 06/31] directory rename detection: testcases checking which side did the rename

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 336 
 1 file changed, 336 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index ac9c3e9974..fbeb8f4316 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1172,4 +1172,340 @@ test_expect_failure '5d-check: Directory/file/file 
conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###
 
+
+###
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###
+
+# Testcase 6a, Tricky rename/delete
+#   Commit O: z/{b,c,d}
+#   Commit A: z/b
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+# them under y/ doesn't accidentally catch z/d and make it look like
+# it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+   test_create_repo 6a &&
+   (
+   cd 6a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git rm z/d &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+   (
+   cd 6a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :3:y/c &&
+   git rev-parse >expect \
+O:z/b  O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+# but B did that rename and still decided to put the file into z/,
+# so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+   test_create_repo 6b &&
+   (
+   cd 6b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z y &&
+   mkdir z &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m &quo

[PATCH v7 05/31] directory rename detection: files/directories in the way of some renames

2018-01-30 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 330 
 1 file changed, 330 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index f0213f2bbd..ac9c3e9974 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -842,4 +842,334 @@ test_expect_success '4a-check: Directory split, with 
original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###
 
+
+###
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit B: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+# become y/f, we did not apply this for z/e_1 because that would
+# give us an add/add conflict for y/e_1 vs y/e_2.  This problem with
+# this add/add, is that both versions of y/e are from the same side
+# of history, giving us no way to represent this conflict in the
+# index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to 
original and target' '
+   test_create_repo 5a &&
+   (
+   cd 5a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e1 >z/e &&
+   echo f >z/f &&
+   echo e2 >y/e &&
+   git add z/e z/f y/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to 
original and target' '
+   (
+   cd 5a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*implicit dir rename" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
+   git rev-parse >expect \
+O:z/b  O:z/c  O:y/d  A:y/e  A:z/e  A:z/f &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: y/{b,c,d_2}
+#   Commit B: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit B were to be involved in dir rename detection, as
+# we normaly would since z/ is being renamed to y/, then this would be
+# a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+# conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+# representable in the index, so the existence of y/d_3 needs to
+# cause us to bail on directory rename detection for that path, falling
+# back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add 
conflict' '
+   test_create_repo 5b &&
+   (
+   cd 5b &&
+
+   mkdir z &&
+   echo b >z/b &&

Re: [PATCH v7 00/31] Add directory rename detection to git

2018-01-30 Thread Elijah Newren
On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <new...@gmail.com> wrote:
> This patchset introduces directory rename detection to merge-recursive.  See

And a meta question: Should I be trying to submit this feature some other way?

I've really appreciated all the reviews[1], and the testcases are
undoubtedly better because of them.  The code too, but the reviews
have focused almost exclusively on the testcases so far.  Is
merge-recursive just too hairy and my changes too large for folks to
stomach tackling?  Is there something different I could do?

I'm curious for any thoughts on the matter.

Thanks!
Elijah

[1] Especially Stefan who looked through all the patches, but also for
spot reviews from Junio, Johannes Schindelin, Johanness Sixt, Adam
Dinwoodie, and SZEDER Gábor.  Also, a few interested comments from
Philip Oakley and Jacob Keller.


Re: [PATCHv6 03/31] directory rename detection: testcases to avoid taking detection too far

2018-01-26 Thread Elijah Newren
On Fri, Jan 26, 2018 at 3:37 AM, SZEDER Gábor  wrote:
> On Fri, Jan 5, 2018 at 9:26 PM, Elijah Newren  wrote:
>> Signed-off-by: Elijah Newren 
>> ---
>>  t/t6043-merge-rename-directories.sh | 153 
>> 
>>  1 file changed, 153 insertions(+)
>>
>> diff --git a/t/t6043-merge-rename-directories.sh 
>> b/t/t6043-merge-rename-directories.sh
>> index f36493289..239819f2d 100755
>> --- a/t/t6043-merge-rename-directories.sh
>> +++ b/t/t6043-merge-rename-directories.sh
>
>> +test_expect_success '3b-check: Avoid implicit rename if involved as source 
>> on current side' '
>> +   (
>> +   cd 3b &&
>> +
>> +   git checkout A^0 &&
>> +
>> +   test_must_fail git merge -s recursive B^0 >out &&
>> +   test_i18ngrep CONFLICT.*rename/rename.*z/d.*x/d.*w/d out &&
>> +   test_i18ngrep ! CONFLICT.*rename/rename.*y/d &&
>
> The filename argument is missing in the above line.

Whoops.  Good eyes; thanks for pointing it out.  I'll fix it up.


Re: [PATCHv6 03/31] directory rename detection: testcases to avoid taking detection too far

2018-01-26 Thread SZEDER Gábor
On Fri, Jan 5, 2018 at 9:26 PM, Elijah Newren <new...@gmail.com> wrote:
> Signed-off-by: Elijah Newren <new...@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 153 
> 
>  1 file changed, 153 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh 
> b/t/t6043-merge-rename-directories.sh
> index f36493289..239819f2d 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh

> +test_expect_success '3b-check: Avoid implicit rename if involved as source 
> on current side' '
> +   (
> +   cd 3b &&
> +
> +   git checkout A^0 &&
> +
> +   test_must_fail git merge -s recursive B^0 >out &&
> +   test_i18ngrep CONFLICT.*rename/rename.*z/d.*x/d.*w/d out &&
> +   test_i18ngrep ! CONFLICT.*rename/rename.*y/d &&

The filename argument is missing in the above line.

Why does the test still succeed, then?  Because 'test_i18ngrep' is
expected not to find the given pattern, and of course it won't find the
pattern if its input is empty as it comes from /dev/null instead of the
appropriate file.


> +
> +   git ls-files -s >out &&
> +   test_line_count = 5 out &&
> +   git ls-files -u >out &&
> +   test_line_count = 3 out &&
> +   git ls-files -o >out &&
> +   test_line_count = 1 out &&
> +
> +   git rev-parse >actual \
> +   :0:y/b :0:y/c :1:z/d :2:x/d :3:w/d &&
> +   git rev-parse >expect \
> +   O:z/b O:z/c O:z/d O:z/d O:z/d &&
> +   test_cmp expect actual &&
> +
> +   test_path_is_missing z/d &&
> +   git hash-object >actual \
> +   x/d w/d &&
> +       git rev-parse >expect \
> +   O:z/d O:z/d &&
> +   test_cmp expect actual
> +   )
> +'
> +
> +###
> +# Rules suggested by section 3:
> +#
> +#   Avoid directory-rename-detection for a path, if that path is the source
> +#   of a rename on either side of a merge.
> +###
> +
>  test_done
> --
> 2.14.2
>


Re: [PATCH v5 13/34] directory rename detection: tests for handling overwriting untracked files

2018-01-05 Thread Elijah Newren
On Fri, Jan 5, 2018 at 12:31 PM, Thomas Gummerer  wrote:
> On 01/04, Elijah Newren wrote:
>> On Wed, Jan 3, 2018 at 5:52 PM, SZEDER Gábor  wrote:
>>
>> >> + test $(git rev-parse :0:y/b) = $(git rev-parse O:z/b) &&
>> >
>> > There is a test helper for that :)
>> >
>> >   test_cmp_rev :0:y/b O:z/b
>> >
>> > Note, that this is not only a matter of useful output on failure, but
>> > also that of correctness and robustness.
>>
>>
>> Cool, good to know.  Is there any reason test_cmp_rev is not
>> documented in t/README?
>
> It is documented since a few weeks, I added the docs in 5a0526264b
> ("t/README: document test_cmp_rev", 2017-11-26).  Though as people
> mention they just look in 't/test-lib-functions.sh' anyway, maybe the
> docs should live there instead of in the README in the first place?
>

Nice!  If we don't end up moving them, it might be nice to add a note
to point people towards t/test-lib-functions.sh from t/README,
mentioning that it might have other functions available as well.


[PATCHv6 03/31] directory rename detection: testcases to avoid taking detection too far

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 153 
 1 file changed, 153 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index f36493289..239819f2d 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -582,4 +582,157 @@ test_expect_success '2b-check: Directory split into two 
on one side, with equal
 #   messages are handled correctly.
 ###
 
+
+###
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: z/{b,c,d} (no change)
+#   Commit B: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on 
other side' '
+   test_create_repo 3a &&
+   (
+   cd 3a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   git commit --allow-empty -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on 
other side' '
+   (
+   cd 3a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/b O:z/c O:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}, x/d
+#   Commit B: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that since z/d is already involved as
+# a source in a file rename on the same side of history, that we don't
+# get it involved in directory rename detection.  If it were, we might
+# end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+# rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on 
current side' '
+   test_create_repo 3b &&
+   (
+   cd 3b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   git mv z/d w/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on 
current side' '
+   (
+   cd 3b &&
+
+   git checkou

[PATCHv6 02/31] directory rename detection: directory splitting testcases

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 143 
 1 file changed, 143 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index c68ae87f1..f36493289 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -439,4 +439,147 @@ test_expect_failure '1f-check: Split a directory into two 
other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###
 
+
+###
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2a &&
+   (
+   cd 2a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*directory rename split" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:z/d &&
+   git rev-parse >expect \
+   O:z/b O:z/c B:z/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of 
paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with 
equal numbers of paths' '
+   test_create_repo 2b &&
+   (
+   cd 2b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   mkdir w &&
+   git mv z/b y/ &&
+   git mv z/c w/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir x &&
+   echo d >x/d &&
+   git add x/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with 
equal numbers of paths' '
+   (
+   cd 2b &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 >out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:w/c :0:x/d &&
+   git rev-parse >expect \
+   O:z/b O:z/c B:x/d &&
+   test_cmp expect actual &&
+   test_i18ngrep ! "CONFLICT.*directory rename split" out
+   )
+'
+
+###
+# Rules suggested by section 2:
+#
+#   None; the 

[PATCHv6 11/31] directory rename detection: tests for handling overwriting dirty files

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 458 
 1 file changed, 458 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 5a2a4449d..d69d26951 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3222,4 +3222,462 @@ test_expect_failure '10e-check: Does git complain about 
untracked file that is n
)
 '
 
+###
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit O: z/{a,b_v1},
+#   Commit A: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit B: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+# z/a, staged version of z/c has sha1sum matching B:z/b_v2,
+# z/c~HEAD with contents of B:z/b_v2,
+# z/c with uncommitted mods on top of A:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple 
rename' '
+   test_create_repo 11a &&
+   (
+   cd 11a &&
+
+   mkdir z &&
+   echo a >z/a &&
+   test_seq 1 10 >z/b &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z/b z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo 11 >>z/b &&
+   git add z/b &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple 
rename' '
+   (
+   cd 11a &&
+
+   git checkout A^0 &&
+   echo stuff >>z/c &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+   test_seq 1 10 >expected &&
+   echo stuff >>expected &&
+   test_cmp expected z/c &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   :0:z/a :2:z/c &&
+   git rev-parse >expect \
+   O:z/a B:z/b &&
+   test_cmp expect actual &&
+
+   git hash-object z/c~HEAD >actual &&
+   git rev-parse B:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit O: z/a, x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,   and z/c_v1 has uncommitted mods
+#   Commit B: y/a, x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+# ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory 
rename' '
+   test_create_repo 11b &&
+   (
+   cd 11b &&
+
+   mkdir z x &&
+   echo a >z/a &&
+   echo b >x/b &&
+   test_seq 1 10 >x/c &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   

[PATCHv6 04/31] directory rename detection: partially renamed directory testcase/discussion

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 107 
 1 file changed, 107 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 239819f2d..c61ecb9b7 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -735,4 +735,111 @@ test_expect_success '3b-check: Avoid implicit rename if 
involved as source on cu
 #   of a rename on either side of a merge.
 ###
 
+
+###
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commmit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, that some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# This section contains a test for this partially-renamed-directory case.
+###
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, z/e
+#   Commit B: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still 
present' '
+   test_create_repo 4a &&
+   (
+   cd 4a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   echo e >z/e &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo f >z/f &&
+   git add z/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '4a-check: Directory split, with original directory still 
present' '
+   (
+   cd 4a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 5 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+       HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/e HEAD:z/f &&
+   git rev-parse >expect \
+   O:z/b O:z/c O:z/d O:z/e B:z/f &&
+   test_cmp expect actual
+   )
+'
+
+#######
+# Rules suggested by section 4:
+#
+#   Directory-rename-detection should be turned off for any directories (as
+#   a source for renames) that exist on both sides of the merge.  (The "as
+#   a source for renames" clarification is due to cases like 1c where
+#   the target directory exists on both sides and we do want the rename
+#   detection.)  But, sadly, see testcase 8b.
+###
+
 test_done
-- 
2.14.2



[PATCHv6 09/31] directory rename detection: miscellaneous testcases to complete coverage

2018-01-05 Thread Elijah Newren
I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 549 +++-
 1 file changed, 548 insertions(+), 1 deletion(-)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 470c9a79f..717cf1483 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -305,6 +305,7 @@ test_expect_failure '1d-check: Directory renames cause a 
rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit O: z/{oldb,oldc}
 #   Commit A: y/{newb,newc}
 #   Commit B: z/{oldb,oldc,d}
@@ -593,7 +594,7 @@ test_expect_success '2b-check: Directory split into two on 
one side, with equal
 ###
 
 # Testcase 3a, Avoid implicit rename if involved as source on other side
-#   (Related to testcases 1c and 1f)
+#   (Related to testcases 1c, 1f, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: z/{b,c,d} (no change)
 #   Commit B: y/{b,c}, x/d
@@ -2308,4 +2309,550 @@ test_expect_failure '8e-check: Both sides rename, one 
side adds to original dire
)
 '
 
+###
+# SECTION 9: Other testcases
+#
+# This section consists of miscellaneous testcases I thought of during
+# the implementation which round out the testing.
+###
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d/{e,f,g}}
+#   Commit A: y/{b,c}, x/w/{e,f,g}
+#   Commit B: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+# is split into multiple other directories, we determine by the weight
+# of which one had the most paths going to it.  A naive implementation
+# of that could take the new file in commit B at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed 
directory' '
+   test_create_repo 9a &&
+   (
+   cd 9a &&
+
+   mkdir -p z/d &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo e >z/d/e &&
+   echo f >z/d/f &&
+   echo g >z/d/g &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   mkdir x &&
+   git mv z/d x/w &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo h >z/d/h &&
+   echo i >z/i &&
+   git add z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed 
directory' '
+   (
+   cd 9a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/i &&
+   git rev-parse >expect \
+   O:z/b O:z/c B:z/i &&
+   test_cmp expect actual &&
+
+   git rev-parse >actual \
+   HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
+   git rev-parse >expect \
+   O:z/d/e O:z/d/f O:z/d/g B:z/d/h &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c},   x/d_1
+#   Commit A: y/{b,c},   x/d_2
+#   Commit B: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+   test_create_repo 9b &&
+   (
+   cd 9b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir x &&
+   test_seq 1 10 >x/d &&
+   git add z x &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   

[PATCHv6 06/31] directory rename detection: testcases checking which side did the rename

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 336 
 1 file changed, 336 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index f9d75c83c..dc3fc66e5 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1172,4 +1172,340 @@ test_expect_failure '5d-check: Directory/file/file 
conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###
 
+
+###
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###
+
+# Testcase 6a, Tricky rename/delete
+#   Commit O: z/{b,c,d}
+#   Commit A: z/b
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+# them under y/ doesn't accidentally catch z/d and make it look like
+# it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+   test_create_repo 6a &&
+   (
+   cd 6a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git rm z/d &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+   (
+   cd 6a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 2 out &&
+   git ls-files -u >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :3:y/c &&
+   git rev-parse >expect \
+   O:z/b O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+# but B did that rename and still decided to put the file into z/,
+# so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+   test_create_repo 6b &&
+   (
+   cd 6b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z y &&
+   mkdir z &&
+   echo d >z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m &quo

[PATCHv6 10/31] directory rename detection: tests for handling overwriting untracked files

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 367 
 1 file changed, 367 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 717cf1483..5a2a4449d 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2855,4 +2855,371 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
 #   side of history for any implicit directory renames.
 ###
 
+###
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit O: z/{b,c_1}
+#   Commit A: z/b + untracked z/c + untracked z/d
+#   Commit B: z/{b,d_1}
+#   Expected: Aborted Merge +
+#   ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' 
'
+   test_create_repo 10a &&
+   (
+   cd 10a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' 
'
+   (
+   cd 10a &&
+
+   git checkout A^0 &&
+   echo very >z/c &&
+   echo important >z/d &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "The following untracked working tree files would 
be overwritten by merge" err &&
+
+   git ls-files -s >out &&
+   test_line_count = 1 out &&
+   git ls-files -o >out &&
+   test_line_count = 4 out &&
+
+   echo very >expect &&
+   test_cmp expect z/c &&
+
+   echo important >expect &&
+   test_cmp expect z/d &&
+
+   git rev-parse HEAD:z/b >actual &&
+   git rev-parse O:z/b >expect &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit O: z/{b,c_1}
+#   Commit A: y/b + untracked y/{c,d,e}
+#   Commit B: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+# z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#   ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+   test_create_repo 10b &&
+   (
+   cd 10b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git rm z/c &&
+   git mv z/ y/ &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/c z/d &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+   (
+   cd 10b &&
+
+   git checkout A^0 &&
+   echo very >y/c &&
+   echo important >y/d &&
+   echo contents >y/e &&
+
+   test_must_fail git merge -s recursive B^0 >out 2>err &&
+   test_i18ngrep "CONFLICT (rename/delete).*Version B\^0 of y/d 
left in tree at y/d~B\^0" out &&
+   test_i18ngrep "Error: Refusing to lose untracked file at y/e; 
writing to y/e~B\^0 instead" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+   git ls-files -u >out &&
+  

Re: [PATCH v5 13/34] directory rename detection: tests for handling overwriting untracked files

2018-01-05 Thread Thomas Gummerer
On 01/04, Elijah Newren wrote:
> On Wed, Jan 3, 2018 at 5:52 PM, SZEDER Gábor  wrote:
> 
> >> + test $(git rev-parse :0:y/b) = $(git rev-parse O:z/b) &&
> >
> > There is a test helper for that :)
> >
> >   test_cmp_rev :0:y/b O:z/b
> >
> > Note, that this is not only a matter of useful output on failure, but
> > also that of correctness and robustness.
> 
> 
> Cool, good to know.  Is there any reason test_cmp_rev is not
> documented in t/README?

It is documented since a few weeks, I added the docs in 5a0526264b
("t/README: document test_cmp_rev", 2017-11-26).  Though as people
mention they just look in 't/test-lib-functions.sh' anyway, maybe the
docs should live there instead of in the README in the first place?

> [...]


[PATCHv6 29/31] directory rename detection: new testcases showcasing a pair of bugs

2018-01-05 Thread Elijah Newren
Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.

Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug.  Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.

Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 288 
 1 file changed, 288 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 003a65b1b..c684f34b6 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -159,6 +159,7 @@ test_expect_success '1b-check: Merge a directory with 
another' '
 # Testcase 1c, Transitive renaming
 #   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
 #   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   (Related to testcase 12b -- joint-transitivity?)
 #   Commit O: z/{b,c},   x/d
 #   Commit A: y/{b,c},   x/d
 #   Commit B: z/{b,c,d}
@@ -2847,6 +2848,68 @@ test_expect_failure '9g-check: Renamed directory that 
only contained immediate s
)
 '
 
+# Testcase 9h, Avoid implicit rename if involved as source on other side
+#   (Extremely closely related to testcase 3a)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: z/{b,c,d_2}
+#   Commit B: y/{b,c}, x/d_1
+#   Expected: y/{b,c}, x/d_2
+#   NOTE: If we applied the z/ -> y/ rename to z/d, then we'd end up with
+# a rename/rename(1to2) conflict (z/d -> y/d vs. x/d)
+test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
+   test_create_repo 9h &&
+   (
+   cd 9h &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   printf "1\n2\n3\n4\n5\n6\n7\n8\nd\n" >z/d &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   test_tick &&
+   echo more >>z/d &&
+   git add z/d &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir y &&
+   mkdir x &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   git mv z/d x/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+   (
+   cd 9h &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 3 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:x/d &&
+   git rev-parse >expect \
+   O:z/b O:z/c A:z/d &&
+   test_cmp expect actual
+   )
+'
+
 ###
 # Rules suggested by section 9:
 #
@@ -3680,4 +3743,229 @@ test_expect_success '11f-check: Avoid deleting 
not-uptodate with dir rename/rena
)
 '
 
+###
+# SECTION 12: Everything else
+#
+# Tests suggested by others.  Tests added after implementation completed
+# and submitted.  Grab bag.
+###
+
+# Testcase 12a, Moving one directory hierarchy into another
+#   (Related to testcase 9a)
+#   Commit O: node1/{leaf1,leaf2}, node2/{leaf3,leaf4}
+#   Commit A: node1/{leaf1,leaf2,node2/{leaf3,leaf4}}
+#   Commit B: node1/{leaf1,leaf2,leaf5}, node2/{leaf3,leaf4,leaf6}
+#   Expected: node1/{leaf1,leaf2,leaf5,node2/{leaf3,leaf4,leaf6}}
+
+test_expect_success '12a-setup: Moving one directory hierarchy into another' '
+   test_create_repo 12a &&
+   (
+   cd 12a &&
+
+   mkdir -p node1 node2 &&
+   echo leaf1 >node1/leaf1 &&
+   echo leaf2 >node1/leaf2 &&
+   echo leaf3 >node2/leaf3 &&
+   echo leaf4 >node2/leaf4 &&
+   git add node1 node2 &&
+   test_tick &&
+   

[PATCHv6 01/31] directory rename detection: basic testcases

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 442 
 1 file changed, 442 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 0..c68ae87f1
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,442 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#  A
+#  o
+# / \
+#  O o   ?
+# \ /
+#  o
+#  B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#z/{b,c}   means  files z/b and z/c both exist
+#x/d_1 means  file x/d exists with content d1.  (Purpose of the
+# underscore notation is to differentiate different
+# files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###
+# SECTION 1: Basic cases we should be able to handle
+###
+
+# Testcase 1a, Basic directory rename.
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+   test_create_repo 1a &&
+   (
+   cd 1a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   echo d >z/d &&
+   mkdir z/e &&
+       echo f >z/e/f &&
+   git add z/d z/e/f &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+   (
+   cd 1a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 4 out &&
+
+   git rev-parse >actual \
+   HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
+   git rev-parse >expect \
+   O:z/b O:z/c B:z/d B:z/e/f &&
+   test_cmp expect actual &&
+
+   git hash-object y/d >actual &&
+   git rev-parse B:z/d >expect &&
+   test_cmp expect actual &&
+
+   test_must_fail git rev-parse HEAD:z/d &&
+   test_must_fail git rev-parse HEAD:z/e/f &&
+   test_path_is_missing z/d &&
+   test_path_is_missing z/e/f
+   )
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e}, y/d
+#   Commit B: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+   test_create_repo 1b &&
+   (
+   cd 1b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >z/e &&
+   git add z/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y &&
+   git mv z/c y &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '1b-check: Mer

[PATCHv6 08/31] directory rename detection: testcases exploring possibly suboptimal merges

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 404 
 1 file changed, 404 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index 38ca791e9..470c9a79f 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1904,4 +1904,408 @@ test_expect_failure '7e-check: transitive rename in 
rename/delete AND dirs in th
)
 '
 
+###
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit O. x/{a,b},   y/{c,d}
+#   Commit A. x/{a,b,e}, y/{c,d,f}
+#   Commit B. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: y/{a,b,f},   z/{c,d},   x/e
+#   Currently expected:   y/{a,b,e,f}, z/{c,d}
+#   Optimal:  y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did without, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' 
'
+   test_create_repo 8a &&
+   (
+   cd 8a &&
+
+   mkdir x &&
+   mkdir y &&
+   echo a >x/a &&
+   echo b >x/b &&
+   echo c >y/c &&
+   echo d >y/d &&
+   git add x y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e >x/e &&
+   echo f >y/f &&
+   git add x/e y/f &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv y z &&
+   git mv x y &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' 
'
+   (
+   cd 8a &&
+
+   git checkout A^0 &&
+
+   git merge -s recursive B^0 &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
+   git rev-parse >expect \
+   O:x/a O:x/b A:x/e A:y/f O:y/c O:y/d &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with 
conflicting filenames
+#   Commit O. x/{a_1,b_1}, y/{a_2,b_2}
+#   Commit A. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit B. y/{a_1,b_1}, z/{a_2,b_2}
+#
+#   w/o dir-rename detection: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Currently expected:   
+#   Scary:    y/{a_1,b_1}, z/{a_2,b_2}, CONFLICT(add/add, 
e_1 vs. e_2)
+#   Optimal:  y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit A, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 save us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior f

[PATCHv6 00/31] Add directory rename detection to git

2018-01-05 Thread Elijah Newren
This patchset introduces directory rename detection to merge-recursive.  See
  https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/
for the first series (including design considerations, etc.), and follow-up
series can be found at
  https://public-inbox.org/git/20171120220209.15111-1-new...@gmail.com/
  https://public-inbox.org/git/20171121080059.32304-1-new...@gmail.com/
  https://public-inbox.org/git/20171129014237.32570-1-new...@gmail.com/
  https://public-inbox.org/git/20171228041352.27880-1-new...@gmail.com/
  
Changes since v5 (full tbdiff follows below):
  * Split the first three (independent) patches off into a separate series.[1]
Semantically, they are separate and could go in either order, but built
with this one depending on the other series to avoid context region
conflicts.

[1] https://public-inbox.org/git/20180105202001.24218-1-new...@gmail.com/

  * Testsuite:

* Replaced plain 'test' with functions that'd provide more
  feedback on failure; e.g. test_line_count, test_path_is_missing,
  test_path_is_file (Thanks to SZEDER Gábor for the suggestions)
  
* Backslash escaped '^' character when found in middle of string
  being passed to test_i18ngrep (Thanks to Johannes Sixt), and
  avoided some shell globbing that somehow tripped up some old
  windows build

Elijah Newren (31):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
far
  directory rename detection: partially renamed directory
testcase/discussion
  directory rename detection: files/directories in the way of some
renames
  directory rename detection: testcases checking which side did the
rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
merges
  directory rename detection: miscellaneous testcases to complete
coverage
  directory rename detection: tests for handling overwriting untracked
files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: fix leaks of allocated renames and diff_filepairs
  merge-recursive: make !o->detect_rename codepath more obvious
  merge-recursive: split out code for determining diff_filepairs
  merge-recursive: add a new hashmap for storing directory renames
  merge-recursive: make a helper function for cleanup for handle_renames
  merge-recursive: add get_directory_renames()
  merge-recursive: check for directory level conflicts
  merge-recursive: add a new hashmap for storing file collisions
  merge-recursive: add computation of collisions due to dir rename &
merging
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: when comparing files, don't include trees
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: avoid clobbering untracked files with directory
renames
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: fix remaining directory rename + dirty overwrite
cases
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: avoid spurious rename/rename conflict from dir
renames
  merge-recursive: ensure we write updates for directory-renamed file

 merge-recursive.c   | 1212 ++-
 merge-recursive.h   |   17 +
 strbuf.c|   16 +
 strbuf.h|   16 +
 t/t3501-revert-cherry-pick.sh   |2 +-
 t/t6043-merge-rename-directories.sh | 3966 +++
 t/t7607-merge-overwrite.sh  |2 +-
 unpack-trees.c  |4 +-
 unpack-trees.h  |4 +
 9 files changed, 5124 insertions(+), 115 deletions(-)
 create mode 100755 t/t6043-merge-rename-directories.sh

 1: d95d5fffe < --:  --- Tighten and correct a few testcases for merging 
and cherry-picking
 2: b177dccc2 < --:  --- merge-recursive: fix logic ordering issue
 3: fe3de710e < --:  --- merge-recursive: add explanation for src_entry and 
dst_entry
 4: 19b82b795 !  1: 2fd0812b7 directory rename detection: basic testcases
@@ -88,18 +88,23 @@
 +
 +  git merge -s recursive B^0 &&
 +
-+  test 4 -eq $(git ls-files -s | wc -l) &&
++  git ls-files -s >out &&
++  test_line_count = 4 out &&
 +
 +  git rev-parse >actual \
 +  HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
 +  git rev-parse >expect \
 +  O:z/b O:z/c B:z/d B:z/e/f &&
 +  test_cmp expect actual

[PATCHv6 07/31] directory rename detection: more involved edge/corner testcases

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 t/t6043-merge-rename-directories.sh | 396 
 1 file changed, 396 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index dc3fc66e5..38ca791e9 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1508,4 +1508,400 @@ test_expect_success '6e-check: Add/add from one side' '
 #   side of history is the one doing the renaming.
 ###
 
+
+###
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, testcases that are difficult to understand from this
+# section is due to difficulty in the testcase rather than the directory
+# renaming (similar to how t6042 and t6036 have difficult resolutions due
+# to the problem setup itself being complex).  And I don't think the
+# error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   test_create_repo 7a &&
+   (
+   cd 7a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   git add z &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   git mv z y &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   mkdir w &&
+   mkdir x &&
+   git mv z/b w/ &&
+   git mv z/c x/ &&
+   echo d > z/d &&
+   git add z/d &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) 
PLUS add-other-file' '
+   (
+   cd 7a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+   test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 7 out &&
+   git ls-files -u >out &&
+   test_line_count = 6 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
+   git rev-parse >expect \
+   O:z/b O:z/b O:z/b O:z/c O:z/c O:z/c B:z/d &&
+   test_cmp expect actual &&
+
+   git hash-object >actual \
+   y/b w/b y/c x/c &&
+   git rev-parse >expect \
+   O:z/b O:z/b O:z/c O:z/c &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit O: z/{b,c}, x/d_1, w/d_2
+#   Commit A: y/{b,c,d_2}, x/d_1
+#   Commit B: z/{b,c,d_1},w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive 
rename' '
+   test_create_repo 7b &&
+   (
+   cd 7b &&
+
+   mkdir z &&
+   mkdir x &&
+   mkdir w &&
+   echo b >z/b &&
+   echo c >z/c &&
+   echo d1 > x/d &&
+   echo d2 > w/d &&
+   git add z x w &&
+   test_tick &&
+   git commit 

[PATCHv6 05/31] directory rename detection: files/directories in the way of some renames

2018-01-05 Thread Elijah Newren
Signed-off-by: Elijah Newren <new...@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 330 
 1 file changed, 330 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh 
b/t/t6043-merge-rename-directories.sh
index c61ecb9b7..f9d75c83c 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -842,4 +842,334 @@ test_expect_success '4a-check: Directory split, with 
original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###
 
+
+###
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit B: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+# become y/f, we did not apply this for z/e_1 because that would
+# give us an add/add conflict for y/e_1 vs y/e_2.  This problem with
+# this add/add, is that both versions of y/e are from the same side
+# of history, giving us no way to represent this conflict in the
+# index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to 
original and target' '
+   test_create_repo 5a &&
+   (
+   cd 5a &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c >z/c &&
+   mkdir y &&
+   echo d >y/d &&
+   git add z y &&
+   test_tick &&
+   git commit -m "O" &&
+
+   git branch O &&
+   git branch A &&
+   git branch B &&
+
+   git checkout A &&
+   echo e1 >z/e &&
+   echo f >z/f &&
+   echo e2 >y/e &&
+   git add z/e z/f y/e &&
+   test_tick &&
+   git commit -m "A" &&
+
+   git checkout B &&
+   git mv z/b y/ &&
+   git mv z/c y/ &&
+   rmdir z &&
+   test_tick &&
+   git commit -m "B"
+   )
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to 
original and target' '
+   (
+   cd 5a &&
+
+   git checkout A^0 &&
+
+   test_must_fail git merge -s recursive B^0 >out &&
+   test_i18ngrep "CONFLICT.*implicit dir rename" out &&
+
+   git ls-files -s >out &&
+   test_line_count = 6 out &&
+   git ls-files -u >out &&
+   test_line_count = 0 out &&
+   git ls-files -o >out &&
+   test_line_count = 1 out &&
+
+   git rev-parse >actual \
+   :0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
+   git rev-parse >expect \
+   O:z/b O:z/c O:y/d A:y/e A:z/e A:z/f &&
+   test_cmp expect actual
+   )
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: y/{b,c,d_2}
+#   Commit B: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit B were to be involved in dir rename detection, as
+# we normaly would since z/ is being renamed to y/, then this would be
+# a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+# conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+# representable in the index, so the existence of y/d_3 needs to
+# cause us to bail on directory rename detection for that path, falling
+# back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add 
conflict' '
+   test_create_repo 5b &&
+   (
+   cd 5b &&
+
+   mkdir z &&
+   echo b >z/b &&
+   echo c &

Re: [PATCH v5 13/34] directory rename detection: tests for handling overwriting untracked files

2018-01-05 Thread SZEDER Gábor
On Thu, Jan 4, 2018 at 10:10 PM, Elijah Newren  wrote:
> On Wed, Jan 3, 2018 at 5:52 PM, SZEDER Gábor  wrote:
>
>>> + test $(git rev-parse :0:y/b) = $(git rev-parse O:z/b) &&
>>
>> There is a test helper for that :)
>>
>>   test_cmp_rev :0:y/b O:z/b
>>
>> Note, that this is not only a matter of useful output on failure, but
>> also that of correctness and robustness.
>
>
> Cool, good to know.  Is there any reason test_cmp_rev is not
> documented in t/README?

Because of mere oversight, perhaps?  (both for this and for the
'verbose' helper)
I forgot that the test helpers are documented in 't/README' (well,
apparently only some of them), I usually go straight to
't/test-lib-functions.sh' to find test helpers or to learn what they are
doing.

> I already changed these yesterday, as part of trying to avoid the use
> of plain 'test' as you suggested (I was just waiting another day or so
> for more feedback before resubmitting the series).  Since I tended to
> have several of these rev-parse comparisons in a single test, I simply
> combined them:
> git rev-parse >actual
>   :0:y/b :1:x/c :2:x/c :3:x/c
> git rev-parse >expect
>   O:z/b O:x/c A:x/c B:x/c
> test_cmp expect actual
>
> That does result in fewer rev-parse invocations, which is probably
> good, but the test_cmp_rev does seem slightly more readable.  Hmmm...

Yeah, the significantly more 'git rev-parse' invocations are the reason
why I didn't recommend 'test_cmd_rev' in those cases.  Folks running the
test suite on Windows probably won't be very happy about such a change.
Though I had a bit of a "Huh?!" moment when seeing those combined 'git
rev-parse' pairs for the first time, I don't think converting them to a
list of 'test_cmp_rev' invocations would make it notably easier to
read.
Have you considered vertically aligning the corresponding rev arguments
in those 'git rev-parse' pairs?

Furthermore, on second look, while 'test_cmp_rev' does produce output on
failure, it's not really useful for humans, being only a diff of two
files with a single object name in each.  As it is, I don't think it's
any more useful than the output of those combined 'git
rev-parse'-'test_cmp' combos would be, so no gain there, either.
OTOH, we could easily enhance 'test_cmp_rev' to include the given revs
in its error message...

I leave it up to you.


Re: [PATCH v5 13/34] directory rename detection: tests for handling overwriting untracked files

2018-01-04 Thread Elijah Newren
On Wed, Jan 3, 2018 at 5:52 PM, SZEDER Gábor  wrote:

>> + test $(git rev-parse :0:y/b) = $(git rev-parse O:z/b) &&
>
> There is a test helper for that :)
>
>   test_cmp_rev :0:y/b O:z/b
>
> Note, that this is not only a matter of useful output on failure, but
> also that of correctness and robustness.


Cool, good to know.  Is there any reason test_cmp_rev is not
documented in t/README?

I already changed these yesterday, as part of trying to avoid the use
of plain 'test' as you suggested (I was just waiting another day or so
for more feedback before resubmitting the series).  Since I tended to
have several of these rev-parse comparisons in a single test, I simply
combined them:
git rev-parse >actual
  :0:y/b :1:x/c :2:x/c :3:x/c
git rev-parse >expect
  O:z/b O:x/c A:x/c B:x/c
test_cmp expect actual

That does result in fewer rev-parse invocations, which is probably
good, but the test_cmp_rev does seem slightly more readable.  Hmmm...

> I noticed that this patch series adds several similar
>
>   test $(git hash-object this) = $(git rev-parse that)
>
> conditions.  Well, for that we don't have a test helper
> function.  Similar 'hash-object = rev-parse' comparisons are already
> present in two other test scripts, so perhaps it's worth adding a
> helper function.  Or you could perhaps
>
>   git cat-file -p that >out &&
>   test_cmp this out

Yeah, I switched these over to the same format above; e.g.
   git hash-object this >actual &&
   git rev-parse that > expect &&
   test_cmp expect actual

That does have the advantage of re-using a similar style.

>> + test "very" = "$(cat y/c)" &&
>> +
>> + test "important" = "$(cat y/d)" &&
>
> The 'verbose' helper could make conditions like these more, well,
> verbose about their failure:
>
>   verbose test "very" = "$(cat y/c)" &&

Also good to know.  Is there any reason the "verbose" helper isn't
documented in t/README?

I also switched these over yesterday to write to files and then use
test_cmp, but

>> + test "important" != "$(git rev-parse :3:y/d)" &&
>
> I'm not sure what this condition is supposed to check.

Well that's embarrassing.  That was supposed to be 'cat-file -p', not
rev-parse.  And since I'm already testing that y/d on disk does have
"important" as its contents and that :3:y/d matches O:z/c (i.e. has
contents of "c"), the check was rather unnecessary, and too easy to
false pass.  I think I should just remove it (and two others like it).


  1   2   3   >