Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-30 Thread Elijah Newren
Hi Eckhard,

On Mon, Apr 30, 2018 at 1:03 AM, Eckhard Maaß
 wrote:
> On Fri, Apr 27, 2018 at 01:23:20PM -0700, Elijah Newren wrote:
>> I doubt it has ever been discussed before this thread.  But, if you're
>> curious, I'll try to dump a few thoughts.
>
> Thank you, I try to dump some of mine, too. Maybe let me first stress
> that for me copy detection without --find-copies-harder is much more a
> "find content extracted" (like methods being factored out). In a way
> this is nearer to a rename than to a real copy.

Ooh, if you wanted to detect movement of code between files (as blame
does, I think searches for PICKAXE_BLAME_MOVE would point you in the
right direction) and then try to use that during merge to allow
applied changes to move with the code, that would be awesome.
Expensive, and might be a lot of work to wire it all up, but it'd be
very interesting.  I was only discussing DIFF_DETECT_COPY in my
previous email, which was all about duplicating entire files; that's
something I don't see utility in right now for resolving merges.


> I admit that a "real" copy would get unnoticed that way. But the
> semantics of such a copy isn't too clear for me either - did I copy the
> other part to make it independent of the other or did I just employ a
> copy and paste tactic? The former does not want the changes, the later
> does. But I am happy catering to the former here.

Right, if you have to assume that the copy was made to make the code
independent, then there's no value for merge resolution to having
detected the copy in the first place.  That has the advantage of
side-stepping the possible new edge and corner cases I mentioned in
the rest of my email, but it means we shouldn't even spend time
detecting copies -- whether whole file (via DIFF_DETECT_COPY) or
individual lines (via PICKAXE_BLAME_COPY and variants).


Elijah


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-30 Thread Eckhard Maaß
On Fri, Apr 27, 2018 at 01:23:20PM -0700, Elijah Newren wrote:
> I doubt it has ever been discussed before this thread.  But, if you're
> curious, I'll try to dump a few thoughts.

Thank you, I try to dump some of mine, too. Maybe let me first stress
that for me copy detection without --find-copies-harder is much more a
"find content extracted" (like methods being factored out). In a way
this is nearer to a rename than to a real copy.


> [...] Let's say we have branches
> A and B, and:
>A: modifies file z
>B: copies z to y
> 
> Should the modifications to z done in A propagate to both z and y?  If
> not, what good is copy detection?  If so, then there are several
> ramifications...

If one just assumes the most likely outcome is that something from z wad
factored out to y, it might just be sufficient to see whether the
modifications of the two branches apply cleanly - if A touched the parts
of B that have been factored out there would be a normal merge conflict
(where one could be nice and give a hint that some content was copied to
y on the B branch), if A did not touched the parts touched (or moved) by
B, then there is no problem. If A exactly deleted the content moved by
B, there will be no conflict - but this is seems to be strange anyway.

I admit that a "real" copy would get unnoticed that way. But the
semantics of such a copy isn't too clear for me either - did I copy the
other part to make it independent of the other or did I just employ a
copy and paste tactic? The former does not want the changes, the later
does. But I am happy catering to the former here.

To sum up:
- fail as before for conflicting merges, but give a hint that one has
  copied to quicken up resolution.


> - If B not only copied z but also first modified it, then do we have
>   potential conflicts with both z and y -- possibly the exact same
>   conflicts, making the user resolve them repeatedly?

With the above suggestion, if there are conflicts, you fail and give a
hint.

> - What if A copied z to x?  Do changes to z propagate to all three of
>   z and x and y?  Do changes to either x or y affect z?  Do they
>   affect each other?

A copy on branch to x and one another to y seems strange even if z
merges cleanly. Did both sides try to factor the same thing out to
different files? Or did they try to make something independent, but
managed to make it to different files? For this I would be inclined to
just suggest fail with a copy/copy(somewhere else). But this is a real
corner case after all. Has anyone seen just thing in practice?

> - If A deleted z, does that give us a copy/delete conflict for y?  Do
>   we also have to worry about copy/add conflicts?  copy/add/delete?
>   rename/copy (multiple variants)?  copy/copy?

We do have the modified/deleted conflict where we could hint that
content also has been copied and then not try to do more.

> - Extra degrees of freedom may mean new conflict types:
> 
>   - The extra degrees of freedom from renames introduced multiple new
> conflict types (e.g. rename/add, rename/rename(1to2),
> rename/rename(2to1)).

For renaming one side and coping the other, I would think doing the same
as above is sensible enough: if there are conflicts one can give an
additional hint of the one part having been copied, but not change the
kind of conflicts much.

> The more I think about it, the more I think that attempting to detect
> copies in a merge algorithm just doesn't make sense.  Anything I can
> think of that someone might attempt to use detected copies for would
> just surprise users in a bad way...

Hm, it didn't sound like that. Would you think that users would be
surprised by my suggestions? Or are they all too corner casey to be
worth implementing anyway?

Greetings,
Eckhard


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-27 Thread Elijah Newren
On Fri, Apr 27, 2018 at 11:37 AM, Eckhard Maaß
 wrote:
> On Fri, Apr 27, 2018 at 11:23:56AM +0900, Junio C Hamano wrote:
>> I think demoting from copy to rename-only is a good idea, at least
>> for now, because I do not believe we have figured out what we want
>> to happen when we detect copied files are involved in a merge.
>
> Does anyone know some threads concerning that topic? I tried to search
> for it, but somehow "merge copy detection" did not find me useful
> threads so far. I would be interested in that topic anyway, so I would
> like to know what the ideas are that floated so far.
>

I doubt it has ever been discussed before this thread.  But, if you're
curious, I'll try to dump a few thoughts.  Let's say we have branches
A and B, and:
   A: modifies file z
   B: copies z to y

Should the modifications to z done in A propagate to both z and y?  If
not, what good is copy detection?  If so, then there are several
ramifications...

- If B not only copied z but also first modified it, then do we have
  potential conflicts with both z and y -- possibly the exact same
  conflicts, making the user resolve them repeatedly?

- What if A copied z to x?  Do changes to z propagate to all three of
  z and x and y?  Do changes to either x or y affect z?  Do they
  affect each other?

- If A deleted z, does that give us a copy/delete conflict for y?  Do
  we also have to worry about copy/add conflicts?  copy/add/delete?
  rename/copy (multiple variants)?  copy/copy?

- Extra degrees of freedom may mean new conflict types:

  - The extra degrees of freedom from renames introduced multiple new
conflict types (e.g. rename/add, rename/rename(1to2),
rename/rename(2to1)).

  - Directory rename detection added more (rename/rename/rename(1to3),
rename/rename(Nto1), add/add/add, directory/file/file, n-fold
transitive rename, etc.), -- and forced us to specify various
rules to limit the possibility space so that conflicts could be
representable in the index and understandable by the user.

  - I suspect adding copies would add new types of conflicts we
haven't thought of yet.

The more I think about it, the more I think that attempting to detect
copies in a merge algorithm just doesn't make sense.  Anything I can
think of that someone might attempt to use detected copies for would
just surprise users in a bad way...and it'd open up a big can of edge
and corner case problems as well.


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-27 Thread Eckhard Maaß
On Fri, Apr 27, 2018 at 11:23:56AM +0900, Junio C Hamano wrote:
> I think demoting from copy to rename-only is a good idea, at least
> for now, because I do not believe we have figured out what we want
> to happen when we detect copied files are involved in a merge.

Does anyone know some threads concerning that topic? I tried to search
for it, but somehow "merge copy detection" did not find me useful
threads so far. I would be interested in that topic anyway, so I would
like to know what the ideas are that floated so far.

Greetings,
Eckhard


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-27 Thread Elijah Newren
Hi Dsco,

On Fri, Apr 27, 2018 at 12:23 AM, Johannes Schindelin
 wrote:
> Hi,

>
> Guys, you argued long and hard that one config setting (diff.renames)
> should magically imply another one (merge.renames), on the basis that they
> essentially do the same.

I apologize for any frustration; I've probably caused a good deal of
it by repeatedly catching or noticing things after one review and then
bringing them up later.  I just didn't catch it all at first.

Your restatement of the basis of the argument doesn't match my
rationale, though.  My reasoning was that (1) the configuration
options exist to allow the user to specify how much of a performance
cost they're willing to pay, (2) the two options are separate because
some users might want to configure one more loosely or tightly than
the other, but (3) many users will want to just specify once how much
they are willing to pay without being forced to repeat themselves for
different parts of git (diff, merge, possible future commands).

I'll add to my former basis by stating that I think the diffstat at
the end of the merge should be controlled by these variables, even if
that's not implemented as part of Ben's series.  And both of those
configuration options clearly feed into whether that diffstat should
be doing rename or copy or no detection.  While they could be kept
separate options, forcing us to somehow decide which one overrides
which, I think it's much more natural if we already have merge.renames
inheriting from and overriding diff.renames.

> And now you are suggesting that they *cannot* be essentially the same? Are
> you agreeing on such a violation of the Law of Least Surprise?
>
> Please, make up your mind. Inheriting merge.renames from diff.renames is
> "magic" enough to be puzzling. Introducing that auto-demoting is just
> simply creating a bad user experience. Please don't.

>From my view, resolve and octopus merges auto-demote diff.renames and
merge.renames to false.  In fact, they optimize the work involved in
that demotion by not even checking the value.  I think demoting "copy"
to "true" in the recursive merge machinery is no more surprising than
that is.  You can find more details to this argument in the portion of
my email that you responded to but snipped out.

But to add to that argument, if auto-demotion is a violation of the
Law of Least Surprise, as you claim, then naming this option
merge.renames is also a violation of the Law of Least Surprise.  You
would instead need to rename it to something like
merge.recursive.renames.  (And then when a new merge strategy comes
along that also does renames, we'll need to add a merge.ort.renames.)

> It is fine, of course, to admit at this stage that the whole magic idea of
> letting merge.renames inherit from diff.renames was only half thought
> through (we're in reviewing stage right now for the purpose of weighing
> alternative approaches and trying to come up with the best solution after
> all) and that we should go with Ben's original approach, which has the
> rather huge and dramatic advantage of being very, very clear and easy to
> understand to the user. We could even document that (and why) the 'copy'
> value in merge.renames is not allowed for the time being.

It was only half thought through, yes, at least by me.  But the more I
think through it, the more I like the inheritance personally.  I see
no problem with the demotion and think the inheritance has the
advantage of being easier to understand, because I see your proposal
as causing questions like:
  - "Why does merge.renameLimit inherit from diff.renameLimit but
merge.renames doesn't from diff.renames?"
  - "Why can't I just specify in one place that I {am, am not} willing
to pay for {full, both copy and} rename detection wherever it makes
sense?"
  - "How do I control the detection for the diffstat at the end of the merge?"


Hope that helps,
Elijah


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-27 Thread Johannes Schindelin
Hi,

On Thu, 26 Apr 2018, Elijah Newren wrote:

> On Thu, Apr 26, 2018 at 7:23 PM, Junio C Hamano  wrote:
> > Ben Peart  writes:
> >
> >> Color me puzzled. :)  The consensus was that the default value for
> >> merge.renames come from diff.renames.  diff.renames supports copy
> >> detection which means that merge.renames will inherit that value.  My
> >> assumption was that is what was intended so when I reimplemented it, I
> >> fully implemented it that way.
> >>
> >> Are you now requesting to only use diff.renames as the default if the
> >> value is true or false but not if it is copy?  What should happen if
> >> diff.renames is actually set to copy?  Should merge silently change
> >> that to true, display a warning, error out, or something else?  Do you
> >> have some other behavior for how to handle copy being inherited from
> >> diff.renames you'd like to see?
> >>
> >> Can you write the documentation that clearly explains the exact
> >> behavior you want?  That would kill two birds with one stone... :)
> >
> > I think demoting from copy to rename-only is a good idea, at least
> > for now, because I do not believe we have figured out what we want
> > to happen when we detect copied files are involved in a merge.
> >
> > But I am not sure if we even want to fail merge.renames=copy as an
> > invalid configuration.  So my gut feeling of the best solution to
> > the above is to do something like:
> >
> >  - whether the configuration comes from diff.renames or
> >merge.renames, turn *.renames=copy to true inside the merge
> >recursive machinery.
> >
> >  - document the fact in "git merge-recursive" documentation (or "git
> >merge" documentation) to say "_currently_ asking for rename
> >detection to find copies and renames will do the same
> >thing---copies are ignored", impliying "this might change in the
> >future", in the BUGS section.
> 
> Yes, I agree.

Guys, you argued long and hard that one config setting (diff.renames)
should magically imply another one (merge.renames), on the basis that they
essentially do the same.

And now you are suggesting that they *cannot* be essentially the same? Are
you agreeing on such a violation of the Law of Least Surprise?

Please, make up your mind. Inheriting merge.renames from diff.renames is
"magic" enough to be puzzling. Introducing that auto-demoting is just
simply creating a bad user experience. Please don't.

It is fine, of course, to admit at this stage that the whole magic idea of
letting merge.renames inherit from diff.renames was only half thought
through (we're in reviewing stage right now for the purpose of weighing
alternative approaches and trying to come up with the best solution after
all) and that we should go with Ben's original approach, which has the
rather huge and dramatic advantage of being very, very clear and easy to
understand to the user. We could even document that (and why) the 'copy'
value in merge.renames is not allowed for the time being.

Ciao,
Dscho


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart  wrote:
> On 4/26/2018 6:52 PM, Elijah Newren wrote:
>> On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart 
>> wrote:
>>
>>> diff --git a/merge-recursive.h b/merge-recursive.h
>>> index 80d69d1401..0c5f7eff98 100644
>>> --- a/merge-recursive.h
>>> +++ b/merge-recursive.h
>>> @@ -17,7 +17,8 @@ struct merge_options {
>>>  unsigned renormalize : 1;
>>>  long xdl_opts;
>>>  int verbosity;
>>> -   int detect_rename;
>>> +   int diff_detect_rename;
>>> +   int merge_detect_rename;
>>>  int diff_rename_limit;
>>>  int merge_rename_limit;
>>>  int rename_score;
>>> @@ -28,6 +29,11 @@ struct merge_options {
>>>  struct hashmap current_file_dir_set;
>>>  struct string_list df_conflict_file_set;
>>>   };
>>> +inline int merge_detect_rename(struct merge_options *o)
>>> +{
>>> +   return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
>>> +   o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
>>> +}
>>
>>
>> Why did you split o->detect_rename into two fields?  You then
>> recombine them in merge_detect_rename(), and after initial setup only
>> ever access them through that function.  Having two fields worries me
>> that people will accidentally introduce bugs by using one of them
>> instead of the merge_detect_rename() function.  Is there a reason you
>> decided against having the initial setup just set a single value and
>> then use it directly?
>>
> The setup of this value is split into 3 places that may or may not all get
> called.  The initial values, the values that come from the config settings
> and then any values passed on the command line.
>
> Because the merge value can now inherit from the diff value, you only know
> the final value after you have received all possible inputs.  That makes it
> necessary to be a calculated value.
>
> If you look at diff_rename_limit/merge_rename_limit, detect_rename follow
> the same pattern for the same reasons.  It turns out detect_rename was a
> little more complex because it is used in 3 different locations (vs just
> one) which is why I wrapped the inheritance logic into the helper function
> merge_detect_rename().

Ah, you're following the precedent set by
diff_rename_limit/merge_rename_limit; that makes sense.  Thanks for
the explanation.  I believe another possibility here is that for both
the {merge,diff}_rename_limit pair of variables and the
{diff,merge}_renames pair of variables, since the code parses all
inputs before ever using the result, we could calculate the result
once and store it rather than storing the constituent pieces of the
calculation.  That would also prevent people from trying to use one of
the pieces of the calculation instead of treating it as a coherent
whole.  However, while I would have preferred that the rename_limit
pair of variables also went away in favor of just one field which is
updated as it parses each input option, what you have is fine for this
series.


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 7:23 PM, Junio C Hamano  wrote:
> Ben Peart  writes:
>
>> Color me puzzled. :)  The consensus was that the default value for
>> merge.renames come from diff.renames.  diff.renames supports copy
>> detection which means that merge.renames will inherit that value.  My
>> assumption was that is what was intended so when I reimplemented it, I
>> fully implemented it that way.
>>
>> Are you now requesting to only use diff.renames as the default if the
>> value is true or false but not if it is copy?  What should happen if
>> diff.renames is actually set to copy?  Should merge silently change
>> that to true, display a warning, error out, or something else?  Do you
>> have some other behavior for how to handle copy being inherited from
>> diff.renames you'd like to see?
>>
>> Can you write the documentation that clearly explains the exact
>> behavior you want?  That would kill two birds with one stone... :)
>
> I think demoting from copy to rename-only is a good idea, at least
> for now, because I do not believe we have figured out what we want
> to happen when we detect copied files are involved in a merge.
>
> But I am not sure if we even want to fail merge.renames=copy as an
> invalid configuration.  So my gut feeling of the best solution to
> the above is to do something like:
>
>  - whether the configuration comes from diff.renames or
>merge.renames, turn *.renames=copy to true inside the merge
>recursive machinery.
>
>  - document the fact in "git merge-recursive" documentation (or "git
>merge" documentation) to say "_currently_ asking for rename
>detection to find copies and renames will do the same
>thing---copies are ignored", impliying "this might change in the
>future", in the BUGS section.

Yes, I agree.  One more thing:

  - It may be best to avoid advertising "copies" as a vaild option for
merge.renames since it doesn't have any current practical use
anywhere.  (Remove the sentence 'If set to "copies" or "copy", Git
will detect copies, as well.' from the documentation)

My rationale for translating "copy" to "true" is a little different
than Junio's, though:

1) The reason we have configuration options around renames and copies
is primarily because they are expensive to compute.  So we let some
users specify that they don't want them, other users are willing to
pay for rename detection, and others are willing to pay for both
rename and copy detection.
2) If rename/copy detection were cheap, every part of git would just
compute whatever level of detection was relevant and use it.
3) The resolve and octopus merge strategies ignores diff.renames and
merge.renames, because they don't have logic to use any rename
information.  diff and log can use both renames and copies.  And the
recursive merge machinery is code which can use renames but not
copies.
4) Therefore, translating from "copy" to "true" inside the merge
recursive machinery is fine and not an error because we are using as
much detection information as is relevant to the algorithm and which
the user is willing to pay for.

To throw one more wrinkle in here, merge.renames could actually be set
to "copy" and make sense, because we compute diffs multiple times.
Twice within the recursive merge machinery (for which we'd want to
translate "copy" to "true"), and once for the diffstat at the end
(which comes from builtin/merge.c, and for which it could make sense
to detect copies).

(Kind of curious whether Junio agrees with my rationale or thinks I'm
out in left field with it...)


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Junio C Hamano
Ben Peart  writes:

> Color me puzzled. :)  The consensus was that the default value for
> merge.renames come from diff.renames.  diff.renames supports copy
> detection which means that merge.renames will inherit that value.  My
> assumption was that is what was intended so when I reimplemented it, I
> fully implemented it that way.
>
> Are you now requesting to only use diff.renames as the default if the
> value is true or false but not if it is copy?  What should happen if
> diff.renames is actually set to copy?  Should merge silently change
> that to true, display a warning, error out, or something else?  Do you
> have some other behavior for how to handle copy being inherited from
> diff.renames you'd like to see?
>
> Can you write the documentation that clearly explains the exact
> behavior you want?  That would kill two birds with one stone... :)

I think demoting from copy to rename-only is a good idea, at least
for now, because I do not believe we have figured out what we want
to happen when we detect copied files are involved in a merge.

But I am not sure if we even want to fail merge.renames=copy as an
invalid configuration.  So my gut feeling of the best solution to
the above is to do something like:

 - whether the configuration comes from diff.renames or
   merge.renames, turn *.renames=copy to true inside the merge
   recursive machinery.

 - document the fact in "git merge-recursive" documentation (or "git
   merge" documentation) to say "_currently_ asking for rename
   detection to find copies and renames will do the same
   thing---copies are ignored", impliying "this might change in the
   future", in the BUGS section.



Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Ben Peart



On 4/26/2018 6:52 PM, Elijah Newren wrote:

On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:


+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled.  If set to "copies" or "copy", Git will
+   detect copies, as well.  Defaults to the value of diff.renames.
+


We shouldn't allow users to force copy detection on for merges  The
diff side of the code will detect them correctly but the code in
merge-recursive will mishandle the copy pairs.  I think fixing it is
somewhere between big can of worms and
it's-a-configuration-that-doesn't-even-make-sense, but it's been a
while since I thought about it.


Color me puzzled. :)  The consensus was that the default value for 
merge.renames come from diff.renames.  diff.renames supports copy 
detection which means that merge.renames will inherit that value.  My 
assumption was that is what was intended so when I reimplemented it, I 
fully implemented it that way.


Are you now requesting to only use diff.renames as the default if the 
value is true or false but not if it is copy?  What should happen if 
diff.renames is actually set to copy?  Should merge silently change that 
to true, display a warning, error out, or something else?  Do you have 
some other behavior for how to handle copy being inherited from 
diff.renames you'd like to see?


Can you write the documentation that clearly explains the exact behavior 
you want?  That would kill two birds with one stone... :)





diff --git a/merge-recursive.h b/merge-recursive.h
index 80d69d1401..0c5f7eff98 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -17,7 +17,8 @@ struct merge_options {
 unsigned renormalize : 1;
 long xdl_opts;
 int verbosity;
-   int detect_rename;
+   int diff_detect_rename;
+   int merge_detect_rename;
 int diff_rename_limit;
 int merge_rename_limit;
 int rename_score;
@@ -28,6 +29,11 @@ struct merge_options {
 struct hashmap current_file_dir_set;
 struct string_list df_conflict_file_set;
  };
+inline int merge_detect_rename(struct merge_options *o)
+{
+   return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
+   o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
+}


Why did you split o->detect_rename into two fields?  You then
recombine them in merge_detect_rename(), and after initial setup only
ever access them through that function.  Having two fields worries me
that people will accidentally introduce bugs by using one of them
instead of the merge_detect_rename() function.  Is there a reason you
decided against having the initial setup just set a single value and
then use it directly?



The setup of this value is split into 3 places that may or may not all 
get called.  The initial values, the values that come from the config 
settings and then any values passed on the command line.


Because the merge value can now inherit from the diff value, you only 
know the final value after you have received all possible inputs.  That 
makes it necessary to be a calculated value.


If you look at diff_rename_limit/merge_rename_limit, detect_rename 
follow the same pattern for the same reasons.  It turns out 
detect_rename was a little more complex because it is used in 3 
different locations (vs just one) which is why I wrapped the inheritance 
logic into the helper function merge_detect_rename().


Re: [PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Elijah Newren
On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart  wrote:

> +merge.renames::
> +   Whether and how Git detects renames.  If set to "false",
> +   rename detection is disabled. If set to "true", basic rename
> +   detection is enabled.  If set to "copies" or "copy", Git will
> +   detect copies, as well.  Defaults to the value of diff.renames.
> +

We shouldn't allow users to force copy detection on for merges  The
diff side of the code will detect them correctly but the code in
merge-recursive will mishandle the copy pairs.  I think fixing it is
somewhere between big can of worms and
it's-a-configuration-that-doesn't-even-make-sense, but it's been a
while since I thought about it.

> diff --git a/merge-recursive.h b/merge-recursive.h
> index 80d69d1401..0c5f7eff98 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -17,7 +17,8 @@ struct merge_options {
> unsigned renormalize : 1;
> long xdl_opts;
> int verbosity;
> -   int detect_rename;
> +   int diff_detect_rename;
> +   int merge_detect_rename;
> int diff_rename_limit;
> int merge_rename_limit;
> int rename_score;
> @@ -28,6 +29,11 @@ struct merge_options {
> struct hashmap current_file_dir_set;
> struct string_list df_conflict_file_set;
>  };
> +inline int merge_detect_rename(struct merge_options *o)
> +{
> +   return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
> +   o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
> +}

Why did you split o->detect_rename into two fields?  You then
recombine them in merge_detect_rename(), and after initial setup only
ever access them through that function.  Having two fields worries me
that people will accidentally introduce bugs by using one of them
instead of the merge_detect_rename() function.  Is there a reason you
decided against having the initial setup just set a single value and
then use it directly?


[PATCH v3 2/3] merge: Add merge.renames config setting

2018-04-26 Thread Ben Peart
Add the ability to control rename detection for merge via a config setting.
This setting behaves the same and defaults to the value of diff.renames but only
applies to merge.

Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 
---
 Documentation/merge-config.txt|  6 ++
 Documentation/merge-strategies.txt|  6 --
 diff.c|  2 +-
 diff.h|  2 ++
 merge-recursive.c | 23 +--
 merge-recursive.h |  8 +++-
 t/t3034-merge-recursive-rename-options.sh | 18 ++
 7 files changed, 55 insertions(+), 10 deletions(-)

diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 48ee3bce77..59848e5634 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -38,6 +38,12 @@ merge.renameLimit::
diff.renameLimit. This setting has no effect if rename detection
is turned off.
 
+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled.  If set to "copies" or "copy", Git will
+   detect copies, as well.  Defaults to the value of diff.renames.
+
 merge.renormalize::
Tell Git that canonical representation of files in the
repository has changed over time (e.g. earlier commits record
diff --git a/Documentation/merge-strategies.txt 
b/Documentation/merge-strategies.txt
index 4a58aad4b8..1e0728aa12 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -84,12 +84,14 @@ no-renormalize;;
`merge.renormalize` configuration variable.
 
 no-renames;;
-   Turn off rename detection.
+   Turn off rename detection. This overrides the `merge.renames`
+   configuration variable.
See also linkgit:git-diff[1] `--no-renames`.
 
 find-renames[=];;
Turn on rename detection, optionally setting the similarity
-   threshold.  This is the default.
+   threshold.  This is the default. This overrides the
+   'merge.renames' configuration variable.
See also linkgit:git-diff[1] `--find-renames`.
 
 rename-threshold=;;
diff --git a/diff.c b/diff.c
index 1289df4b1f..5dfc24aa6d 100644
--- a/diff.c
+++ b/diff.c
@@ -177,7 +177,7 @@ static int parse_submodule_params(struct diff_options 
*options, const char *valu
return 0;
 }
 
-static int git_config_rename(const char *var, const char *value)
+int git_config_rename(const char *var, const char *value)
 {
if (!value)
return DIFF_DETECT_RENAME;
diff --git a/diff.h b/diff.h
index d29560f822..806faee2b3 100644
--- a/diff.h
+++ b/diff.h
@@ -324,6 +324,8 @@ extern int git_diff_ui_config(const char *var, const char 
*value, void *cb);
 extern void diff_setup(struct diff_options *);
 extern int diff_opt_parse(struct diff_options *, const char **, int, const 
char *);
 extern void diff_setup_done(struct diff_options *);
+extern int git_config_rename(const char *var, const char *value);
+
 
 #define DIFF_DETECT_RENAME 1
 #define DIFF_DETECT_COPY   2
diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624d..2637d34d87 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -555,13 +555,13 @@ static struct string_list *get_renames(struct 
merge_options *o,
struct diff_options opts;
 
renames = xcalloc(1, sizeof(struct string_list));
-   if (!o->detect_rename)
+   if (!merge_detect_rename(o))
return renames;
 
diff_setup();
opts.flags.recursive = 1;
opts.flags.rename_empty = 0;
-   opts.detect_rename = DIFF_DETECT_RENAME;
+   opts.detect_rename = merge_detect_rename(o);
opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
o->diff_rename_limit >= 0 ? o->diff_rename_limit :
1000;
@@ -2232,9 +2232,18 @@ int merge_recursive_generic(struct merge_options *o,
 
 static void merge_recursive_config(struct merge_options *o)
 {
+   char *value = NULL;
git_config_get_int("merge.verbosity", >verbosity);
git_config_get_int("diff.renamelimit", >diff_rename_limit);
git_config_get_int("merge.renamelimit", >merge_rename_limit);
+   if (!git_config_get_string("diff.renames", )) {
+   o->diff_detect_rename = git_config_rename("diff.renames", 
value);
+   free(value);
+   }
+   if (!git_config_get_string("merge.renames", )) {
+   o->merge_detect_rename = git_config_rename("merge.renames", 
value);
+   free(value);
+   }
git_config(git_xmerge_config, NULL);
 }
 
@@ -2244,10 +2253,11 @@ void init_merge_options(struct merge_options *o)
memset(o, 0, sizeof(struct merge_options));
o->verbosity = 2;