Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-04-04 Thread Jeff King
On Thu, Mar 07, 2013 at 01:50:46PM -0500, Jeff King wrote:

 On Thu, Mar 07, 2013 at 10:40:46AM -0800, Junio C Hamano wrote:
 
  Where we differ is if such information loss is a good thing to have.
 
  We could say both sides added, identically is auto-resolved when
  you use the zealous option, and do so regardless of how the merge
  conflicts are presented.  Then it becomes perfectly fine to eject
  A and E out of the conflicted block and merge them to be part of
  pre/post contexts.  The same goes for reducing C|=C to C.  As
  long as we clearly present the users what the option does and what
  its implications are, it is not bad to have such an option, I think.
 
 Exactly. I do think it has real-world uses (see the example script I
 posted yesterday), but it would never replace diff3. I'm going to try it
 out for a bit. As I mentioned yesterday, I see those sorts of
 cherry-pick-with-something-on-top conflicts when I am rebasing onto or
 merging my topics into what you have picked up from the same topic on
 the list.

I wanted to give an update on how this has been going. I've been running
with zdiff3 for almost a month. I keep my merge.conflictstyle set to
diff3, and when I see something that I think might benefit from the
both sides added zealousness, I do a git checkout --conflict=zdiff3
and examine the result.

I have seen it help, and always when rebasing patches that were accepted
upstream. For example, imagine I added a big block of text in one patch
(e.g., an entire test script). Then I added more tests in a follow-on
patch. Or I change some of the lines from expect_failure to
expect_success. You can see this in t1060 of the
jk/check-corrupt-objects-carefully topic (I didn't try, but you could
probably reproduce by just rebasing it on top of the current master).

When I rebase my version of the patches on your master with the new
content, the conflict for the first patch is useless in diff3. I see
that the base had nothing, upstream added a hundred lines, and my patch
added ninety lines. But it's hard to see which lines are missing or
modified because of the size of the conflict. It looks like:

ours
   #!/bin/sh
   test_description=whatever
   ...
  end of some test
   '
   test_done
   ||| base
   ===
   #!/bin/sh
   test_description=whatever
   ...
  end of another test
   '
   test_done
theirs

The interesting part is in the ..., which contains different lines in
each version, but it may be hundreds of lines long. Using zdiff3, I get:

   #!/bin/sh
   test_description=whatever
   ...
ours
   test_expect_success 'some_new_test' '
   ...
   ||| base
   ===
theirs
   '
   test_done

I can see that nothing was tweaked; I just didn't add any content there,
and upstream did. Contrast this with zealous merge conflicts, which
would look like:

  #!/bin/sh
  test_description=whatever
  ...
   ours
  test_expect_success 'some_new_test' '
  ...
  ===
   theirs
  '
  test_done

which similarly condenses, but is missing a piece of information: that
there was nothing in the base. I don't know whether the conflict is
there because my patch removed some content that got changed upstream,
or whether upstream added some content that I did not have in my patch.

So I think it is useful when rebasing on top of what upstream took,
specifically when:

  1. You have a series that updates the same hunk repeatedly (because
 from your perspective, you see only the tip of what upstream took).

  2. Upstream takes your patch but tweaks it (either as a fixup, to deal
 with a merge conflict, or whatever). You get to see the minimal
 tweak, not the fact that you have a giant hunk which differs from
 the upstream only by a few characters or a few lines.

So I do think zdiff3 is useful, and I plan to continue using it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-04-04 Thread Uwe Kleine-König
Hi Jeff,

On Thu, Apr 04, 2013 at 04:33:44PM -0400, Jeff King wrote:
 [...]
 So I do think zdiff3 is useful, and I plan to continue using it.
Thanks for your description. I'm using and liking zdiff3, too. So I'd
really like seeing it in vanilla git.

Thanks
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-04-04 Thread Jeff King
On Thu, Apr 04, 2013 at 10:49:44PM +0200, Uwe Kleine-König wrote:

 On Thu, Apr 04, 2013 at 04:33:44PM -0400, Jeff King wrote:
  [...]
  So I do think zdiff3 is useful, and I plan to continue using it.
 Thanks for your description. I'm using and liking zdiff3, too. So I'd
 really like seeing it in vanilla git.

I don't know if Junio is interested in taking a patch with the concept
or not, but as I recall, your patch needed at least a documentation
update before it could be picked up. Unless Junio wants to say no, I am
not interested at all, I think the next step would be to repost an
updated version.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-04-04 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Thu, Apr 04, 2013 at 10:49:44PM +0200, Uwe Kleine-König wrote:

 On Thu, Apr 04, 2013 at 04:33:44PM -0400, Jeff King wrote:
  [...]
  So I do think zdiff3 is useful, and I plan to continue using it.
 Thanks for your description. I'm using and liking zdiff3, too. So I'd
 really like seeing it in vanilla git.

 I don't know if Junio is interested in taking a patch with the concept
 or not,...

In two messages upthread before you restarted this discussion, I
said:

As long as we clearly present the users what the option does and
what its implications are, it is not bad to have such an option,
I think.

 but as I recall, your patch needed at least a documentation update
 before it could be picked up. Unless Junio wants to say no, I am
 not interested at all, I think the next step would be to repost
 an updated version.

Yup.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-07 Thread Jeff King
On Wed, Mar 06, 2013 at 01:32:28PM -0800, Junio C Hamano wrote:

  We show both sides added, either identically or differently as
  noteworthy events, but the patched code pushes both sides added
  identically case outside the conflicting hunk, as if what was added
  relative to the common ancestor version (in Uwe's case, is it 1-14
  that is common, or just 10-14?) is not worth looking at when
  considering what the right resolution is.  If it is not worth
  looking at what was in the original for the conflicting part, why
  would we be even using diff3 mode in the first place?
 
 I vaguely recall we did this clip to eager as an explicit bugfix
 at 83133740d9c8 (xmerge.c: diff3 -m style clips merge reduction
 level to EAGER or less, 2008-08-29).  The list archive around that
 time may give us more contexts.

Thanks for the pointer. The relevant threads are:

  http://article.gmane.org/gmane.comp.version-control.git/94228

and

  http://thread.gmane.org/gmane.comp.version-control.git/94339

There is not much discussion beyond what ended up in 8313374; both Linus
and Dscho question whether level and output format are orthogonal, but
seem to accept the explanation you give in the commit message.

Having read that commit and the surrounding thread, I think I stand by
my argument that zdiff3 is a useful tool to have, as long as the user
understands what the hunks mean. It should never replace diff3, but I
think it makes sense as a separate format.

I was also curious whether it would the diff3/zealous combination would
trigger any weird corner cases. In particular, I wanted to know how the
example you gave in that commit of:

  postimage#1: 1234ABCDE789
  |/
  |   /
  preimage:123456789
  |   \
  |\
  postimage#2: 1234AXCYE789

would react with diff3 (this is not the original example, but one with
an extra C in the middle of postimage#2, which could in theory be
presented as split hunks). However, it seems that we do not do such hunk
splitting at all, neither for diff3 nor for the merge representation.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-07 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I was also curious whether it would the diff3/zealous combination would
 trigger any weird corner cases. In particular, I wanted to know how the
 example you gave in that commit of:

   postimage#1: 1234ABCDE789
   |/
   |   /
   preimage:123456789
   |   \
   |\
   postimage#2: 1234AXCYE789

 would react with diff3 (this is not the original example, but one with
 an extra C in the middle of postimage#2, which could in theory be
 presented as split hunks). However, it seems that we do not do such hunk
 splitting at all, neither for diff3 nor for the merge representation.

Without thinking about it too deeply,...

I think the RCS merge _could_ show it as 1234AB=XCD=YE789
without losing any information (as it is already discarding what was
in the original in the part that is affected by the conflict,
i.e. 56 was there).

Let's think aloud how diff3 -m _should_ split this. The most
straight-forward representation would be 1234ABCDE|56=AXCYE789,
that is, where 56 was originally there, one side made it to
ABCDE and the other AXCYE.

You could make it 1234AB|5=AXC|=CDE|6=YE789, and that is
technically correct (what there were in the shared original for the
conflicted part is 5 and then 6), but the representation pretends
that it knows more than there actually is information, which may be
somewhat misleading.  All these three are equally plausible split of
the original 56:

1234AB|=AXC|=CDE|56=YE789
1234AB|5=AXC|=CDE|6=YE789
1234AB|56=AXC|=CDE|=YE789

and picking one over others would be a mere heuristic.  All three
are technically correct representations and it is just the matter of
which one is the easiest to understand.  So, this is the kind of
misleading but not incorrect.

In all these cases, the middle part would look like this:

 ours
C
||| base
===
C
 theirs

in order to honor the explicit I want to view all three versions to
examine the situation aka --conflict=diff3 option.  We cannot
reduce it to just C.  That will make it not just misleading but
is actively wrong.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-07 Thread Jeff King
On Thu, Mar 07, 2013 at 09:26:05AM -0800, Junio C Hamano wrote:

 Without thinking about it too deeply,...
 
 I think the RCS merge _could_ show it as 1234AB=XCD=YE789
 without losing any information (as it is already discarding what was
 in the original in the part that is affected by the conflict,
 i.e. 56 was there).

Right, I think that is sane, though we do not do that at this point.

 Let's think aloud how diff3 -m _should_ split this. The most
 straight-forward representation would be 1234ABCDE|56=AXCYE789,
 that is, where 56 was originally there, one side made it to
 ABCDE and the other AXCYE.

Yes, that is what diff3 would do now (because it does not do any hunk
refinement at all), and should continue doing.

 You could make it 1234AB|5=AXC|=CDE|6=YE789, and that is
 technically correct (what there were in the shared original for the
 conflicted part is 5 and then 6), but the representation pretends
 that it knows more than there actually is information, which may be
 somewhat misleading.  All these three are equally plausible split of
 the original 56:
 
   1234AB|=AXC|=CDE|56=YE789
   1234AB|5=AXC|=CDE|6=YE789
   1234AB|56=AXC|=CDE|=YE789
 
 and picking one over others would be a mere heuristic.  All three
 are technically correct representations and it is just the matter of
 which one is the easiest to understand.  So, this is the kind of
 misleading but not incorrect.

Yes, I agree it is a heuristic about which part of a split hunk to place
deleted preimage lines in. Conceptually, I'm OK with that; the point of
zdiff3 is to try to make the conflict easier to read by eliminating
possibly uninteresting parts. It doesn't have to be right all the time;
it just has to be useful most of the time. But it's not clear how true
that would be in real life.

I think this is somewhat a moot point, though. We do not do this
splitting now. If we later learn to do it, there is nothing to say that
zdiff3 would have to adopt it also; it could stop at a lower
zealous-level than the regular merge markers. I think I'd want to
experiment with it and see some real-world examples before making a
decision on that.

 In all these cases, the middle part would look like this:
 
ours
 C
 ||| base
 ===
   C
  theirs
 
 in order to honor the explicit I want to view all three versions to
 examine the situation aka --conflict=diff3 option.  We cannot
 reduce it to just C.  That will make it not just misleading but
 is actively wrong.

I'm not sure I agree. In this output (which does the zealous
simplification, the splitting, and arbitrarily assigns deleted preimage
to the first of the split hunks):

  1234AB|56=XCD|YE789

I do not see the promotion of C to already resolved, you cannot tell if
it was really in the preimage or not as any more or less misleading or
wrong than that of A or E.  It is no more misleading than what the
merge-marker case would do, which would be:

  1234AB=XCD=YE789

The wrong thing to me is the arbitrary choice about how to distribute
the preimage lines. In this example, it is not a big deal for the
heuristic to be wrong; you can see both of the hunks. But if C is long,
and you do not even see D=Y while resolving B=X, seeing the preimage
there may become nonsensical.

But again, we don't do this splitting now. So I don't think it's
something that should make or break a decision to have zdiff3. Without
the splitting, I can see it being quite useful. I'm going to carry the
patch in my tree for a while and try using it in practice for a while.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-07 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 You could make it 1234AB|5=AXC|=CDE|6=YE789, and that is
 technically correct (what there were in the shared original for the
 conflicted part is 5 and then 6), but the representation pretends
 that it knows more than there actually is information, which may be
 somewhat misleading.  All these three are equally plausible split of
 the original 56:

   1234AB|=AXC|=CDE|56=YE789
   1234AB|5=AXC|=CDE|6=YE789
   1234AB|56=AXC|=CDE|=YE789

 and picking one over others would be a mere heuristic.  All three
 are technically correct representations and it is just the matter of
 which one is the easiest to understand.  So, this is the kind of
 misleading but not incorrect.

I forgot to say that youu could even do something silly like:

1234AB|=AXC|56=CDE|=YE789

;-)

 In all these cases, the middle part would look like this:

ours
   C
   ||| base
   ===
   C
theirs

 in order to honor the explicit I want to view all three versions to
 examine the situation aka --conflict=diff3 option.  We cannot
 reduce it to just C.  That will make it not just misleading but
 is actively wrong.

I also forgot to say that the issue is the same to reduce

1234AB|=AXC|=CDE|56=YE789

to

1234A|=AB|=XC|=CD|56=YE|=E789

which is unconditionally correct and then for all x reduce x|=x to
x, yielding

1234AB|=XCD|56=YE789

which your zealous-diff3 would do.  So squashing that C|=C in the
middle would be consistent if you take the zealous-diff3 route.

But again, that is discarding the information of the original, which
the user explicitly asked from diff3 -m, i.e. show all three to
examine the situation. If the user wants to operate _without_ the
original, the user would have asked for RCS merge style output, so
I am still not sure if that is a sensible mode of operation for diff3
to begin with.




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-07 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I'm not sure I agree. In this output (which does the zealous
 simplification, the splitting, and arbitrarily assigns deleted preimage
 to the first of the split hunks):

   1234AB|56=XCD|YE789

 I do not see the promotion of C to already resolved, you cannot tell if
 it was really in the preimage or not as any more or less misleading or
 wrong than that of A or E.  It is no more misleading than what the
 merge-marker case would do, which would be:

   1234AB=XCD=YE789

That is exactly my point and I think we are in complete agreement.
While the intended difference between RCS merge and diff3 -m is for
the latter not to lose information on the original, zealous-diff3
chooses to lose information in both sides added, identically case.

Where we differ is if such information loss is a good thing to have.

We could say both sides added, identically is auto-resolved when
you use the zealous option, and do so regardless of how the merge
conflicts are presented.  Then it becomes perfectly fine to eject
A and E out of the conflicted block and merge them to be part of
pre/post contexts.  The same goes for reducing C|=C to C.  As
long as we clearly present the users what the option does and what
its implications are, it is not bad to have such an option, I think.

 The wrong thing to me is the arbitrary choice about how to distribute
 the preimage lines.

Yeah, but that is not diff3 -m vs zealous-diff3 issue, is it?
If you value the original and want to show it somewhere, you cannot
avoid making the choice whether you are zealous or not if you split
such a hunk.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-07 Thread Jeff King
On Thu, Mar 07, 2013 at 10:40:46AM -0800, Junio C Hamano wrote:

 Where we differ is if such information loss is a good thing to have.

 We could say both sides added, identically is auto-resolved when
 you use the zealous option, and do so regardless of how the merge
 conflicts are presented.  Then it becomes perfectly fine to eject
 A and E out of the conflicted block and merge them to be part of
 pre/post contexts.  The same goes for reducing C|=C to C.  As
 long as we clearly present the users what the option does and what
 its implications are, it is not bad to have such an option, I think.

Exactly. I do think it has real-world uses (see the example script I
posted yesterday), but it would never replace diff3. I'm going to try it
out for a bit. As I mentioned yesterday, I see those sorts of
cherry-pick-with-something-on-top conflicts when I am rebasing onto or
merging my topics into what you have picked up from the same topic on
the list.

I think the code in Uwe's patch looked fine, but it definitely needs a
documentation change to explain the new mode and its caveats. I'd also
be happy with a different name, if you think it implies that it is too
related to zdiff3, but I cannot think of anything better at the moment.

  The wrong thing to me is the arbitrary choice about how to distribute
  the preimage lines.
 
 Yeah, but that is not diff3 -m vs zealous-diff3 issue, is it?
 If you value the original and want to show it somewhere, you cannot
 avoid making the choice whether you are zealous or not if you split
 such a hunk.

Right, but I meant that we would never split a hunk like that with
diff3, because we would not do any hunk refinement at all.  Splitting a
hunk with merge is OK, because the where does the preimage go
problem does not exist there. zdiff3 is the only problematic case,
because it would be the only one that (potentially) splits and cares
about how the preimage maps to each hunk. But we can deal with that if
and when we ever do such splitting.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Uwe Kleine-König
Hello,

Here comes another recipe for a different suggestion:

git init
echo 1  file
git add file
git commit -m 'base'
git branch branch
seq 1 30 | grep -v 15  file
git commit -m 'add 2-30 without 15' file
git checkout branch
seq 1 30 | grep -v 16  file
git commit -m 'add 2-30 without 16' file
git merge master
git diff

This yields:

diff --cc file
index a07e697,5080129..000
--- a/file
+++ b/file
@@@ -12,7 -12,7 +12,11 @@@
  12
  13
  14
++ HEAD
 +15
++===
+ 16
++ master
  17
  18
  19

as expected; nice and sweet. After

git checkout --conflict=diff3 file

however the difference isn't that easy to spot any more. I expected

diff --cc file
index a07e697,5080129..000
--- a/file
+++ b/file
@@@ -12,7 -12,7 +12,12 @@@
  12
  13
  14
++ ours
 +15
++||| base
++===
+ 16
++ theirs
  17
  18
  19

But instead I get

diff --cc file
index a07e697,5080129..000
--- a/file
+++ b/file
@@@ -1,29 -1,29 +1,61 @@@
  1
++ ours
 +2
 +3
 +4
 +5
 +6
 +7
 +8
 +9
 +10
 +11
 +12
 +13
 +14
 +15
 +17
 +18
 +19
 +20
 +21
 +22
 +23
 +24
 +25
 +26
 +27
 +28
 +29
 +30
++||| base
++===
+ 2
+ 3
+ 4
+ 5
+ 6
+ 7
+ 8
+ 9
+ 10
+ 11
+ 12
+ 13
+ 14
+ 16
+ 17
+ 18
+ 19
+ 20
+ 21
+ 22
+ 23
+ 24
+ 25
+ 26
+ 27
+ 28
+ 29
+ 30
++ theirs

Of course this is technically correct, just not maximally helpful.

Is this a missing optimisation for the diff3 case or did I miss a detail
that makes my expectation wrong?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Antoine Pelisse
 git checkout --conflict=diff3 file

That's somehow unrelated, but shouldn't we have a conflict option to
git-merge as we have for git-checkout ?

With something like this (pasted into gmail):

diff --git a/builtin/merge.c b/builtin/merge.c
index 7c8922c..edad742 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -65,6 +65,7 @@ static int abort_current_merge;
 static int show_progress = -1;
 static int default_to_upstream;
 static const char *sign_commit;
+static char *conflict_style;

 static struct strategy all_strategy[] = {
  { recursive,  DEFAULT_TWOHEAD | NO_TRIVIAL },
@@ -213,6 +214,7 @@ static struct option builtin_merge_options[] = {
  { OPTION_STRING, 'S', gpg-sign, sign_commit, N_(key id),
   N_(GPG sign commit), PARSE_OPT_OPTARG, NULL, (intptr_t)  },
  OPT_BOOLEAN(0, overwrite-ignore, overwrite_ignore, N_(update
ignored files (default))),
+ OPT_STRING(0, conflict, conflict_style, N_(style), N_(conflict
style (merge or diff3))),
  OPT_END()
 };

@@ -1102,6 +1104,9 @@ int cmd_merge(int argc, const char **argv, const
char *prefix)
  if (verbosity  0  show_progress == -1)
  show_progress = 0;

+ if (conflict_style)
+ git_xmerge_config(merge.conflictstyle, conflict_style, NULL);
+
  if (abort_current_merge) {
  int nargc = 2;
  const char *nargv[] = {reset, --merge, NULL};
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Antoine Pelisse
 however the difference isn't that easy to spot any more. I expected

 diff --cc file
 index a07e697,5080129..000
 --- a/file
 +++ b/file
 @@@ -12,7 -12,7 +12,12 @@@
   12
   13
   14
 ++ ours
  +15
 ++||| base
 ++===
 + 16
 ++ theirs
   17
   18
   19

This is not correct, it would mean that 12, 13, 14, 17, 18, 19 are in
base, while they are not.

Cheers,
Antoine
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Jeff King
On Wed, Mar 06, 2013 at 08:26:57PM +0100, Antoine Pelisse wrote:

  however the difference isn't that easy to spot any more. I expected
 
  diff --cc file
  index a07e697,5080129..000
  --- a/file
  +++ b/file
  @@@ -12,7 -12,7 +12,12 @@@
12
13
14
  ++ ours
   +15
  ++||| base
  ++===
  + 16
  ++ theirs
17
18
19
 
 This is not correct, it would mean that 12, 13, 14, 17, 18, 19 are in
 base, while they are not.

Yeah, I agree it is a bit of a lie, as you are not seeing the full
picture of what was in the base. That is why we intentionally dial down
the conflict simplification level when using diff3. If you apply this
patch to git:

diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 9e13b25..f381e0c 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -420,15 +420,6 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
int style = xmp-style;
int favor = xmp-favor;
 
-   if (style == XDL_MERGE_DIFF3) {
-   /*
-* diff3 -m output does not make sense for anything
-* more aggressive than XDL_MERGE_EAGER.
-*/
-   if (XDL_MERGE_EAGER  level)
-   level = XDL_MERGE_EAGER;
-   }
-
c = changes = NULL;
 
while (xscr1  xscr2) {

then it will produce the output that Uwe expects. While it can be
misleading, I also think it can make some conflicts (like this one) much
easier to understand. I don't see any reason we can't have a zealous
diff3 mode to let people view this output, as long as it is not the
default.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 then it will produce the output that Uwe expects. While it can be
 misleading,...

Misleading is one thing but in this case isn't it outright wrong?

If you remove  ours ||| portion from the combined diff output,
I would expect that the hunk will apply to the base, but that is no
longer true, no?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Jeff King
On Wed, Mar 06, 2013 at 12:40:48PM -0800, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  then it will produce the output that Uwe expects. While it can be
  misleading,...
 
 Misleading is one thing but in this case isn't it outright wrong?
 
 If you remove  ours ||| portion from the combined diff output,
 I would expect that the hunk will apply to the base, but that is no
 longer true, no?

It shifts the concept of what is the base and what is the conflict.
In Uwe's example, no, it would not apply to the single-line file that is
the true 3-way base. But it would apply to the content that is outside
of the hunk marker; we have changed the concept of what is in the base
and what is in the conflict by shrinking the conflict to its smallest
size. The same is true of the conflict markers produced in the non-diff3
case. It is a property of XDL_MERGE_ZEALOUS, not of the conflict style.

If your argument is diff3 means something different than regular
conflict markers; it should have the property of being
machine-convertible into a patch, but regular markers do not, I'm not
sure I agree. It may be used that way, but I think it is mostly used in
git to give the reader more context when making a resolution. And
anyway, I think the proposed change would not be to change diff3, but to
introduce a new diff3-like format that also shrinks the hunk size, so it
would not hurt existing users of diff3.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 But it would apply to the content that is outside
 of the hunk marker; we have changed the concept of what is in the base
 and what is in the conflict by shrinking the conflict to its smallest
 size.

Hmm, unless you mean by base something entirely different from
what was in the common ancestor version, I do not think I can
agree.  The point of diff3 mode is to show how it looked line in the
common ancestor and what the conflicting sides want to change that
common version into; letting the user view three versions to help
him decide what to do by only looking at the part inside conflict
markers.

We show both sides added, either identically or differently as
noteworthy events, but the patched code pushes both sides added
identically case outside the conflicting hunk, as if what was added
relative to the common ancestor version (in Uwe's case, is it 1-14
that is common, or just 10-14?) is not worth looking at when
considering what the right resolution is.  If it is not worth
looking at what was in the original for the conflicting part, why
would we be even using diff3 mode in the first place?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Jeff King
On Wed, Mar 06, 2013 at 01:09:41PM -0800, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  But it would apply to the content that is outside
  of the hunk marker; we have changed the concept of what is in the base
  and what is in the conflict by shrinking the conflict to its smallest
  size.
 
 Hmm, unless you mean by base something entirely different from
 what was in the common ancestor version, I do not think I can
 agree.

I don't know. I didn't use the word base in the first place. I was
trying to figure out what you meant. :)

My point is that the hunk (everything from  to ) is
self-consistent. It's just misleading in that the hunk has been shrunk
not to include identical bits from each side. IMHO, this is not much
different than a nearby change being auto-resolved. The conflict hunks
the user sees do not represent the original files, but rather the
remains after a first pass at resolving.

 The point of diff3 mode is to show how it looked line in the
 common ancestor and what the conflicting sides want to change that
 common version into; letting the user view three versions to help
 him decide what to do by only looking at the part inside conflict
 markers.

Right, I agree.

 We show both sides added, either identically or differently as
 noteworthy events, but the patched code pushes both sides added
 identically case outside the conflicting hunk, as if what was added
 relative to the common ancestor version (in Uwe's case, is it 1-14
 that is common, or just 10-14?) is not worth looking at when
 considering what the right resolution is.  If it is not worth
 looking at what was in the original for the conflicting part, why
 would we be even using diff3 mode in the first place?

I think Uwe's example shows that it _is_ useful. Yes, you no longer have
the information about what happened through 1-14 (whether it was really
there in the ancestor file, or whether it was simply added identically).
But that information might or might not be relevant. In Uwe's example,
it is just noise that detracts from the interesting part of the change
(or does it? I think the answer is in the eye of the reader).  I think
it can be helpful to have both types available, and they can pick which
one they want; it's just another tool.

Another argument is that some people (including me) set
merge.conflictstyle to diff3, because they like seeing the extra context
when resolving (I find it helps a lot with rebasing, when it is
sometimes hard to remember which side is which in the merge). I'd
consider setting it to zdiff3 to get the benefits of XDL_MERGE_ZEALOUS,
and using git checkout --conflict-style=diff3 if I need to get more
information about a specific case.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Uwe Kleine-König
Hello Junio,

On Wed, Mar 06, 2013 at 01:09:41PM -0800, Junio C Hamano wrote:
 Jeff King p...@peff.net writes:
 
  But it would apply to the content that is outside
  of the hunk marker; we have changed the concept of what is in the base
  and what is in the conflict by shrinking the conflict to its smallest
  size.
 
 Hmm, unless you mean by base something entirely different from
 what was in the common ancestor version, I do not think I can
 agree.  The point of diff3 mode is to show how it looked line in the
 common ancestor and what the conflicting sides want to change that
 common version into; letting the user view three versions to help
 him decide what to do by only looking at the part inside conflict
 markers.
 
 We show both sides added, either identically or differently as
 noteworthy events, but the patched code pushes both sides added
 identically case outside the conflicting hunk, as if what was added
I didn't test, but both sides removed identically should be moved out,
too, shouldn't it?

 relative to the common ancestor version (in Uwe's case, is it 1-14
 that is common, or just 10-14?) is not worth looking at when
 considering what the right resolution is.  If it is not worth
 looking at what was in the original for the conflicting part, why
 would we be even using diff3 mode in the first place?
because even zdiff3 contains more information than merge. And compared
to diff3 it's smaller sometimes and so easier to understand.

Other than that I agree fully to the things Jeff said so far.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Jeff King p...@peff.net writes:

 But it would apply to the content that is outside
 of the hunk marker; we have changed the concept of what is in the base
 and what is in the conflict by shrinking the conflict to its smallest
 size.

 Hmm, unless you mean by base something entirely different from
 what was in the common ancestor version, I do not think I can
 agree.  The point of diff3 mode is to show how it looked line in the
 common ancestor and what the conflicting sides want to change that
 common version into; letting the user view three versions to help
 him decide what to do by only looking at the part inside conflict
 markers.

 We show both sides added, either identically or differently as
 noteworthy events, but the patched code pushes both sides added
 identically case outside the conflicting hunk, as if what was added
 relative to the common ancestor version (in Uwe's case, is it 1-14
 that is common, or just 10-14?) is not worth looking at when
 considering what the right resolution is.  If it is not worth
 looking at what was in the original for the conflicting part, why
 would we be even using diff3 mode in the first place?

I vaguely recall we did this clip to eager as an explicit bugfix
at 83133740d9c8 (xmerge.c: diff3 -m style clips merge reduction
level to EAGER or less, 2008-08-29).  The list archive around that
time may give us more contexts.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I think Uwe's example shows that it _is_ useful. Yes, you no longer have
 the information about what happened through 1-14 (whether it was really
 there in the ancestor file, or whether it was simply added identically).
 But that information might or might not be relevant.

I think it is more like I added bread and my wife added bread to
our common shopping list and our two-way RCS merge default is to
collapse that case to one loaf of bread on the shopping list.  My
impression has always been that people who use diff3 mode care
about this case and want to know that the original did not have
bread on the list in order to decide if one or two loaves of bread
should remain in the result.

 In Uwe's example,
 it is just noise that detracts from the interesting part of the change
 (or does it? I think the answer is in the eye of the reader).

In other words, you would use the RCS merge style because most of
the time you would resolve to one loaf of bread and the fact that
it was missing in the original is not needed to decide that.  So, it
feels strange to use diff3 and still want to discard that
information---if it is not relevant, why are you using diff3 mode in
the first place?  That is the question that is still not answered.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature suggestion: optimize common parts for checkout --conflict=diff3

2013-03-06 Thread Jeff King
On Wed, Mar 06, 2013 at 01:50:46PM -0800, Junio C Hamano wrote:

 I think it is more like I added bread and my wife added bread to
 our common shopping list and our two-way RCS merge default is to
 collapse that case to one loaf of bread on the shopping list.  My
 impression has always been that people who use diff3 mode care
 about this case and want to know that the original did not have
 bread on the list in order to decide if one or two loaves of bread
 should remain in the result.

I think that is only the case sometimes. It depends on what is in the
conflict, and what your data is. I think you are conflating two things,
though: zealousness of merge, and having the original content handy when
resolving. To me, diff3 is about the latter. It can also be a hint that
the user cares about the former, but not necessarily.

  In Uwe's example,
  it is just noise that detracts from the interesting part of the change
  (or does it? I think the answer is in the eye of the reader).
 
 In other words, you would use the RCS merge style because most of
 the time you would resolve to one loaf of bread and the fact that
 it was missing in the original is not needed to decide that.  So, it
 feels strange to use diff3 and still want to discard that
 information---if it is not relevant, why are you using diff3 mode in
 the first place?  That is the question that is still not answered.

Because for the lines that _are_ changed, you may want to see what the
original looked like. Here's a more realistic example:

git init repo
cd repo

# Some baseline C code.
cat foo.c \EOF
int foo(int bar)
{
  return bar + 5;
}
EOF
git add foo.c
git commit -m base
git tag base

# Simulate a modification to the function.
sed -i '2a\
  if (bar  3)\
bar *= 2;
' foo.c
git commit -am multiply
git tag multiply

# And another modification.
sed -i 's/bar + 5/bar + 7/' foo.c
git commit -am plus7

# Now on a side branch...
git checkout -b side base

# let's cherry pick the first change. Obviously
# we could just fast-forward in this toy example,
# but let's try to simulate a real history.
#
# We insert a sleep so that the cherry-pick does not
# accidentally end up with the exact same commit-id (again,
# because this is a toy example).
sleep 1
git cherry-pick multiply

# and now let's make a change that conflicts with later
# changes on master
sed -i 's/bar + 5/bar + 8/' foo.c
git commit -am plus8

# and now merge, getting a conflict
git merge master

# show the result with various marker styles
for i in merge diff3 zdiff3; do
  echo
  echo == $i
  git.compile checkout --conflict=$i foo.c
  cat foo.c
done

which produces:

== merge
int foo(int bar)
{
  if (bar  3)
bar *= 2;
 ours
  return bar + 8;
===
  return bar + 7;
 theirs
}

The ZEALOUS level has helpfully cut out the shared cherry-picked bits,
and let us focus on the real change.

== diff3
int foo(int bar)
{
 ours
  if (bar  3)
bar *= 2;
  return bar + 8;
||| base
  return bar + 5;
===
  if (bar  3)
bar *= 2;
  return bar + 7;
 theirs
}

Here we get to see all of the change, but the interesting difference is
overwhelmed by the shared cherry-picked bits. It's only 2 lines here,
but of course it could be much larger in a real example, and the reader
is forced to manually verify that the early parts are byte-for-byte
identical.

== zdiff3
int foo(int bar)
{
  if (bar  3)
bar *= 2;
 ours
  return bar + 8;
||| base
  return bar + 5;
===
  return bar + 7;
 theirs
}

Here we see the hunk cut-down again, removing the cherry-picked parts.
But the presence of the base is still interesting, because we see
something that was not in the merge marker: that we were originally
at 5, and moved to 7 on one side and 8 on the other.

I see conflicts like this when I rebase my topics forward; you may pick
up part of my series, or even make a tweak to a patch in the middle. I
prefer diff3 markers because they carry more information (and use them
automatically via merge.conflictstyle). But in some cases, the lack of
zealous reduction means that I end having to figure out whether and if
anything changed in the seemingly identical bits.  Sometimes it is
nothing, and sometimes you tweaked whitespace or fixed a typo, and it
takes a lot of manual looking to figure it out. I hadn't realized it was
related to the