Re: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-19 Thread Jeff King
On Wed, Jul 18, 2012 at 03:47:21PM -0700, Junio C Hamano wrote:

  I don't care too deeply either way, and it is technically a behavior
  change. So there is a chance of regression for something that nobody has
  actually complained about.
 
 Thanks. I share the same feeling, but the code after the patch looks
 much nicer, which is somewhat sad.

If we're not going to do it, perhaps we should at least include the
tests I wrote (modified for the current behavior), since there was no
coverage in this area previously. Patch is below.

  To be honest, I doubt many people are using
  external diff at all these days; textconv is closer to what most people
  want, and is much easier to use.
 
 It depends on the payload and the output you want.  I wouldn't think
 it is fair to challenge anybody to come up with a solution based on
 the textconv machinery to replicate what the compare-cooking.perl in
 the 'todo' branch (used as an external diff driver for
 whats-cooking.txt) does.

Oh, absolutely. The compare-cooking script is a great example of what
you can do with the flexibility that external diff offers. But based on
my experience and reading the list, the much more common request is to
produce a meaningful diff of two binary word processor documents. :)

Googling around, the other common use seems to be to stuff the output
into a visual diff tool. Though I think that git difftool is a better
approach for that.

-Peff

-- 8 --
Subject: [PATCH] diff: test precedence of external diff drivers

There are three ways to specify an external diff command:
GIT_EXTERNAL_DIFF in the environment, diff.external in the
config, or a diff gitattribute. The current order of
precedence is:

  1. gitattribute

  2. GIT_EXTERNAL_DIFF

  3. diff.external

Usually our rule is that environment variables should take
precedence over on-disk config (i.e., option 2 should come
before option 1). However, this situation is trickier than
some, because option 1 is more specific to the individual
file than option 2 (which affects all files), so it might be
preferable. So the current behavior can be seen as
implementing do the specific thing if we can, but fall back
to this general thing.

This is probably not what we would do if we were writing git
from scratch, but it has been this way for several years,
and is not worth changing. So let's at least document that
this is the way it's supposed to work with a test.

While we're there, let's also make sure that diff.external
(which was not previously tested at all) works by running it
through the same tests as GIT_EXTERNAL_DIFF.

Signed-off-by: Jeff King p...@peff.net
---
 t/t4020-diff-external.sh | 40 
 1 file changed, 40 insertions(+)

diff --git a/t/t4020-diff-external.sh b/t/t4020-diff-external.sh
index 5a5f68c..2e7d73f 100755
--- a/t/t4020-diff-external.sh
+++ b/t/t4020-diff-external.sh
@@ -65,6 +65,33 @@ test_expect_success SYMLINKS 'typechange diff' '
test_cmp expect actual
 '
 
+test_expect_success 'diff.external' '
+   git reset --hard 
+   echo third file 
+   test_config diff.external echo 
+   git diff | {
+   read path oldfile oldhex oldmode newfile newhex newmode 
+   test z$path = zfile 
+   test z$oldmode = z100644 
+   test z$newhex = z$_z40 
+   test z$newmode = z100644 
+   oh=$(git rev-parse --verify HEAD:file) 
+   test z$oh = z$oldhex
+   }
+'
+
+test_expect_success 'diff.external should apply only to diff' '
+   test_config diff.external echo 
+   git log -p -1 HEAD |
+   grep ^diff --git a/file b/file
+'
+
+test_expect_success 'diff.external and --no-ext-diff' '
+   test_config diff.external echo 
+   git diff --no-ext-diff |
+   grep ^diff --git a/file b/file
+'
+
 test_expect_success 'diff attribute' '
git reset --hard 
echo third file 
@@ -132,6 +159,19 @@ test_expect_success 'diff attribute and --no-ext-diff' '
 
 '
 
+test_expect_success 'GIT_EXTERNAL_DIFF trumps diff.external' '
+   .gitattributes 
+   test_config diff.external echo ext-global 
+   GIT_EXTERNAL_DIFF=echo ext-env git diff | grep ext-env
+'
+
+test_expect_success 'attributes trump GIT_EXTERNAL_DIFF and diff.external' '
+   test_config diff.foo.command echo ext-attribute 
+   test_config diff.external echo ext-global 
+   echo file diff=foo .gitattributes 
+   GIT_EXTERNAL_DIFF=echo ext-env git diff | grep ext-attribute
+'
+
 test_expect_success 'no diff with -diff' '
echo .gitattributes file -diff 
git diff | grep Binary
-- 
1.7.10.5.40.g059818d

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-18 Thread Jeff King
On Tue, Jul 17, 2012 at 10:08:59PM -0700, Junio C Hamano wrote:

 The impression I got from Peff's review was that the problem
 description in the proposed commit log message did not describe the
 reality at all, and the added three lines did not do what the
 message implied they do.  So I do not see how it can be acceptable
 by anybody.
 
 It also needs a test to protect this fix from being broken by other
 people in the future.

Yeah, exactly.

 -- 8 --
 Subject: diff: correctly disable external_diff with --no-ext-diff
 
 Upon seeing a type-change filepair, diff --no-ext-diff does not
 show the usual deletion followed by addition split patch and does
 not run the external diff driver either.
 
 This is because the logic to disable external diff was placed at a
 wrong level in the callchain.  run_diff_cmd() decides to show the
 split patch only when external diff driver is not configured or
 specified via GIT_EXTERNAL_DIFF environment, but this is done before
 checking if --no-ext-diff was given.  To make things worse,
 run_diff_cmd() checks --no-ext-diff and disables the output for such
 a filepair completely, as the callchain below it (e.g. builtin_diff)
 does not want to handle typechange filepairs.
 
 Signed-off-by: Junio C Hamano gits...@pobox.com

Your patch looks good to me.

 ---
  * The use of userdiff_find_by_path() in run_diff_cmd() may be iffy;
it is probably OK to override diff.external with a more specific
per-path configuration, but I think an external diff specified by
the GIT_EXTERNAL_DIFF environment may want to trump the
configured per-path one, as an environment is a stronger one-shot
request.

I think this date all the way back to f1af60b (Support 'diff=pgm'
attribute, 2007-04-22). There's a tradeoff here; usually environment
variables trump config, but you end up using a large hammer (here is how
to diff _all_ files externally) to hit a small nail (here is how to diff
_just_ this file).  I suspect it isn't that big a problem in practice
because people tend to use either one mechanism or the other.

The most sensible thing to me is probably $GIT_EXTERNAL_DIFF, followed
by attributes, followed by diff.external. That uses the more specific
diff pulled from the on-disk config, but allows you to do one-shot overrides
with the environment as long as you are careful to restrict your command
(e.g., GIT_EXTERNAL_DIFF=foo-differ git diff -- file.foo).

But this patch is not about changing that semantics, so I left it
as-is.

Sounds sensible.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-18 Thread Jeff King
On Wed, Jul 18, 2012 at 02:23:29AM -0400, Jeff King wrote:

   * The use of userdiff_find_by_path() in run_diff_cmd() may be iffy;
 it is probably OK to override diff.external with a more specific
 per-path configuration, but I think an external diff specified by
 the GIT_EXTERNAL_DIFF environment may want to trump the
 configured per-path one, as an environment is a stronger one-shot
 request.
 
 I think this date all the way back to f1af60b (Support 'diff=pgm'
 attribute, 2007-04-22). There's a tradeoff here; usually environment
 variables trump config, but you end up using a large hammer (here is how
 to diff _all_ files externally) to hit a small nail (here is how to diff
 _just_ this file).  I suspect it isn't that big a problem in practice
 because people tend to use either one mechanism or the other.
 
 The most sensible thing to me is probably $GIT_EXTERNAL_DIFF, followed
 by attributes, followed by diff.external. That uses the more specific
 diff pulled from the on-disk config, but allows you to do one-shot overrides
 with the environment as long as you are careful to restrict your command
 (e.g., GIT_EXTERNAL_DIFF=foo-differ git diff -- file.foo).

Here's a patch implementing that. This is definitely how I would have
done it if I were starting from scratch. My only hesitation now is that
I don't care too deeply either way, and it is technically a behavior
change. So there is a chance of regression for something that nobody has
actually complained about. To be honest, I doubt many people are using
external diff at all these days; textconv is closer to what most people
want, and is much easier to use.

-- 8 --
Subject: [PATCH] diff: fix precedence of external diff options

There are three ways to specify an external diff command:
GIT_EXTERNAL_DIFF in the environment, diff.external in the
config, or a diff gitattribute. The current order of
precedence is:

  1. gitattribute

  2. GIT_EXTERNAL_DIFF

  3. diff.external

But usually our rule is that environment variables should
take precedence over on-disk config (i.e., option 2 should
come before option 1). This situation is trickier than some,
because option 1 is more specific to the individual file
than option 2 (which affects all files), so it might be
preferable. So the current behavior can be seen as
implementing do the specific thing if we can, but fall back
to this general thing.

However, since you can already implement that behavior (by
combining attributes with a diff.external setting), and
because environment variables can be useful for one-shot
overrides (e.g., GIT_EXTERNAL_DIFF=foo git diff -- bar to
override bar's gitattribute setting), let's switch the
precedence of options 1 and 2 above.

While we're adding tests to t4020 for the precedence, let's
also make sure that diff.external works at all by running it
through the same tests as GIT_EXTERNAL_DIFF.

Signed-off-by: Jeff King p...@peff.net
---
 diff.c   | 35 +++
 t/t4020-diff-external.sh | 41 +
 2 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/diff.c b/diff.c
index 62cbe14..ed2de96 100644
--- a/diff.c
+++ b/diff.c
@@ -238,18 +238,20 @@ static char *quote_two(const char *one, const char *two)
return strbuf_detach(res, NULL);
 }
 
-static const char *external_diff(void)
+static const char *external_diff(const char *path)
 {
-   static const char *external_diff_cmd = NULL;
-   static int done_preparing = 0;
+   const char *r;
+   struct userdiff_driver *drv;
 
-   if (done_preparing)
-   return external_diff_cmd;
-   external_diff_cmd = getenv(GIT_EXTERNAL_DIFF);
-   if (!external_diff_cmd)
-   external_diff_cmd = external_diff_cmd_cfg;
-   done_preparing = 1;
-   return external_diff_cmd;
+   r = getenv(GIT_EXTERNAL_DIFF);
+   if (r)
+   return r;
+
+   drv = userdiff_find_by_path(path);
+   if (drv  drv-external)
+   return drv-external;
+
+   return external_diff_cmd_cfg;
 }
 
 static struct diff_tempfile {
@@ -2992,13 +2994,6 @@ static void run_diff_cmd(const char *pgm,
int complete_rewrite = (p-status == DIFF_STATUS_MODIFIED)  p-score;
int must_show_header = 0;
 
-
-   if (DIFF_OPT_TST(o, ALLOW_EXTERNAL)) {
-   struct userdiff_driver *drv = userdiff_find_by_path(attr_path);
-   if (drv  drv-external)
-   pgm = drv-external;
-   }
-
if (msg) {
/*
 * don't use colors when the header is intended for an
@@ -3059,7 +3054,7 @@ static void strip_prefix(int prefix_length, const char 
**namep, const char **oth
 
 static void run_diff(struct diff_filepair *p, struct diff_options *o)
 {
-   const char *pgm = external_diff();
+   const char *pgm = NULL;
struct strbuf msg;
struct diff_filespec *one = p-one;
struct 

Re: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-18 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I don't care too deeply either way, and it is technically a behavior
 change. So there is a chance of regression for something that nobody has
 actually complained about.

Thanks. I share the same feeling, but the code after the patch looks
much nicer, which is somewhat sad.

 To be honest, I doubt many people are using
 external diff at all these days; textconv is closer to what most people
 want, and is much easier to use.

It depends on the payload and the output you want.  I wouldn't think
it is fair to challenge anybody to come up with a solution based on
the textconv machinery to replicate what the compare-cooking.perl in
the 'todo' branch (used as an external diff driver for
whats-cooking.txt) does.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-17 Thread Jakub Vrana
Yes, I was fixing the invalid (!pgm) condition, sorry for a non-precise 
description.

Does it mean that my patch is accepted or is there something else I need to do?

-- 
Jakub Vrana


-Original Message-
From: Jeff King [mailto:p...@peff.net] 
Sent: Monday, July 16, 2012 9:16 PM
To: Jakub Vrana
Cc: git@vger.kernel.org; gits...@pobox.com
Subject: Re: [PATCH] diff: respect --no-ext-diff with typechange

On Mon, Jul 16, 2012 at 05:27:00PM -0700, Jakub Vrana wrote:

 If external diff is specified through diff.external then it is used 
 even if `git diff --no-ext-diff` is used when there is a typechange.

Eek. That has some minor security implications, as it means that it is 
dangerous to run even plumbing inspection command in somebody else's repository.

However...

  diff.c |3 +++
  1 file changed, 3 insertions(+)
 
 diff --git a/diff.c b/diff.c
 index 208096f..898d610 100644
 --- a/diff.c
 +++ b/diff.c
 @@ -3074,6 +3074,9 @@ static void run_diff(struct diff_filepair *p, 
 struct diff_options *o)
   if (o-prefix_length)
   strip_prefix(o-prefix_length, name, other);
  
 + if (!DIFF_OPT_TST(o, ALLOW_EXTERNAL))
 + pgm = NULL;
 +
   if (DIFF_PAIR_UNMERGED(p)) {
   run_diff_cmd(pgm, name, NULL, attr_path,
NULL, NULL, NULL, o, p);

run_diff_cmd already checks the ALLOW_EXTERNAL bit and sets pgm to NULL there. 
So as far as I can tell, we are not actually running the external diff. 
However, there is still a problem. Later in run_diff we do:

if (!pgm 
DIFF_FILE_VALID(one)  DIFF_FILE_VALID(two) 
(S_IFMT  one-mode) != (S_IFMT  two-mode)) {
/*
 * a filepair that changes between file and symlink
 * needs to be split into deletion and creation.
 */
struct diff_filespec *null = alloc_filespec(two-path);
run_diff_cmd(NULL, name, other, attr_path,
 one, null, msg, o, p);
free(null);
strbuf_release(msg);

null = alloc_filespec(one-path);
run_diff_cmd(NULL, name, other, attr_path,
 null, two, msg, o, p);
free(null);
}
else
run_diff_cmd(pgm, name, other, attr_path,
 one, two, msg, o, p);

IOW, we split up a typechange if we are feeding it to the internal diff 
generator, because builtin_diff will not show diffs between different types. 
But the check for !pgm here is not right; we don't know yet whether we will 
be builtin or external, because we have not checked ALLOW_EXTERNAL yet.

So I think your fix is the right thing, but the bug it is fixing is not do not 
run external diff even when --no-ext-diff is specified. It is do not 
accidentally feed typechange diffs to builtin_diff.

You can see the difference in output with this script (and it works fine with 
your patch applied):

git init -q repo  cd repo 
echo content file  git add file  git commit -q -m regular 
rm file  ln -s dest file  git commit -q -a -m typechange 
export GIT_PAGER=cat 
export GIT_EXTERNAL_DIFF='echo doing external diff' 
git show HEAD^ --format='=== %s, ext ===' --ext-diff 
git show HEAD^ --format='=== %s, no-ext ===' --no-ext-diff 
git show HEAD  --format='=== %s, ext ===' --ext-diff 
git show HEAD  --format='=== %s, no-ext ===' --no-ext-diff

-Peff

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-17 Thread Junio C Hamano
Jakub Vrana ja...@vrana.cz writes:

 Yes, I was fixing the invalid (!pgm) condition, sorry for a non-precise 
 description.

 Does it mean that my patch is accepted or is there something else I need to 
 do?

The impression I got from Peff's review was that the problem
description in the proposed commit log message did not describe the
reality at all, and the added three lines did not do what the
message implied they do.  So I do not see how it can be acceptable
by anybody.

It also needs a test to protect this fix from being broken by other
people in the future.

I've followed the codepath myself, and here is what I would have
liked the submitted patch to look like.  Note that run_diff_cmd()
no longer needs to reset pgm to NULL based on ALLOW_EXTERNAL, but
it still needs to look at it to decide if per-path external userdiff
needs to be called.

-- 8 --
Subject: diff: correctly disable external_diff with --no-ext-diff

Upon seeing a type-change filepair, diff --no-ext-diff does not
show the usual deletion followed by addition split patch and does
not run the external diff driver either.

This is because the logic to disable external diff was placed at a
wrong level in the callchain.  run_diff_cmd() decides to show the
split patch only when external diff driver is not configured or
specified via GIT_EXTERNAL_DIFF environment, but this is done before
checking if --no-ext-diff was given.  To make things worse,
run_diff_cmd() checks --no-ext-diff and disables the output for such
a filepair completely, as the callchain below it (e.g. builtin_diff)
does not want to handle typechange filepairs.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 * The use of userdiff_find_by_path() in run_diff_cmd() may be iffy;
   it is probably OK to override diff.external with a more specific
   per-path configuration, but I think an external diff specified by
   the GIT_EXTERNAL_DIFF environment may want to trump the
   configured per-path one, as an environment is a stronger one-shot
   request.

   But this patch is not about changing that semantics, so I left it
   as-is. 

 diff.c   |  8 +---
 t/t4020-diff-external.sh | 19 +++
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 208096f..62cbe14 100644
--- a/diff.c
+++ b/diff.c
@@ -2992,9 +2992,8 @@ static void run_diff_cmd(const char *pgm,
int complete_rewrite = (p-status == DIFF_STATUS_MODIFIED)  p-score;
int must_show_header = 0;
 
-   if (!DIFF_OPT_TST(o, ALLOW_EXTERNAL))
-   pgm = NULL;
-   else {
+
+   if (DIFF_OPT_TST(o, ALLOW_EXTERNAL)) {
struct userdiff_driver *drv = userdiff_find_by_path(attr_path);
if (drv  drv-external)
pgm = drv-external;
@@ -3074,6 +3073,9 @@ static void run_diff(struct diff_filepair *p, struct 
diff_options *o)
if (o-prefix_length)
strip_prefix(o-prefix_length, name, other);
 
+   if (!DIFF_OPT_TST(o, ALLOW_EXTERNAL))
+   pgm = NULL;
+
if (DIFF_PAIR_UNMERGED(p)) {
run_diff_cmd(pgm, name, NULL, attr_path,
 NULL, NULL, NULL, o, p);
diff --git a/t/t4020-diff-external.sh b/t/t4020-diff-external.sh
index 533afc1..5a5f68c 100755
--- a/t/t4020-diff-external.sh
+++ b/t/t4020-diff-external.sh
@@ -48,7 +48,26 @@ test_expect_success 'GIT_EXTERNAL_DIFF environment and 
--no-ext-diff' '
 
 '
 
+test_expect_success SYMLINKS 'typechange diff' '
+   rm -f file 
+   ln -s elif file 
+   GIT_EXTERNAL_DIFF=echo git diff  | {
+   read path oldfile oldhex oldmode newfile newhex newmode 
+   test z$path = zfile 
+   test z$oldmode = z100644 
+   test z$newhex = z$_z40 
+   test z$newmode = z12 
+   oh=$(git rev-parse --verify HEAD:file) 
+   test z$oh = z$oldhex
+   } 
+   GIT_EXTERNAL_DIFF=echo git diff --no-ext-diff actual 
+   git diff expect 
+   test_cmp expect actual
+'
+
 test_expect_success 'diff attribute' '
+   git reset --hard 
+   echo third file 
 
git config diff.parrot.command echo 
 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff: respect --no-ext-diff with typechange

2012-07-16 Thread Jeff King
On Mon, Jul 16, 2012 at 05:27:00PM -0700, Jakub Vrana wrote:

 If external diff is specified through diff.external then it is used even if
 `git diff --no-ext-diff` is used when there is a typechange.

Eek. That has some minor security implications, as it means that it is
dangerous to run even plumbing inspection command in somebody else's
repository.

However...

  diff.c |3 +++
  1 file changed, 3 insertions(+)
 
 diff --git a/diff.c b/diff.c
 index 208096f..898d610 100644
 --- a/diff.c
 +++ b/diff.c
 @@ -3074,6 +3074,9 @@ static void run_diff(struct diff_filepair *p, struct
 diff_options *o)
   if (o-prefix_length)
   strip_prefix(o-prefix_length, name, other);
  
 + if (!DIFF_OPT_TST(o, ALLOW_EXTERNAL))
 + pgm = NULL;
 +
   if (DIFF_PAIR_UNMERGED(p)) {
   run_diff_cmd(pgm, name, NULL, attr_path,
NULL, NULL, NULL, o, p);

run_diff_cmd already checks the ALLOW_EXTERNAL bit and sets pgm to NULL
there. So as far as I can tell, we are not actually running the external
diff. However, there is still a problem. Later in run_diff we do:

if (!pgm 
DIFF_FILE_VALID(one)  DIFF_FILE_VALID(two) 
(S_IFMT  one-mode) != (S_IFMT  two-mode)) {
/*
 * a filepair that changes between file and symlink
 * needs to be split into deletion and creation.
 */
struct diff_filespec *null = alloc_filespec(two-path);
run_diff_cmd(NULL, name, other, attr_path,
 one, null, msg, o, p);
free(null);
strbuf_release(msg);

null = alloc_filespec(one-path);
run_diff_cmd(NULL, name, other, attr_path,
 null, two, msg, o, p);
free(null);
}
else
run_diff_cmd(pgm, name, other, attr_path,
 one, two, msg, o, p);

IOW, we split up a typechange if we are feeding it to the internal diff
generator, because builtin_diff will not show diffs between different
types. But the check for !pgm here is not right; we don't know yet
whether we will be builtin or external, because we have not checked
ALLOW_EXTERNAL yet.

So I think your fix is the right thing, but the bug it is fixing is not
do not run external diff even when --no-ext-diff is specified. It is
do not accidentally feed typechange diffs to builtin_diff.

You can see the difference in output with this script (and it works fine
with your patch applied):

git init -q repo  cd repo 
echo content file  git add file  git commit -q -m regular 
rm file  ln -s dest file  git commit -q -a -m typechange 
export GIT_PAGER=cat 
export GIT_EXTERNAL_DIFF='echo doing external diff' 
git show HEAD^ --format='=== %s, ext ===' --ext-diff 
git show HEAD^ --format='=== %s, no-ext ===' --no-ext-diff 
git show HEAD  --format='=== %s, ext ===' --ext-diff 
git show HEAD  --format='=== %s, no-ext ===' --no-ext-diff

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html