Re: bug with git-diff --quiet

2014-01-24 Thread Duy Nguyen
On Thu, Jan 23, 2014 at 11:45:25AM +0900, IWAMOTO Toshihiro wrote:
 I found git-diff --quiet returns a zero exit status even if there's
 a change.  The following sequence reproduces the bug:
 
   $ mkdir foo
   $ cd foo
   $ git init
   $ echo a  a.txt
   $ echo b b.txt
   $ git add ?.txt
   $ git commit
   $ echo b  b.txt
   $ touch a.txt
   $ git diff --quiet; echo $?
   
 Interestingly, if you issue git-diff --quiet again, it returns the
 expected exit status 1.

Because stat info in index is updated and diff_change() won't be
called again on a.txt.

 The problem is in the optimization code in run_diff_files().  The
 function finds a.txt has different stat(2) data from .git/index and
 calls diff_change(), which sets DIFF_OPT_HAS_CHANGES.  As the flag
 makes diff_can_quit_early() return 1, run_diff_files()'s loop finishes
 without calling diff_change() for b.txt.
 
 Then, diffcore_std() examines diff_queued_diff and clears
 DIFF_OPT_HAS_CHANGES, because a.txt is unchanged.
 This is how a change in b.txt is ignored by git-diff --quiet.

Thanks for the analysis. Perhaps we could make diff_change test
whether a.txt is unchanged so it does not set HAS_CHANGES prematurely?
Maybe something like below.

By the time diffcore_skip_stat_unmatch() is called, everything is
cached, so there's not much of performance regression. We still do
memcmp() twice (in diff_filespec_is_identical), but I think that has
less impact than removing diff_can_quit_early().

-- 8 --
diff --git a/diff.c b/diff.c
index 6b4cd0e..5226fc0 100644
--- a/diff.c
+++ b/diff.c
@@ -4697,6 +4697,33 @@ static int diff_filespec_is_identical(struct 
diff_filespec *one,
return !memcmp(one-data, two-data, one-size);
 }
 
+static int diff_filespec_check_stat_unmatch(struct diff_filepair *p)
+{
+   /*
+* 1. Entries that come from stat info dirtiness
+*always have both sides (iow, not create/delete),
+*one side of the object name is unknown, with
+*the same mode and size.  Keep the ones that
+*do not match these criteria.  They have real
+*differences.
+*
+* 2. At this point, the file is known to be modified,
+*with the same mode and size, and the object
+*name of one side is unknown.  Need to inspect
+*the identical contents.
+*/
+   if (!DIFF_FILE_VALID(p-one) || /* (1) */
+   !DIFF_FILE_VALID(p-two) ||
+   (p-one-sha1_valid  p-two-sha1_valid) ||
+   (p-one-mode != p-two-mode) ||
+   diff_populate_filespec(p-one, 1) ||
+   diff_populate_filespec(p-two, 1) ||
+   (p-one-size != p-two-size) ||
+   !diff_filespec_is_identical(p-one, p-two)) /* (2) */
+   return 1;
+   return 0;
+}
+
 static void diffcore_skip_stat_unmatch(struct diff_options *diffopt)
 {
int i;
@@ -4707,27 +4734,7 @@ static void diffcore_skip_stat_unmatch(struct 
diff_options *diffopt)
for (i = 0; i  q-nr; i++) {
struct diff_filepair *p = q-queue[i];
 
-   /*
-* 1. Entries that come from stat info dirtiness
-*always have both sides (iow, not create/delete),
-*one side of the object name is unknown, with
-*the same mode and size.  Keep the ones that
-*do not match these criteria.  They have real
-*differences.
-*
-* 2. At this point, the file is known to be modified,
-*with the same mode and size, and the object
-*name of one side is unknown.  Need to inspect
-*the identical contents.
-*/
-   if (!DIFF_FILE_VALID(p-one) || /* (1) */
-   !DIFF_FILE_VALID(p-two) ||
-   (p-one-sha1_valid  p-two-sha1_valid) ||
-   (p-one-mode != p-two-mode) ||
-   diff_populate_filespec(p-one, 1) ||
-   diff_populate_filespec(p-two, 1) ||
-   (p-one-size != p-two-size) ||
-   !diff_filespec_is_identical(p-one, p-two)) /* (2) */
+   if (diff_filespec_check_stat_unmatch(p))
diff_q(outq, p);
else {
/*
@@ -4890,6 +4897,7 @@ void diff_change(struct diff_options *options,
 unsigned old_dirty_submodule, unsigned new_dirty_submodule)
 {
struct diff_filespec *one, *two;
+   struct diff_filepair *p;
 
if (S_ISGITLINK(old_mode)  S_ISGITLINK(new_mode) 
is_submodule_ignored(concatpath, options))
@@ -4916,10 +4924,17 @@ void diff_change(struct diff_options *options,
fill_filespec(two, new_sha1, new_sha1_valid, new_mode);
one-dirty_submodule = old_dirty_submodule;
two-dirty_submodule = new_dirty_submodule;
+   p = diff_queue(diff_queued_diff, one, two);
 
-   

bug with git-diff --quiet

2014-01-22 Thread IWAMOTO Toshihiro
I found git-diff --quiet returns a zero exit status even if there's
a change.  The following sequence reproduces the bug:

  $ mkdir foo
  $ cd foo
  $ git init
  $ echo a  a.txt
  $ echo b b.txt
  $ git add ?.txt
  $ git commit
  $ echo b  b.txt
  $ touch a.txt
  $ git diff --quiet; echo $?
  
Interestingly, if you issue git-diff --quiet again, it returns the
expected exit status 1.

The problem is in the optimization code in run_diff_files().  The
function finds a.txt has different stat(2) data from .git/index and
calls diff_change(), which sets DIFF_OPT_HAS_CHANGES.  As the flag
makes diff_can_quit_early() return 1, run_diff_files()'s loop finishes
without calling diff_change() for b.txt.

Then, diffcore_std() examines diff_queued_diff and clears
DIFF_OPT_HAS_CHANGES, because a.txt is unchanged.
This is how a change in b.txt is ignored by git-diff --quiet.

Here's a obvious fix for this bug, but I think you can find a better
fix. Thanks in advance.


diff --git a/diff-lib.c b/diff-lib.c
index 346cac6..0b8c58d 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -105,9 +105,6 @@ int run_diff_files(struct rev_info *revs, unsigned int 
option)
int changed;
unsigned dirty_submodule = 0;
 
-   if (diff_can_quit_early(revs-diffopt))
-   break;
-
if (!ce_path_match(ce, revs-prune_data))
continue;
 
--
IWAMOTO Toshihiro
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html