[email protected] writes:
> From: Junio C Hamano <[email protected]>
>
> git diff --quiet may take a short-cut to see if a file is changed
> in the working tree:
> Whenever the file size differs from what is recorded in the index,
> the file is assumed to be changed and git diff --quiet returns
> exit with code 1
>
> This shortcut must be suppressed whenever the line endings are converted
> or a filter is in use.
> The attributes say "* text=auto" and a file has
> "Hello\nWorld\n" in the index with a length of 12.
> The file in the working tree has "Hello\r\nWorld\r\n" with a length of 14.
> (Or even "Hello\r\nWorld\n").
> In this case "git add" will not do any changes to the index, and
> "git diff -quiet" should exit 0.
The thing I find the most disturbing is that at this point in the
flow, p->one->size and p->two->size are supposed to be the sizes of
the blob object, not the contents of the file on the working tree.
IOW, p->two->size being 14 in the above example sounds like pointing
at a different bug, if it is 14.
The early return in diff_populate_filespec(), where it does
s->size = xsize_t(st.st_size);
...
if (size_only)
return 0;
way before it runs convert_to_git(), may be the real culprit.
I am wondering if the real fix would be to do this, instead of the
two extra would_convert_to_git() call there in the patch you sent.
The result seems to still pass the new test in your patch.
Thanks for helping.
diff.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/diff.c b/diff.c
index 8c78fce49d..dc51dceb44 100644
--- a/diff.c
+++ b/diff.c
@@ -2792,8 +2792,25 @@ int diff_populate_filespec(struct diff_filespec *s,
unsigned int flags)
s->should_free = 1;
return 0;
}
- if (size_only)
+
+ /*
+ * Even if the caller would be happy with getting
+ * only the size, we cannot return early at this
+ * point if the path requires us to run the content
+ * conversion.
+ */
+ if (!would_convert_to_git(s->path) && size_only)
return 0;
+
+ /*
+ * Note: this check uses xsize_t(st.st_size) that may
+ * not be the true size of the blob after it goes
+ * through convert_to_git(). This may not strictly be
+ * correct, but the whole point of big_file_threashold
+ * and is_binary check is that we want to avoid
+ * opening the file and inspecting the contents, so
+ * this is probably fine.
+ */
if ((flags & CHECK_BINARY) &&
s->size > big_file_threshold && s->is_binary == -1) {
s->is_binary = 1;