On Fri, Nov 20, 2015 at 2:33 PM, Stephan Müller <[email protected]> wrote:
> Am Fri, 20 Nov 2015 23:12:40 +0100
> schrieb Jim Meyering <[email protected]>:
>
>> On Fri, Nov 20, 2015 at 9:06 PM, Stephan Müller <[email protected]>
>> wrote:
>> > Am Fri, 20 Nov 2015 18:28:37 +0100
>> > schrieb Jim Meyering <[email protected]>:
>> >
>> >> On Tue, Nov 17, 2015 at 12:44 PM, Stephan Müller
>> >> <[email protected]> wrote:
>> >> > recently I had to debug weird problem. Finally I figured it out.
>> >> >
>> >> > Virtual file systems like /sys or /proc usually don't care about
>> >> > file sizes. All files have a size of 0. This leads to
>> >> > difficulties as diff sometimes looks for file sizes.
>> >> >
>> >> > Say you do:
>> >> >
>> >> >> $ cp /proc/cmdline my_cmdline
>> >> >> $ diff /proc/cmdline my_cmdline ; echo $?
>> >> >> 0      // ok, files don't differ
>> >> >> $ diff --brief /proc/cmdline my_cmdline
>> >> >> Files /proc/cmdline and mycmdline differ
>> >> >
>> >> > The --brief option triggers a binary compare, as we aren't
>> >> > interested in the actual differences this makes sense. As a first
>> >> > step, file sizes are compared (0 vs ~150) and the files are
>> >> > reported as different.
>> >>
>> >> thanks for the report.
>> >> What version of diffutils are you using?
>> >> I think this has been fixed for some time.
>> >> I was unable to reproduce with 2.8.1 nor with the latest built from
>> >> git. I.e., I created an empty file and used diff-2.8.1 to compare
>> >> it with the nominally-
>> >> zero-length /proc/cmdline file, and diff did the right thing.
>> >> Also, I ran stat to show st_size of each file is indeed 0:
>> >>
>> >>   $ :
>> >> > /tmp/k; /p/p/diffutils-2.8.1/bin/diff /proc/cmdline /tmp/k; \
>> >> > stat --format %s /proc/cmdline /tmp/k
>> >>   1d0
>> >>   < ro root=LABEL=...
>> >>   0
>> >>   0
>> >>
>> >> In fact, I went ahead and built all available versions and tested
>> >> them like this:
>> >>
>> >>   $ for i in /p/p/*/bin/diff; do p=diffutils-$i; echo $i; $i
>> >> /proc/cmdline /tmp/k > /dev/null && echo bad; done
>> >>   /p/p/diffutils-2.7/bin/diff
>> >>   /p/p/diffutils-2.8.1/bin/diff
>> >>   /p/p/diffutils-2.8/bin/diff
>> >>   /p/p/diffutils-2.9/bin/diff
>> >>   /p/p/diffutils-3.0/bin/diff
>> >>   /p/p/diffutils-3.1/bin/diff
>> >>   /p/p/diffutils-3.2/bin/diff
>> >>   /p/p/diffutils-3.3/bin/diff
>> >
>> > Hi,
>> >
>> > I am using v.3.3 of diffutils
>> >
>> > $ diff -v
>> > diff (GNU diffutils) 3.3
>> >
>> > but I think you misunderstood the problem. Sorry for being
>> > ambiguous. I am not diffing against an empty file. That works well.
>> > The point is procfs doesn't care about size, but 'normal' file
>> > systems do. So for example on my system I have (after
>> > cp /proc/cmdline mycmdline)
>> >
>> > $ stat --format %s /proc/cmdline mycmdline
>> > 0
>> > 140
>> >
>> > The result of diffing /proc/cmdline against mycmdline depends on the
>> > --brief flag.
>> >
>> > STEPS TO REPRODUCE:
>> >
>> > cp /proc/cmdline mycmdline
>> > diff --brief /proc/cmdline mycmdline > /dev/null ; echo ?$
>> > 1
>> > diff /proc/cmdline mycmdline ; echo $?
>> > 0
>> >
>> > EXPECTED RESULT:
>> >
>> > cp /proc/cmdline mycmdline
>> > diff --brief /proc/cmdline mycmdline > /dev/null ; echo ?$
>> > 0
>> > diff /proc/cmdline mycmdline ; echo $?
>> > 0
>>
>> Oh, indeed. Thank you for clarifying. That feels like a bug.
>> Here's a knee-jerk patch that refrains from using the
>> st_size-comparing heuristic when either of the sizes is zero. This
>> may well be wrong. I have only barely tested the diff.c code path.
>
> Thanks, that makes the problem at least (even) less unlikely. But if we
> cant trust file sizes we're doomed. What do you think about a flag
> controlling comparison by size and a notice if files differ by size.
>
> I can craft a patch for this.

Thank you, but I don't want to have to specify some new option to
avoid this misbehavior, so will push the attached patch shortly.
If someone finds a system for which a falsely reported stat.st_size
is nonzero, we can revisit this.
From 1209d61ac0c940bdd2dd11448cc9e1b3e9b59a29 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Sat, 28 Nov 2015 18:02:05 -0800
Subject: [PATCH] diff --brief no longer mistakenly reports diff. with 0-sized
 /proc/ files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/analyze.c (diff_2_files): Don't declare difference when any
st_size is 0.
* src/diff.c (compare_files): Likewise.
* tests/brief-vs-proc-stat-zero: New test.
* tests/Makefile.am: Add it.
* NEWS (Bug fixes): Describe it.
Reported by Stephan Müller in http://debbugs.gnu.org/21942
---
 NEWS                                 |  8 ++++++++
 src/analyze.c                        |  2 ++
 src/diff.c                           |  4 +++-
 tests/Makefile.am                    |  1 +
 tests/brief-vs-stat-zero-kernel-lies | 30 ++++++++++++++++++++++++++++++
 5 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100755 tests/brief-vs-stat-zero-kernel-lies

diff --git a/NEWS b/NEWS
index 088f13b..1724c74 100644
--- a/NEWS
+++ b/NEWS
@@ -23,6 +23,14 @@ GNU diffutils NEWS                                    -*- outline -*-
   diff -B no longer generates incorrect output if the two inputs
   each end with a one-byte incomplete line.

+  diff --brief no longer reports a difference for unusual identical files.
+  For example, when comparing a file like /proc/cmdline (for which the linux
+  kernel reports st_size of 0 even though it is not an empty file) to a
+  copy of that file's contents residing on a "normal" file system:
+    $ f=/proc/cmdline; cp $f k; diff --brief $f k
+    Files /proc/cmdline and k differ
+
+
 ** Performance changes

   diff's default algorithm has been adjusted to output higher-quality
diff --git a/src/analyze.c b/src/analyze.c
index 893d07c..2622810 100644
--- a/src/analyze.c
+++ b/src/analyze.c
@@ -477,6 +477,8 @@ diff_2_files (struct comparison *cmp)
     {
       /* Files with different lengths must be different.  */
       if (cmp->file[0].stat.st_size != cmp->file[1].stat.st_size
+	  && 0 < cmp->file[0].stat.st_size
+	  && 0 < cmp->file[1].stat.st_size
 	  && (cmp->file[0].desc < 0 || S_ISREG (cmp->file[0].stat.st_mode))
 	  && (cmp->file[1].desc < 0 || S_ISREG (cmp->file[1].stat.st_mode)))
 	changes = 1;
diff --git a/src/diff.c b/src/diff.c
index 46ac99d..cff23f0 100644
--- a/src/diff.c
+++ b/src/diff.c
@@ -1377,7 +1377,9 @@ compare_files (struct comparison const *parent,
   else if (files_can_be_treated_as_binary
 	   && S_ISREG (cmp.file[0].stat.st_mode)
 	   && S_ISREG (cmp.file[1].stat.st_mode)
-	   && cmp.file[0].stat.st_size != cmp.file[1].stat.st_size)
+	   && cmp.file[0].stat.st_size != cmp.file[1].stat.st_size
+	   && 0 < cmp.file[0].stat.st_size
+	   && 0 < cmp.file[1].stat.st_size)
     {
       message ("Files %s and %s differ\n",
 	       file_label[0] ? file_label[0] : cmp.file[0].name,
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 805ccc2..b070a76 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -4,6 +4,7 @@ TESTS = \
   basic \
   bignum \
   binary \
+  brief-vs-stat-zero-kernel-lies \
   colliding-file-names \
   excess-slash \
   help-version	\
diff --git a/tests/brief-vs-stat-zero-kernel-lies b/tests/brief-vs-stat-zero-kernel-lies
new file mode 100755
index 0000000..9b272c6
--- /dev/null
+++ b/tests/brief-vs-stat-zero-kernel-lies
@@ -0,0 +1,30 @@
+#!/bin/sh
+# Before diff-3.4, diff --brief could mistakenly declare a difference.
+# For example, when comparing a file like /proc/cmdline (for which the linux
+# kernel reports a st_size of 0 even though it is not an empty file) to a
+# copy of that file's contents residing on a "normal" file system.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+# Skip the test unless we have an appropriate file.
+boot=/proc/cmdline
+test -f $boot || skip_ no $boot file
+sz=$(stat --format %s $boot) || skip_ stat --format %s does not work
+test $sz = 0 || skip_ $boot has nonzero size
+
+# There are two code paths to test: one for non-binary and one for binary files.
+# $boot is non-binary.
+cat $boot > ref || framework_failure_
+diff --brief $boot ref > out 2>&1 || fail=1
+compare /dev/null out || fail=1
+
+# /proc/self/cmdline is a NUL-terminated list of argv values,
+# so construct the expected output here:
+printf 'diff\0--brief\0/proc/self/cmdline\0bin\0' > bin || framework_failure_
+# And run the command that is embedded in that output:
+diff --brief /proc/self/cmdline bin > out 2>&1 || fail=1
+compare /dev/null out || fail=1
+
+Exit $fail
-- 
2.6.2

Reply via email to