On Fri, Nov 20, 2015 at 2:33 PM, Stephan Müller <[email protected]> wrote: > Am Fri, 20 Nov 2015 23:12:40 +0100 > schrieb Jim Meyering <[email protected]>: > >> On Fri, Nov 20, 2015 at 9:06 PM, Stephan Müller <[email protected]> >> wrote: >> > Am Fri, 20 Nov 2015 18:28:37 +0100 >> > schrieb Jim Meyering <[email protected]>: >> > >> >> On Tue, Nov 17, 2015 at 12:44 PM, Stephan Müller >> >> <[email protected]> wrote: >> >> > recently I had to debug weird problem. Finally I figured it out. >> >> > >> >> > Virtual file systems like /sys or /proc usually don't care about >> >> > file sizes. All files have a size of 0. This leads to >> >> > difficulties as diff sometimes looks for file sizes. >> >> > >> >> > Say you do: >> >> > >> >> >> $ cp /proc/cmdline my_cmdline >> >> >> $ diff /proc/cmdline my_cmdline ; echo $? >> >> >> 0 // ok, files don't differ >> >> >> $ diff --brief /proc/cmdline my_cmdline >> >> >> Files /proc/cmdline and mycmdline differ >> >> > >> >> > The --brief option triggers a binary compare, as we aren't >> >> > interested in the actual differences this makes sense. As a first >> >> > step, file sizes are compared (0 vs ~150) and the files are >> >> > reported as different. >> >> >> >> thanks for the report. >> >> What version of diffutils are you using? >> >> I think this has been fixed for some time. >> >> I was unable to reproduce with 2.8.1 nor with the latest built from >> >> git. I.e., I created an empty file and used diff-2.8.1 to compare >> >> it with the nominally- >> >> zero-length /proc/cmdline file, and diff did the right thing. >> >> Also, I ran stat to show st_size of each file is indeed 0: >> >> >> >> $ : >> >> > /tmp/k; /p/p/diffutils-2.8.1/bin/diff /proc/cmdline /tmp/k; \ >> >> > stat --format %s /proc/cmdline /tmp/k >> >> 1d0 >> >> < ro root=LABEL=... >> >> 0 >> >> 0 >> >> >> >> In fact, I went ahead and built all available versions and tested >> >> them like this: >> >> >> >> $ for i in /p/p/*/bin/diff; do p=diffutils-$i; echo $i; $i >> >> /proc/cmdline /tmp/k > /dev/null && echo bad; done >> >> /p/p/diffutils-2.7/bin/diff >> >> /p/p/diffutils-2.8.1/bin/diff >> >> /p/p/diffutils-2.8/bin/diff >> >> /p/p/diffutils-2.9/bin/diff >> >> /p/p/diffutils-3.0/bin/diff >> >> /p/p/diffutils-3.1/bin/diff >> >> /p/p/diffutils-3.2/bin/diff >> >> /p/p/diffutils-3.3/bin/diff >> > >> > Hi, >> > >> > I am using v.3.3 of diffutils >> > >> > $ diff -v >> > diff (GNU diffutils) 3.3 >> > >> > but I think you misunderstood the problem. Sorry for being >> > ambiguous. I am not diffing against an empty file. That works well. >> > The point is procfs doesn't care about size, but 'normal' file >> > systems do. So for example on my system I have (after >> > cp /proc/cmdline mycmdline) >> > >> > $ stat --format %s /proc/cmdline mycmdline >> > 0 >> > 140 >> > >> > The result of diffing /proc/cmdline against mycmdline depends on the >> > --brief flag. >> > >> > STEPS TO REPRODUCE: >> > >> > cp /proc/cmdline mycmdline >> > diff --brief /proc/cmdline mycmdline > /dev/null ; echo ?$ >> > 1 >> > diff /proc/cmdline mycmdline ; echo $? >> > 0 >> > >> > EXPECTED RESULT: >> > >> > cp /proc/cmdline mycmdline >> > diff --brief /proc/cmdline mycmdline > /dev/null ; echo ?$ >> > 0 >> > diff /proc/cmdline mycmdline ; echo $? >> > 0 >> >> Oh, indeed. Thank you for clarifying. That feels like a bug. >> Here's a knee-jerk patch that refrains from using the >> st_size-comparing heuristic when either of the sizes is zero. This >> may well be wrong. I have only barely tested the diff.c code path. > > Thanks, that makes the problem at least (even) less unlikely. But if we > cant trust file sizes we're doomed. What do you think about a flag > controlling comparison by size and a notice if files differ by size. > > I can craft a patch for this.
Thank you, but I don't want to have to specify some new option to avoid this misbehavior, so will push the attached patch shortly. If someone finds a system for which a falsely reported stat.st_size is nonzero, we can revisit this.
From 1209d61ac0c940bdd2dd11448cc9e1b3e9b59a29 Mon Sep 17 00:00:00 2001 From: Jim Meyering <[email protected]> Date: Sat, 28 Nov 2015 18:02:05 -0800 Subject: [PATCH] diff --brief no longer mistakenly reports diff. with 0-sized /proc/ files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * src/analyze.c (diff_2_files): Don't declare difference when any st_size is 0. * src/diff.c (compare_files): Likewise. * tests/brief-vs-proc-stat-zero: New test. * tests/Makefile.am: Add it. * NEWS (Bug fixes): Describe it. Reported by Stephan Müller in http://debbugs.gnu.org/21942 --- NEWS | 8 ++++++++ src/analyze.c | 2 ++ src/diff.c | 4 +++- tests/Makefile.am | 1 + tests/brief-vs-stat-zero-kernel-lies | 30 ++++++++++++++++++++++++++++++ 5 files changed, 44 insertions(+), 1 deletion(-) create mode 100755 tests/brief-vs-stat-zero-kernel-lies diff --git a/NEWS b/NEWS index 088f13b..1724c74 100644 --- a/NEWS +++ b/NEWS @@ -23,6 +23,14 @@ GNU diffutils NEWS -*- outline -*- diff -B no longer generates incorrect output if the two inputs each end with a one-byte incomplete line. + diff --brief no longer reports a difference for unusual identical files. + For example, when comparing a file like /proc/cmdline (for which the linux + kernel reports st_size of 0 even though it is not an empty file) to a + copy of that file's contents residing on a "normal" file system: + $ f=/proc/cmdline; cp $f k; diff --brief $f k + Files /proc/cmdline and k differ + + ** Performance changes diff's default algorithm has been adjusted to output higher-quality diff --git a/src/analyze.c b/src/analyze.c index 893d07c..2622810 100644 --- a/src/analyze.c +++ b/src/analyze.c @@ -477,6 +477,8 @@ diff_2_files (struct comparison *cmp) { /* Files with different lengths must be different. */ if (cmp->file[0].stat.st_size != cmp->file[1].stat.st_size + && 0 < cmp->file[0].stat.st_size + && 0 < cmp->file[1].stat.st_size && (cmp->file[0].desc < 0 || S_ISREG (cmp->file[0].stat.st_mode)) && (cmp->file[1].desc < 0 || S_ISREG (cmp->file[1].stat.st_mode))) changes = 1; diff --git a/src/diff.c b/src/diff.c index 46ac99d..cff23f0 100644 --- a/src/diff.c +++ b/src/diff.c @@ -1377,7 +1377,9 @@ compare_files (struct comparison const *parent, else if (files_can_be_treated_as_binary && S_ISREG (cmp.file[0].stat.st_mode) && S_ISREG (cmp.file[1].stat.st_mode) - && cmp.file[0].stat.st_size != cmp.file[1].stat.st_size) + && cmp.file[0].stat.st_size != cmp.file[1].stat.st_size + && 0 < cmp.file[0].stat.st_size + && 0 < cmp.file[1].stat.st_size) { message ("Files %s and %s differ\n", file_label[0] ? file_label[0] : cmp.file[0].name, diff --git a/tests/Makefile.am b/tests/Makefile.am index 805ccc2..b070a76 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -4,6 +4,7 @@ TESTS = \ basic \ bignum \ binary \ + brief-vs-stat-zero-kernel-lies \ colliding-file-names \ excess-slash \ help-version \ diff --git a/tests/brief-vs-stat-zero-kernel-lies b/tests/brief-vs-stat-zero-kernel-lies new file mode 100755 index 0000000..9b272c6 --- /dev/null +++ b/tests/brief-vs-stat-zero-kernel-lies @@ -0,0 +1,30 @@ +#!/bin/sh +# Before diff-3.4, diff --brief could mistakenly declare a difference. +# For example, when comparing a file like /proc/cmdline (for which the linux +# kernel reports a st_size of 0 even though it is not an empty file) to a +# copy of that file's contents residing on a "normal" file system. + +. "${srcdir=.}/init.sh"; path_prepend_ ../src + +fail=0 + +# Skip the test unless we have an appropriate file. +boot=/proc/cmdline +test -f $boot || skip_ no $boot file +sz=$(stat --format %s $boot) || skip_ stat --format %s does not work +test $sz = 0 || skip_ $boot has nonzero size + +# There are two code paths to test: one for non-binary and one for binary files. +# $boot is non-binary. +cat $boot > ref || framework_failure_ +diff --brief $boot ref > out 2>&1 || fail=1 +compare /dev/null out || fail=1 + +# /proc/self/cmdline is a NUL-terminated list of argv values, +# so construct the expected output here: +printf 'diff\0--brief\0/proc/self/cmdline\0bin\0' > bin || framework_failure_ +# And run the command that is embedded in that output: +diff --brief /proc/self/cmdline bin > out 2>&1 || fail=1 +compare /dev/null out || fail=1 + +Exit $fail -- 2.6.2
