On Fri, Aug 12, 2016 at 9:35 AM, Jim Meyering <[email protected]> wrote:
> On Fri, Aug 12, 2016 at 7:22 AM, Bastian Beischer
> <[email protected]> wrote:
>> Hello,
>>
>> Please try to run diff3 from diffutils 3.4 as follows:
>>
>> echo a > a.txt
>> echo b > b.txt
>> echo c > c.txt
>> diff3 a.txt b.txt c.txt
...
>> There was only one commit in src/diff3.c between 3.3 and 3.4:
>>
>> http://git.savannah.gnu.org/cgit/diffutils.git/commit/src?id=3b74a905c5460e7979c53273ac90345860d001a7
>>
>> Reverting this commit fixes the issue.
>
> Yikes.
> Thank you for the report.
> I confirm that that patch is erroneous, and will prepare a complete
> fix (adding your test and a NEWS entry) shortly.
> Looks like diffutils-3.5 will have to be released pretty soon.

I've reverted that, added your test case and updated NEWS with a
commit in your name. Please review that first commit, in the attached.

    diff3: fix heap use-after-free; add minimal diff3 test coverage

I have also made two other changes:

      maint: require that commit messages be of a certain form
      diff3: fix leaks, for real

The first of those enforces some modicum of sanity on commit log
messages, and might have prevented the offending commit. This is
largely copied from what I did for coreutils.

The final change actually does plug diff3's leaks. The minimal test
cases did not exercise the code to handle multiple diff blocks, so I
added one that does. One can argue that explicitly freeing some of
those buffers is not required, so I have made it so that nearly all of
the code added by this fix can be compiled out.
From 88d911dbc717494febee4b0ebc790808054fefff Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Sat, 13 Aug 2016 19:40:20 -0700
Subject: [PATCH 1/4] build: ignore texinfo build artifacts

* .gitignore: Ignore texinfo artifacts in doc/.
---
 .gitignore | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index bf11312..3fecace 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,10 +15,14 @@
 /config.log
 /config.status
 /configure
-/diffutils-*.tar.gz
 /diffutils-*.tar.xz
 /doc/.gitignore
+/doc/diffutils.aux
+/doc/diffutils.cp
+/doc/diffutils.cps
 /doc/diffutils.info
+/doc/diffutils.log
+/doc/diffutils.toc
 /doc/stamp-vti
 /doc/version.texi
 /gnulib-tests
-- 
2.7.4


From 1a0df4396ebe3b9a58b882bb976cfce3f50d3cac Mon Sep 17 00:00:00 2001
From: Bastian Beischer <[email protected]>
Date: Sat, 13 Aug 2016 18:53:36 -0700
Subject: [PATCH 2/4] diff3: fix heap use-after-free; add minimal diff3 test
 coverage

Commit v3.3-42-g3b74a90, "FIXME: src/diff3: plug a leak" added an
invalid use of free, leading to use-after-free in nearly any invocation
of diff3.  Revert that commit.
* NEWS (Bug fixes): Mention it.
* tests/diff3: New file, to add minimal test coverage.
* tests/Makefile.am (TESTS): Add it.
Reported by Bastian Beischer in http://bugs.gnu.org/24210
---
 NEWS              |  3 +++
 src/diff3.c       |  1 -
 tests/Makefile.am |  1 +
 tests/diff3       | 30 ++++++++++++++++++++++++++++++
 4 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 tests/diff3

diff --git a/NEWS b/NEWS
index 9a8d2e1..73ff83b 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,9 @@ GNU diffutils NEWS                                    -*- 
outline -*-

 ** Bug fixes

+  diff3 no longer malfunctions due to use-after-free
+  [bug introduced in 3.4]
+
   diff --color no longer colorizes when TERM=dumb


diff --git a/src/diff3.c b/src/diff3.c
index 6ef90f4..0eb643e 100644
--- a/src/diff3.c
+++ b/src/diff3.c
@@ -1039,7 +1039,6 @@ process_diff (char const *filea,

   *block_list_end = NULL;
   *last_block = bptr;
-  free (diff_contents);
   return block_list;
 }

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 77d9fc3..66e25a5 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -6,6 +6,7 @@ TESTS = \
   binary \
   brief-vs-stat-zero-kernel-lies \
   colliding-file-names \
+  diff3 \
   excess-slash \
   help-version \
   function-line-vs-leading-space \
diff --git a/tests/diff3 b/tests/diff3
new file mode 100644
index 0000000..a1fc287
--- /dev/null
+++ b/tests/diff3
@@ -0,0 +1,30 @@
+#!/bin/sh
+# This would malfunction in diff-3.4
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+echo a > a || framework_failure_
+echo b > b || framework_failure_
+echo c > c || framework_failure_
+cat <<'EOF' > exp || framework_failure_
+====
+1:1c
+  a
+2:1c
+  b
+3:1c
+  c
+EOF
+
+fail=0
+
+diff3 a b c > out 2> err || fail=1
+compare exp out || fail=1
+compare /dev/null err || fail=1
+
+# Repeat, but with all three files the same:
+diff3 a a a > out 2> err || fail=1
+compare /dev/null out || fail=1
+compare /dev/null err || fail=1
+
+Exit $fail
-- 
2.7.4


From b3def738f3b435cbe6f2a8406bae5a71175a0b80 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Fri, 12 Aug 2016 11:17:26 -0700
Subject: [PATCH 3/4] maint: require that commit messages be of a certain form

* bootstrap.conf (bootstrap_epilogue): Merge from coreutils, so that
a local commit hook will now help enforce consistent commit messages.
* Makefile.am (check-git-hook-script-sync): New rule, largely copied
from coreutils.
* scripts/git-hooks/commit-msg: New file, from coreutils, but
with adapted list of program names.
* scripts/git-hooks/applypatch-msg: New file, from git.
* scripts/git-hooks/pre-applypatch: Likewise.
* scripts/git-hooks/pre-commit: Likewise.
---
 Makefile.am                      |  14 ++++
 bootstrap.conf                   |  25 +++++++
 scripts/git-hooks/applypatch-msg |  15 ++++
 scripts/git-hooks/commit-msg     | 153 +++++++++++++++++++++++++++++++++++++++
 scripts/git-hooks/pre-applypatch |  14 ++++
 scripts/git-hooks/pre-commit     |  49 +++++++++++++
 6 files changed, 270 insertions(+)
 create mode 100755 scripts/git-hooks/applypatch-msg
 create mode 100755 scripts/git-hooks/commit-msg
 create mode 100755 scripts/git-hooks/pre-applypatch
 create mode 100755 scripts/git-hooks/pre-commit

diff --git a/Makefile.am b/Makefile.am
index 244024e..edee5fa 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -42,3 +42,17 @@ gen-ChangeLog:
 ALL_RECURSIVE_TARGETS += distcheck-hook
 distcheck-hook:
        $(MAKE) my-distcheck
+
+# Some of our git hook scripts are supposed to be identical to git's samples.
+# See if they are still in sync.
+.PHONY: check-git-hook-script-sync
+check-git-hook-script-sync:
+       @fail=0;                                                        \
+       t=$$(mktemp -d)                                                 \
+         && cd $$t && git init -q && cd .git/hooks                     \
+         && for i in pre-commit pre-applypatch applypatch-msg; do      \
+              diff -u $(abs_top_srcdir)/scripts/git-hooks/$$i $$i.sample       
\
+                || fail=1;                                             \
+            done;                                                      \
+       rm -rf $$t;                                                     \
+       test $$fail = 0
diff --git a/bootstrap.conf b/bootstrap.conf
index 828ffef..ac3c278 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -129,4 +129,29 @@ bootstrap_post_import_hook()
 bootstrap_epilogue()
 {
   perl -pi -e "s/\@PACKAGE\@/$package/g" README-release
+
+  # Since this is a "GNU" package, replace this line
+  #   if LC_ALL=C grep 'GNU @PACKAGE@' $(top_srcdir)/* 2>/dev/null \
+  #      | grep -v 'libtool:' >/dev/null; then
+  # with this:
+  #   if true; then
+  # Why?  That pipeline searches all files in $(top_srcdir), and if you
+  # happen to have large files (or apparently large sparse files), the
+  # first grep may well run out of memory.
+  perl -pi -e 's/if LC_ALL=C grep .GNU .PACKAGE.*; then/if true; then/' \
+    po/Makefile.in.in
+
+  # Install our git hooks, as long as "cp" accepts the --backup option,
+  # so that we can back up any existing files.
+  case $(cp --help) in *--backup*) backup=1;; *) backup=0;; esac
+  if test $backup = 1; then
+    hooks=$(cd scripts/git-hooks && git ls-files)
+    for f in $hooks; do
+      # If it is identical, skip it.
+      cmp scripts/git-hooks/$f .git/hooks/$f > /dev/null \
+        && continue
+      cp --backup=numbered scripts/git-hooks/$f .git/hooks
+      chmod a-w .git/hooks/$f
+    done
+  fi
 }
diff --git a/scripts/git-hooks/applypatch-msg b/scripts/git-hooks/applypatch-msg
new file mode 100755
index 0000000..a5d7b84
--- /dev/null
+++ b/scripts/git-hooks/applypatch-msg
@@ -0,0 +1,15 @@
+#!/bin/sh
+#
+# An example hook script to check the commit log message taken by
+# applypatch from an e-mail message.
+#
+# The hook should exit with non-zero status after issuing an
+# appropriate message if it wants to stop the commit.  The hook is
+# allowed to edit the commit message file.
+#
+# To enable this hook, rename this file to "applypatch-msg".
+
+. git-sh-setup
+commitmsg="$(git rev-parse --git-path hooks/commit-msg)"
+test -x "$commitmsg" && exec "$commitmsg" ${1+"$@"}
+:
diff --git a/scripts/git-hooks/commit-msg b/scripts/git-hooks/commit-msg
new file mode 100755
index 0000000..888b83b
--- /dev/null
+++ b/scripts/git-hooks/commit-msg
@@ -0,0 +1,153 @@
+eval '(exit $?0)' && eval 'exec perl -w "$0" ${1+"$@"}'
+  & eval 'exec perl -w "$0" $argv:q'
+    if 0;
+
+use strict;
+use warnings;
+(my $ME = $0) =~ s|.*/||;
+
+# Emulate Git's choice of the editor for the commit message.
+chomp (my $editor = `git var GIT_EDITOR`);
+# And have a sane, minimal fallback in case of weird failures.
+$editor = "vi" if $? != 0 or $editor =~ /^\s*\z/;
+
+# Keywords allowed before the colon on the first line of a commit message:
+# program names and a few general category names.
+my @valid = qw(
+    diff diff3 cmp sdiff
+
+    gnulib tests maint doc build scripts
+    );
+my $v_or = join '|', @valid;
+my $valid_regex = qr/^(?:$v_or)$/;
+
+# Rewrite the $LOG_FILE (old contents in @$LINE_REF) with an additional
+# a commented diagnostic "# $ERR" line at the top.
+sub rewrite($$$)
+{
+  my ($log_file, $err, $line_ref) = @_;
+  local *LOG;
+  open LOG, '>', $log_file
+    or die "$ME: $log_file: failed to open for writing: $!";
+  print LOG "# $err";
+  print LOG @$line_ref;
+  close LOG
+    or die "$ME: $log_file: failed to rewrite: $!\n";
+}
+
+sub re_edit($)
+{
+  my ($log_file) = @_;
+
+  warn "Interrupt (Ctrl-C) to abort...\n";
+
+  system 'sh', '-c', "$editor $log_file";
+  ($? & 127) || ($? >> 8)
+    and die "$ME: $log_file: the editor ($editor) failed, aborting\n";
+}
+
+sub bad_first_line($)
+{
+  my ($line) = @_;
+
+  $line =~ /^[Vv]ersion \d/
+    and return '';
+
+  $line =~ /:/
+    or return 'missing colon on first line of log message';
+
+  $line =~ /\.$/
+    and return 'do not use a period "." at the end of the first line';
+
+  # The token(s) before the colon on the first line must be on our list
+  # Tokens may be space- or comma-separated.
+  (my $pre_colon = $line) =~ s/:.*//;
+  my @word = split (/[ ,]/, $pre_colon);
+  my @bad = grep !/$valid_regex/, @word;
+  @bad
+    and return 'invalid first word(s) of summary line: ' . join (', ', @bad);
+
+  return '';
+}
+
+# Given a $LOG_FILE name and a \@LINE buffer,
+# read the contents of the file into the buffer and analyze it.
+# If the log message passes muster, return the empty string.
+# If not, return a diagnostic.
+sub check_msg($$)
+{
+  my ($log_file, $line_ref) = @_;
+
+  local *LOG;
+  open LOG, '<', $log_file
+    or return "failed to open for reading: $!";
+  @$line_ref = <LOG>;
+  close LOG;
+
+  my @line = @$line_ref;
+  chomp @line;
+
+  # Don't filter out blank or comment lines; git does that already,
+  # and if we were to ignore them here, it could lead to committing
+  # with lines that start with "#" in the log.
+
+  # Filter out leading blank and comment lines.
+  # while (@line && $line[0] =~ /^(?:#.*|[ \t]*)$/) { shift @line; }
+
+  # Filter out blank and comment lines at EOF.
+  # while (@line && $line[$#line] =~ /^(?:#.*|[ \t]*)$/) { pop @line; }
+
+  @line == 0
+    and return 'no log message';
+
+  my $bad = bad_first_line $line[0];
+  $bad
+    and return $bad;
+
+  # Second line should be blank or not present.
+  2 <= @line && length $line[1]
+    and return 'second line must be empty';
+
+  # Limit line length to allow for the ChangeLog's leading TAB.
+  foreach my $line (@line)
+    {
+      72 < length $line && $line =~ /^[^#]/
+        and return 'line longer than 72';
+    }
+
+  my $buf = join ("\n", @line) . "\n";
+  $buf =~ m!https?://bugzilla\.redhat\.com/show_bug\.cgi\?id=(\d+)!s
+    and return "use shorter http://bugzilla.redhat.com/$1";;
+
+  $buf =~ m!https?://debbugs\.gnu\.org/(?:cgi/bugreport\.cgi\?bug=)?(\d+)!s
+    and return "use shorter http://bugs.gnu.org/$1";;
+
+  $buf =~ /^ *Signed-off-by:/mi
+    and return q(do not use "Signed-off-by:");
+
+  return '';
+}
+
+{
+  @ARGV == 1
+    or die;
+
+  my $log_file = $ARGV[0];
+
+  while (1)
+    {
+      my @line;
+      my $err = check_msg $log_file, \@line;
+      $err eq ''
+        and last;
+      $err = "$ME: $err\n";
+      warn $err;
+      # Insert the diagnostic as a comment on the first line of $log_file.
+      rewrite $log_file, $err, \@line;
+      re_edit $log_file;
+
+      # Stop if our parent is killed.
+      getppid() == 1
+        and last;
+    }
+}
diff --git a/scripts/git-hooks/pre-applypatch b/scripts/git-hooks/pre-applypatch
new file mode 100755
index 0000000..4142082
--- /dev/null
+++ b/scripts/git-hooks/pre-applypatch
@@ -0,0 +1,14 @@
+#!/bin/sh
+#
+# An example hook script to verify what is about to be committed
+# by applypatch from an e-mail message.
+#
+# The hook should exit with non-zero status after issuing an
+# appropriate message if it wants to stop the commit.
+#
+# To enable this hook, rename this file to "pre-applypatch".
+
+. git-sh-setup
+precommit="$(git rev-parse --git-path hooks/pre-commit)"
+test -x "$precommit" && exec "$precommit" ${1+"$@"}
+:
diff --git a/scripts/git-hooks/pre-commit b/scripts/git-hooks/pre-commit
new file mode 100755
index 0000000..68d62d5
--- /dev/null
+++ b/scripts/git-hooks/pre-commit
@@ -0,0 +1,49 @@
+#!/bin/sh
+#
+# An example hook script to verify what is about to be committed.
+# Called by "git commit" with no arguments.  The hook should
+# exit with non-zero status after issuing an appropriate message if
+# it wants to stop the commit.
+#
+# To enable this hook, rename this file to "pre-commit".
+
+if git rev-parse --verify HEAD >/dev/null 2>&1
+then
+       against=HEAD
+else
+       # Initial commit: diff against an empty tree object
+       against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+fi
+
+# If you want to allow non-ASCII filenames set this variable to true.
+allownonascii=$(git config --bool hooks.allownonascii)
+
+# Redirect output to stderr.
+exec 1>&2
+
+# Cross platform projects tend to avoid non-ASCII filenames; prevent
+# them from being added to the repository. We exploit the fact that the
+# printable range starts at the space character and ends with tilde.
+if [ "$allownonascii" != "true" ] &&
+       # Note that the use of brackets around a tr range is ok here, (it's
+       # even required, for portability to Solaris 10's /usr/bin/tr), since
+       # the square bracket bytes happen to fall in the designated range.
+       test $(git diff --cached --name-only --diff-filter=A -z $against |
+         LC_ALL=C tr -d '[ -~]\0' | wc -c) != 0
+then
+       cat <<\EOF
+Error: Attempt to add a non-ASCII file name.
+
+This can cause problems if you want to work with people on other platforms.
+
+To be portable it is advisable to rename the file.
+
+If you know what you are doing you can disable this check using:
+
+  git config hooks.allownonascii true
+EOF
+       exit 1
+fi
+
+# If there are whitespace errors, print the offending file names and fail.
+exec git diff-index --check --cached $against --
-- 
2.7.4


From edd942ca27d570a33d612b12eecaa33a76640e46 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Fri, 12 Aug 2016 21:40:29 -0700
Subject: [PATCH 4/4] diff3: fix leaks, for real

* src/diff3.c (struct diff_block)[lint]: Add member, n2.
(free_diff_block, next_to_n2): New functions.
* tests/diff3: Add more test coverage.
---
 src/diff3.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++----
 tests/diff3 | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+), 4 deletions(-)

diff --git a/src/diff3.c b/src/diff3.c
index 0eb643e..b80aeb3 100644
--- a/src/diff3.c
+++ b/src/diff3.c
@@ -78,6 +78,9 @@ struct diff_block {
   char **lines[2];             /* The actual lines (may contain nulls) */
   size_t *lengths[2];          /* Line lengths (including newlines, if any) */
   struct diff_block *next;
+#ifdef lint
+  struct diff_block *n2;       /* Used only when freeing.  */
+#endif
 };

 /* Three way diff */
@@ -176,7 +179,7 @@ static struct diff3_block *create_diff3_block (lin, lin, 
lin, lin, lin, lin);
 static struct diff3_block *make_3way_diff (struct diff_block *, struct 
diff_block *);
 static struct diff3_block *reverse_diff3_blocklist (struct diff3_block *);
 static struct diff3_block *using_to_diff3_block (struct diff_block *[2], 
struct diff_block *[2], int, int, struct diff3_block const *);
-static struct diff_block *process_diff (char const *, char const *, struct 
diff_block **);
+static struct diff_block *process_diff (char const *, char const *, struct 
diff_block **, char **);
 static void check_stdout (void);
 static void fatal (char const *) __attribute__((noreturn));
 static void output_diff3 (FILE *, struct diff3_block *, int const[3], int 
const[3]);
@@ -212,6 +215,38 @@ static struct option const longopts[] =
   {0, 0, 0, 0}
 };

+static void
+free_diff_block (struct diff_block *p)
+{
+#ifndef lint
+  (void)p;
+#else
+  while (p)
+    {
+      free (p->lines[0]);
+      free (p->lines[1]);
+      free (p->lengths[0]);
+      free (p->lengths[1]);
+      struct diff_block *next = p->n2;
+      free (p);
+      p = next;
+    }
+#endif
+}
+
+/* Copy each next pointer to n2, since make_3way_diff would clobber the former,
+   yet we will still need something to free these buffers.  */
+static void
+next_to_n2 (struct diff_block *p)
+{
+#ifndef lint
+  (void)p;
+#else
+  while (p)
+    p = p->n2 = p->next;
+#endif
+}
+
 int
 main (int argc, char **argv)
 {
@@ -377,10 +412,19 @@ main (int argc, char **argv)
   /* Invoke diff twice on two pairs of input files, combine the two
      diffs, and output them.  */

+  char *b0, *b1;
   commonname = file[rev_mapping[FILEC]];
-  thread1 = process_diff (file[rev_mapping[FILE1]], commonname, &last_block);
-  thread0 = process_diff (file[rev_mapping[FILE0]], commonname, &last_block);
+  thread1 = process_diff (file[rev_mapping[FILE1]], commonname, &last_block, 
&b1);
+  thread0 = process_diff (file[rev_mapping[FILE0]], commonname, &last_block, 
&b0);
+
+  next_to_n2 (thread0);
+  next_to_n2 (thread1);
+
   diff3 = make_3way_diff (thread0, thread1);
+
+  free_diff_block (thread0);
+  free_diff_block (thread1);
+
   if (edscript)
     conflicts_found
       = output_diff3_edscript (stdout, diff3, mapping, rev_mapping,
@@ -400,6 +444,8 @@ main (int argc, char **argv)
       conflicts_found = false;
     }

+  free (b0);
+  free (b1);
   check_stdout ();
   exit (conflicts_found);
 }
@@ -938,7 +984,8 @@ compare_line_list (char * const list1[], size_t const 
lengths1[],
 static struct diff_block *
 process_diff (char const *filea,
              char const *fileb,
-             struct diff_block **last_block)
+             struct diff_block **last_block,
+             char **buf_to_free)
 {
   char *diff_contents;
   char *diff_limit;
@@ -953,6 +1000,7 @@ process_diff (char const *filea,
                                  sizeof *bptr->lengths[1]));

   diff_limit = read_diff (filea, fileb, &diff_contents);
+  *buf_to_free = diff_contents;
   scan_diff = diff_contents;

   while (scan_diff < diff_limit)
diff --git a/tests/diff3 b/tests/diff3
index a1fc287..a40631b 100644
--- a/tests/diff3
+++ b/tests/diff3
@@ -27,4 +27,70 @@ diff3 a a a > out 2> err || fail=1
 compare /dev/null out || fail=1
 compare /dev/null err || fail=1

+# This would have provoked a nontrivial leak prior to diffutils-3.5,
+# due to the nontrivial list of diff_block structs.
+seq 10 40|sed 's/1$/x/' > d || framework_failure_
+seq 10 40|sed 's/5$/y/' > e || framework_failure_
+seq 10 40|sed 's/8$/z/' > f || framework_failure_
+cat <<'EOF' > exp40 || framework_failure_
+====1
+1:2c
+  1x
+2:2c
+3:2c
+  11
+====2
+1:6c
+3:6c
+  15
+2:6c
+  1y
+====3
+1:9c
+2:9c
+  18
+3:9c
+  1z
+====1
+1:12c
+  2x
+2:12c
+3:12c
+  21
+====2
+1:16c
+3:16c
+  25
+2:16c
+  2y
+====3
+1:19c
+2:19c
+  28
+3:19c
+  2z
+====1
+1:22c
+  3x
+2:22c
+3:22c
+  31
+====2
+1:26c
+3:26c
+  35
+2:26c
+  3y
+====3
+1:29c
+2:29c
+  38
+3:29c
+  3z
+EOF
+
+diff3 d e f > out 2> err
+compare exp40 out || fail=1
+compare /dev/null err || fail=1
+
 Exit $fail
-- 
2.7.4

Reply via email to