On Fri, Aug 12, 2016 at 9:35 AM, Jim Meyering <[email protected]> wrote:
> On Fri, Aug 12, 2016 at 7:22 AM, Bastian Beischer
> <[email protected]> wrote:
>> Hello,
>>
>> Please try to run diff3 from diffutils 3.4 as follows:
>>
>> echo a > a.txt
>> echo b > b.txt
>> echo c > c.txt
>> diff3 a.txt b.txt c.txt
...
>> There was only one commit in src/diff3.c between 3.3 and 3.4:
>>
>> http://git.savannah.gnu.org/cgit/diffutils.git/commit/src?id=3b74a905c5460e7979c53273ac90345860d001a7
>>
>> Reverting this commit fixes the issue.
>
> Yikes.
> Thank you for the report.
> I confirm that that patch is erroneous, and will prepare a complete
> fix (adding your test and a NEWS entry) shortly.
> Looks like diffutils-3.5 will have to be released pretty soon.
I've reverted that, added your test case and updated NEWS with a
commit in your name. Please review that first commit, in the attached.
diff3: fix heap use-after-free; add minimal diff3 test coverage
I have also made two other changes:
maint: require that commit messages be of a certain form
diff3: fix leaks, for real
The first of those enforces some modicum of sanity on commit log
messages, and might have prevented the offending commit. This is
largely copied from what I did for coreutils.
The final change actually does plug diff3's leaks. The minimal test
cases did not exercise the code to handle multiple diff blocks, so I
added one that does. One can argue that explicitly freeing some of
those buffers is not required, so I have made it so that nearly all of
the code added by this fix can be compiled out.
From 88d911dbc717494febee4b0ebc790808054fefff Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Sat, 13 Aug 2016 19:40:20 -0700
Subject: [PATCH 1/4] build: ignore texinfo build artifacts
* .gitignore: Ignore texinfo artifacts in doc/.
---
.gitignore | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/.gitignore b/.gitignore
index bf11312..3fecace 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,10 +15,14 @@
/config.log
/config.status
/configure
-/diffutils-*.tar.gz
/diffutils-*.tar.xz
/doc/.gitignore
+/doc/diffutils.aux
+/doc/diffutils.cp
+/doc/diffutils.cps
/doc/diffutils.info
+/doc/diffutils.log
+/doc/diffutils.toc
/doc/stamp-vti
/doc/version.texi
/gnulib-tests
--
2.7.4
From 1a0df4396ebe3b9a58b882bb976cfce3f50d3cac Mon Sep 17 00:00:00 2001
From: Bastian Beischer <[email protected]>
Date: Sat, 13 Aug 2016 18:53:36 -0700
Subject: [PATCH 2/4] diff3: fix heap use-after-free; add minimal diff3 test
coverage
Commit v3.3-42-g3b74a90, "FIXME: src/diff3: plug a leak" added an
invalid use of free, leading to use-after-free in nearly any invocation
of diff3. Revert that commit.
* NEWS (Bug fixes): Mention it.
* tests/diff3: New file, to add minimal test coverage.
* tests/Makefile.am (TESTS): Add it.
Reported by Bastian Beischer in http://bugs.gnu.org/24210
---
NEWS | 3 +++
src/diff3.c | 1 -
tests/Makefile.am | 1 +
tests/diff3 | 30 ++++++++++++++++++++++++++++++
4 files changed, 34 insertions(+), 1 deletion(-)
create mode 100644 tests/diff3
diff --git a/NEWS b/NEWS
index 9a8d2e1..73ff83b 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,9 @@ GNU diffutils NEWS -*-
outline -*-
** Bug fixes
+ diff3 no longer malfunctions due to use-after-free
+ [bug introduced in 3.4]
+
diff --color no longer colorizes when TERM=dumb
diff --git a/src/diff3.c b/src/diff3.c
index 6ef90f4..0eb643e 100644
--- a/src/diff3.c
+++ b/src/diff3.c
@@ -1039,7 +1039,6 @@ process_diff (char const *filea,
*block_list_end = NULL;
*last_block = bptr;
- free (diff_contents);
return block_list;
}
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 77d9fc3..66e25a5 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -6,6 +6,7 @@ TESTS = \
binary \
brief-vs-stat-zero-kernel-lies \
colliding-file-names \
+ diff3 \
excess-slash \
help-version \
function-line-vs-leading-space \
diff --git a/tests/diff3 b/tests/diff3
new file mode 100644
index 0000000..a1fc287
--- /dev/null
+++ b/tests/diff3
@@ -0,0 +1,30 @@
+#!/bin/sh
+# This would malfunction in diff-3.4
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+echo a > a || framework_failure_
+echo b > b || framework_failure_
+echo c > c || framework_failure_
+cat <<'EOF' > exp || framework_failure_
+====
+1:1c
+ a
+2:1c
+ b
+3:1c
+ c
+EOF
+
+fail=0
+
+diff3 a b c > out 2> err || fail=1
+compare exp out || fail=1
+compare /dev/null err || fail=1
+
+# Repeat, but with all three files the same:
+diff3 a a a > out 2> err || fail=1
+compare /dev/null out || fail=1
+compare /dev/null err || fail=1
+
+Exit $fail
--
2.7.4
From b3def738f3b435cbe6f2a8406bae5a71175a0b80 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Fri, 12 Aug 2016 11:17:26 -0700
Subject: [PATCH 3/4] maint: require that commit messages be of a certain form
* bootstrap.conf (bootstrap_epilogue): Merge from coreutils, so that
a local commit hook will now help enforce consistent commit messages.
* Makefile.am (check-git-hook-script-sync): New rule, largely copied
from coreutils.
* scripts/git-hooks/commit-msg: New file, from coreutils, but
with adapted list of program names.
* scripts/git-hooks/applypatch-msg: New file, from git.
* scripts/git-hooks/pre-applypatch: Likewise.
* scripts/git-hooks/pre-commit: Likewise.
---
Makefile.am | 14 ++++
bootstrap.conf | 25 +++++++
scripts/git-hooks/applypatch-msg | 15 ++++
scripts/git-hooks/commit-msg | 153 +++++++++++++++++++++++++++++++++++++++
scripts/git-hooks/pre-applypatch | 14 ++++
scripts/git-hooks/pre-commit | 49 +++++++++++++
6 files changed, 270 insertions(+)
create mode 100755 scripts/git-hooks/applypatch-msg
create mode 100755 scripts/git-hooks/commit-msg
create mode 100755 scripts/git-hooks/pre-applypatch
create mode 100755 scripts/git-hooks/pre-commit
diff --git a/Makefile.am b/Makefile.am
index 244024e..edee5fa 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -42,3 +42,17 @@ gen-ChangeLog:
ALL_RECURSIVE_TARGETS += distcheck-hook
distcheck-hook:
$(MAKE) my-distcheck
+
+# Some of our git hook scripts are supposed to be identical to git's samples.
+# See if they are still in sync.
+.PHONY: check-git-hook-script-sync
+check-git-hook-script-sync:
+ @fail=0; \
+ t=$$(mktemp -d) \
+ && cd $$t && git init -q && cd .git/hooks \
+ && for i in pre-commit pre-applypatch applypatch-msg; do \
+ diff -u $(abs_top_srcdir)/scripts/git-hooks/$$i $$i.sample
\
+ || fail=1; \
+ done; \
+ rm -rf $$t; \
+ test $$fail = 0
diff --git a/bootstrap.conf b/bootstrap.conf
index 828ffef..ac3c278 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -129,4 +129,29 @@ bootstrap_post_import_hook()
bootstrap_epilogue()
{
perl -pi -e "s/\@PACKAGE\@/$package/g" README-release
+
+ # Since this is a "GNU" package, replace this line
+ # if LC_ALL=C grep 'GNU @PACKAGE@' $(top_srcdir)/* 2>/dev/null \
+ # | grep -v 'libtool:' >/dev/null; then
+ # with this:
+ # if true; then
+ # Why? That pipeline searches all files in $(top_srcdir), and if you
+ # happen to have large files (or apparently large sparse files), the
+ # first grep may well run out of memory.
+ perl -pi -e 's/if LC_ALL=C grep .GNU .PACKAGE.*; then/if true; then/' \
+ po/Makefile.in.in
+
+ # Install our git hooks, as long as "cp" accepts the --backup option,
+ # so that we can back up any existing files.
+ case $(cp --help) in *--backup*) backup=1;; *) backup=0;; esac
+ if test $backup = 1; then
+ hooks=$(cd scripts/git-hooks && git ls-files)
+ for f in $hooks; do
+ # If it is identical, skip it.
+ cmp scripts/git-hooks/$f .git/hooks/$f > /dev/null \
+ && continue
+ cp --backup=numbered scripts/git-hooks/$f .git/hooks
+ chmod a-w .git/hooks/$f
+ done
+ fi
}
diff --git a/scripts/git-hooks/applypatch-msg b/scripts/git-hooks/applypatch-msg
new file mode 100755
index 0000000..a5d7b84
--- /dev/null
+++ b/scripts/git-hooks/applypatch-msg
@@ -0,0 +1,15 @@
+#!/bin/sh
+#
+# An example hook script to check the commit log message taken by
+# applypatch from an e-mail message.
+#
+# The hook should exit with non-zero status after issuing an
+# appropriate message if it wants to stop the commit. The hook is
+# allowed to edit the commit message file.
+#
+# To enable this hook, rename this file to "applypatch-msg".
+
+. git-sh-setup
+commitmsg="$(git rev-parse --git-path hooks/commit-msg)"
+test -x "$commitmsg" && exec "$commitmsg" ${1+"$@"}
+:
diff --git a/scripts/git-hooks/commit-msg b/scripts/git-hooks/commit-msg
new file mode 100755
index 0000000..888b83b
--- /dev/null
+++ b/scripts/git-hooks/commit-msg
@@ -0,0 +1,153 @@
+eval '(exit $?0)' && eval 'exec perl -w "$0" ${1+"$@"}'
+ & eval 'exec perl -w "$0" $argv:q'
+ if 0;
+
+use strict;
+use warnings;
+(my $ME = $0) =~ s|.*/||;
+
+# Emulate Git's choice of the editor for the commit message.
+chomp (my $editor = `git var GIT_EDITOR`);
+# And have a sane, minimal fallback in case of weird failures.
+$editor = "vi" if $? != 0 or $editor =~ /^\s*\z/;
+
+# Keywords allowed before the colon on the first line of a commit message:
+# program names and a few general category names.
+my @valid = qw(
+ diff diff3 cmp sdiff
+
+ gnulib tests maint doc build scripts
+ );
+my $v_or = join '|', @valid;
+my $valid_regex = qr/^(?:$v_or)$/;
+
+# Rewrite the $LOG_FILE (old contents in @$LINE_REF) with an additional
+# a commented diagnostic "# $ERR" line at the top.
+sub rewrite($$$)
+{
+ my ($log_file, $err, $line_ref) = @_;
+ local *LOG;
+ open LOG, '>', $log_file
+ or die "$ME: $log_file: failed to open for writing: $!";
+ print LOG "# $err";
+ print LOG @$line_ref;
+ close LOG
+ or die "$ME: $log_file: failed to rewrite: $!\n";
+}
+
+sub re_edit($)
+{
+ my ($log_file) = @_;
+
+ warn "Interrupt (Ctrl-C) to abort...\n";
+
+ system 'sh', '-c', "$editor $log_file";
+ ($? & 127) || ($? >> 8)
+ and die "$ME: $log_file: the editor ($editor) failed, aborting\n";
+}
+
+sub bad_first_line($)
+{
+ my ($line) = @_;
+
+ $line =~ /^[Vv]ersion \d/
+ and return '';
+
+ $line =~ /:/
+ or return 'missing colon on first line of log message';
+
+ $line =~ /\.$/
+ and return 'do not use a period "." at the end of the first line';
+
+ # The token(s) before the colon on the first line must be on our list
+ # Tokens may be space- or comma-separated.
+ (my $pre_colon = $line) =~ s/:.*//;
+ my @word = split (/[ ,]/, $pre_colon);
+ my @bad = grep !/$valid_regex/, @word;
+ @bad
+ and return 'invalid first word(s) of summary line: ' . join (', ', @bad);
+
+ return '';
+}
+
+# Given a $LOG_FILE name and a \@LINE buffer,
+# read the contents of the file into the buffer and analyze it.
+# If the log message passes muster, return the empty string.
+# If not, return a diagnostic.
+sub check_msg($$)
+{
+ my ($log_file, $line_ref) = @_;
+
+ local *LOG;
+ open LOG, '<', $log_file
+ or return "failed to open for reading: $!";
+ @$line_ref = <LOG>;
+ close LOG;
+
+ my @line = @$line_ref;
+ chomp @line;
+
+ # Don't filter out blank or comment lines; git does that already,
+ # and if we were to ignore them here, it could lead to committing
+ # with lines that start with "#" in the log.
+
+ # Filter out leading blank and comment lines.
+ # while (@line && $line[0] =~ /^(?:#.*|[ \t]*)$/) { shift @line; }
+
+ # Filter out blank and comment lines at EOF.
+ # while (@line && $line[$#line] =~ /^(?:#.*|[ \t]*)$/) { pop @line; }
+
+ @line == 0
+ and return 'no log message';
+
+ my $bad = bad_first_line $line[0];
+ $bad
+ and return $bad;
+
+ # Second line should be blank or not present.
+ 2 <= @line && length $line[1]
+ and return 'second line must be empty';
+
+ # Limit line length to allow for the ChangeLog's leading TAB.
+ foreach my $line (@line)
+ {
+ 72 < length $line && $line =~ /^[^#]/
+ and return 'line longer than 72';
+ }
+
+ my $buf = join ("\n", @line) . "\n";
+ $buf =~ m!https?://bugzilla\.redhat\.com/show_bug\.cgi\?id=(\d+)!s
+ and return "use shorter http://bugzilla.redhat.com/$1";
+
+ $buf =~ m!https?://debbugs\.gnu\.org/(?:cgi/bugreport\.cgi\?bug=)?(\d+)!s
+ and return "use shorter http://bugs.gnu.org/$1";
+
+ $buf =~ /^ *Signed-off-by:/mi
+ and return q(do not use "Signed-off-by:");
+
+ return '';
+}
+
+{
+ @ARGV == 1
+ or die;
+
+ my $log_file = $ARGV[0];
+
+ while (1)
+ {
+ my @line;
+ my $err = check_msg $log_file, \@line;
+ $err eq ''
+ and last;
+ $err = "$ME: $err\n";
+ warn $err;
+ # Insert the diagnostic as a comment on the first line of $log_file.
+ rewrite $log_file, $err, \@line;
+ re_edit $log_file;
+
+ # Stop if our parent is killed.
+ getppid() == 1
+ and last;
+ }
+}
diff --git a/scripts/git-hooks/pre-applypatch b/scripts/git-hooks/pre-applypatch
new file mode 100755
index 0000000..4142082
--- /dev/null
+++ b/scripts/git-hooks/pre-applypatch
@@ -0,0 +1,14 @@
+#!/bin/sh
+#
+# An example hook script to verify what is about to be committed
+# by applypatch from an e-mail message.
+#
+# The hook should exit with non-zero status after issuing an
+# appropriate message if it wants to stop the commit.
+#
+# To enable this hook, rename this file to "pre-applypatch".
+
+. git-sh-setup
+precommit="$(git rev-parse --git-path hooks/pre-commit)"
+test -x "$precommit" && exec "$precommit" ${1+"$@"}
+:
diff --git a/scripts/git-hooks/pre-commit b/scripts/git-hooks/pre-commit
new file mode 100755
index 0000000..68d62d5
--- /dev/null
+++ b/scripts/git-hooks/pre-commit
@@ -0,0 +1,49 @@
+#!/bin/sh
+#
+# An example hook script to verify what is about to be committed.
+# Called by "git commit" with no arguments. The hook should
+# exit with non-zero status after issuing an appropriate message if
+# it wants to stop the commit.
+#
+# To enable this hook, rename this file to "pre-commit".
+
+if git rev-parse --verify HEAD >/dev/null 2>&1
+then
+ against=HEAD
+else
+ # Initial commit: diff against an empty tree object
+ against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+fi
+
+# If you want to allow non-ASCII filenames set this variable to true.
+allownonascii=$(git config --bool hooks.allownonascii)
+
+# Redirect output to stderr.
+exec 1>&2
+
+# Cross platform projects tend to avoid non-ASCII filenames; prevent
+# them from being added to the repository. We exploit the fact that the
+# printable range starts at the space character and ends with tilde.
+if [ "$allownonascii" != "true" ] &&
+ # Note that the use of brackets around a tr range is ok here, (it's
+ # even required, for portability to Solaris 10's /usr/bin/tr), since
+ # the square bracket bytes happen to fall in the designated range.
+ test $(git diff --cached --name-only --diff-filter=A -z $against |
+ LC_ALL=C tr -d '[ -~]\0' | wc -c) != 0
+then
+ cat <<\EOF
+Error: Attempt to add a non-ASCII file name.
+
+This can cause problems if you want to work with people on other platforms.
+
+To be portable it is advisable to rename the file.
+
+If you know what you are doing you can disable this check using:
+
+ git config hooks.allownonascii true
+EOF
+ exit 1
+fi
+
+# If there are whitespace errors, print the offending file names and fail.
+exec git diff-index --check --cached $against --
--
2.7.4
From edd942ca27d570a33d612b12eecaa33a76640e46 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Fri, 12 Aug 2016 21:40:29 -0700
Subject: [PATCH 4/4] diff3: fix leaks, for real
* src/diff3.c (struct diff_block)[lint]: Add member, n2.
(free_diff_block, next_to_n2): New functions.
* tests/diff3: Add more test coverage.
---
src/diff3.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++----
tests/diff3 | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 118 insertions(+), 4 deletions(-)
diff --git a/src/diff3.c b/src/diff3.c
index 0eb643e..b80aeb3 100644
--- a/src/diff3.c
+++ b/src/diff3.c
@@ -78,6 +78,9 @@ struct diff_block {
char **lines[2]; /* The actual lines (may contain nulls) */
size_t *lengths[2]; /* Line lengths (including newlines, if any) */
struct diff_block *next;
+#ifdef lint
+ struct diff_block *n2; /* Used only when freeing. */
+#endif
};
/* Three way diff */
@@ -176,7 +179,7 @@ static struct diff3_block *create_diff3_block (lin, lin,
lin, lin, lin, lin);
static struct diff3_block *make_3way_diff (struct diff_block *, struct
diff_block *);
static struct diff3_block *reverse_diff3_blocklist (struct diff3_block *);
static struct diff3_block *using_to_diff3_block (struct diff_block *[2],
struct diff_block *[2], int, int, struct diff3_block const *);
-static struct diff_block *process_diff (char const *, char const *, struct
diff_block **);
+static struct diff_block *process_diff (char const *, char const *, struct
diff_block **, char **);
static void check_stdout (void);
static void fatal (char const *) __attribute__((noreturn));
static void output_diff3 (FILE *, struct diff3_block *, int const[3], int
const[3]);
@@ -212,6 +215,38 @@ static struct option const longopts[] =
{0, 0, 0, 0}
};
+static void
+free_diff_block (struct diff_block *p)
+{
+#ifndef lint
+ (void)p;
+#else
+ while (p)
+ {
+ free (p->lines[0]);
+ free (p->lines[1]);
+ free (p->lengths[0]);
+ free (p->lengths[1]);
+ struct diff_block *next = p->n2;
+ free (p);
+ p = next;
+ }
+#endif
+}
+
+/* Copy each next pointer to n2, since make_3way_diff would clobber the former,
+ yet we will still need something to free these buffers. */
+static void
+next_to_n2 (struct diff_block *p)
+{
+#ifndef lint
+ (void)p;
+#else
+ while (p)
+ p = p->n2 = p->next;
+#endif
+}
+
int
main (int argc, char **argv)
{
@@ -377,10 +412,19 @@ main (int argc, char **argv)
/* Invoke diff twice on two pairs of input files, combine the two
diffs, and output them. */
+ char *b0, *b1;
commonname = file[rev_mapping[FILEC]];
- thread1 = process_diff (file[rev_mapping[FILE1]], commonname, &last_block);
- thread0 = process_diff (file[rev_mapping[FILE0]], commonname, &last_block);
+ thread1 = process_diff (file[rev_mapping[FILE1]], commonname, &last_block,
&b1);
+ thread0 = process_diff (file[rev_mapping[FILE0]], commonname, &last_block,
&b0);
+
+ next_to_n2 (thread0);
+ next_to_n2 (thread1);
+
diff3 = make_3way_diff (thread0, thread1);
+
+ free_diff_block (thread0);
+ free_diff_block (thread1);
+
if (edscript)
conflicts_found
= output_diff3_edscript (stdout, diff3, mapping, rev_mapping,
@@ -400,6 +444,8 @@ main (int argc, char **argv)
conflicts_found = false;
}
+ free (b0);
+ free (b1);
check_stdout ();
exit (conflicts_found);
}
@@ -938,7 +984,8 @@ compare_line_list (char * const list1[], size_t const
lengths1[],
static struct diff_block *
process_diff (char const *filea,
char const *fileb,
- struct diff_block **last_block)
+ struct diff_block **last_block,
+ char **buf_to_free)
{
char *diff_contents;
char *diff_limit;
@@ -953,6 +1000,7 @@ process_diff (char const *filea,
sizeof *bptr->lengths[1]));
diff_limit = read_diff (filea, fileb, &diff_contents);
+ *buf_to_free = diff_contents;
scan_diff = diff_contents;
while (scan_diff < diff_limit)
diff --git a/tests/diff3 b/tests/diff3
index a1fc287..a40631b 100644
--- a/tests/diff3
+++ b/tests/diff3
@@ -27,4 +27,70 @@ diff3 a a a > out 2> err || fail=1
compare /dev/null out || fail=1
compare /dev/null err || fail=1
+# This would have provoked a nontrivial leak prior to diffutils-3.5,
+# due to the nontrivial list of diff_block structs.
+seq 10 40|sed 's/1$/x/' > d || framework_failure_
+seq 10 40|sed 's/5$/y/' > e || framework_failure_
+seq 10 40|sed 's/8$/z/' > f || framework_failure_
+cat <<'EOF' > exp40 || framework_failure_
+====1
+1:2c
+ 1x
+2:2c
+3:2c
+ 11
+====2
+1:6c
+3:6c
+ 15
+2:6c
+ 1y
+====3
+1:9c
+2:9c
+ 18
+3:9c
+ 1z
+====1
+1:12c
+ 2x
+2:12c
+3:12c
+ 21
+====2
+1:16c
+3:16c
+ 25
+2:16c
+ 2y
+====3
+1:19c
+2:19c
+ 28
+3:19c
+ 2z
+====1
+1:22c
+ 3x
+2:22c
+3:22c
+ 31
+====2
+1:26c
+3:26c
+ 35
+2:26c
+ 3y
+====3
+1:29c
+2:29c
+ 38
+3:29c
+ 3z
+EOF
+
+diff3 d e f > out 2> err
+compare exp40 out || fail=1
+compare /dev/null err || fail=1
+
Exit $fail
--
2.7.4