On Sun, Apr 13, 2014 at 12:13 PM, Jim Meyering <[email protected]> wrote:
> On Fri, Apr 11, 2014 at 4:47 PM, damon <[email protected]> wrote:
>> Hi there -
>>
>> I recently noticed a bug after upgrading grep and have tracked it
>> through a few versions now.
>>
>> I was using grep -P (PCRE grep) in some scripts to grep through
>> a directory of files, and the process would keep aborting with a
>> segmentation fault.
>>
>> The last known good version is grep-2.14.  Every version after that has
>> failed in a slightly different way, making me think this could be a bug
>> in grep, not in pcre.
>>
>> I tried compiling greps 2.14 through 2.18 against the latest pcre
>> library, pcre-8.33.  Here's what happens when I try each version against
>> a random binary file, attached to this message as test-image.png.  This
>> file was just one of many that caused the errors, though not every
>> binary file does.
>>
>> Below are some results demonstrating what's going wrong.  Note that all
>> of these seem to work fine with regular grep or with grep -E.  Please
>> let me know what else I can do to help track this down!
>>
>> # grep-2.14/src/grep -P '\[.?max' test-image.png
>> (works, does not match)
> ...
>> # grep-2.18/src/grep -P '\[.?max' test-image.png
>> Segmentation fault
>>
>> # grep-2.18/src/grep -P '.?ma' test-image.png
>> Segmentation fault
>>
>> # grep-2.18/src/grep -P '.?m' test-image.png
>> Binary file test-image.png matches
>
> Thank you for the bug report.
> That is due to a bug in libpcre.  I've confirmed that it is still
> triggered even when using the latest grep.git linked with
> the latest from pcre.git (latest commit has "Final tidies for
> 8.35 release." as the subject).  I built grep as usual, and
> then ran this:
>
>   rm src/grep; make LIB_PCRE=$PWD/../pcre/.libs/libpcre.a
>
> Confirm that grep is not using a shared libpcre (this must print nothing):
>
>   ldd src/grep|grep pcre
>
> That presumes I had already built the latest pcre/ in ../pcre.
> Then, run this to test it with a non-UTF8 locale, and it is
> error-free, correctly finding no match:
>
>   LC_ALL=ja_JP.eucJP valgrind src/grep -P '\[.?max' test-image.png
>
> Repeat using a UTF8 locale, and you see that valgrind reports
> numerous buffer overrun and heap-use-after-free errors:
>
>   LC_ALL=en_US.utf8 valgrind src/grep -P '\[.?max' test-image.png
>
> Here is an equivalent but much smaller test case:
>
>   $ printf 'a\201b\r'|LC_ALL=en_US.utf8 valgrind src/grep -P 'a.?XXb'
>
> That segfaults.  Interestingly, if I replace each X with a ".",
> grep gets into an infinite loop within libpcre's match function.
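
The three outcomes described above (no match, segfault, infinite loop) can be told apart from the exit status alone. Below is a minimal sketch, assuming GNU coreutils `timeout` and a POSIX shell. The helper name `classify_status` is made up for illustration and is not part of grep or its test suite. The value 139 assumes SIGSEGV is signal 11, as on Linux.

```shell
#!/bin/sh
# Hypothetical helper: turn the exit status of `timeout N grep ...`
# into a short verdict.
classify_status() {
  case "$1" in
    0)   echo "match" ;;                              # grep selected a line
    1)   echo "no match" ;;                           # grep ran, nothing selected
    2)   echo "grep error" ;;                         # e.g. no -P support
    124) echo "timed out (possible infinite loop)" ;; # GNU timeout expired
    139) echo "segmentation fault" ;;                 # 128 + 11 (SIGSEGV), assumed
    *)   echo "unexpected status $1" ;;
  esac
}

# Same input bytes as the small test case: 'a', octal 201 (0x81), 'b', CR.
tmp=$(mktemp) || exit 1
printf 'a\201b\r' > "$tmp"

# If the en_US.utf8 locale is not installed, grep falls back to the C
# locale and the bug is unlikely to be exercised.
LC_ALL=en_US.utf8 timeout 3 grep -P 'a.?XXb' "$tmp"
classify_status $?

rm -f "$tmp"
```

With an unaffected libpcre this should report "no match", since the input contains no "X".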

FYI, I'm pushing the attached patch to add a test for this.
It fails with the latest pcre from git (8.35), but passes with Debian
unstable's libpcre3 8.31-3:

From c0384ddfb4806697973cafce01f51ff8a775d67f Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Sun, 13 Apr 2014 13:21:14 -0700
Subject: [PATCH] tests: detect an infloop-inducing bug in grep -P (pcre-8.35)

* tests/pcre-infloop: New test.
* tests/Makefile.am (TESTS): Add it.
---
 tests/Makefile.am  |  1 +
 tests/pcre-infloop | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)
 create mode 100755 tests/pcre-infloop

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 49d6cba..cc79903 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -83,6 +83,7 @@ TESTS =                                               \
   options                                      \
   pcre                                         \
   pcre-abort                                   \
+  pcre-infloop                                 \
   pcre-invalid-utf8-input                      \
   pcre-utf8                                    \
   pcre-w                                       \
diff --git a/tests/pcre-infloop b/tests/pcre-infloop
new file mode 100755
index 0000000..57b67ae
--- /dev/null
+++ b/tests/pcre-infloop
@@ -0,0 +1,33 @@
+#!/bin/sh
+# With some versions of libpcre, apparently including 8.35,
+# the following would trigger an infinite loop in its match function.
+
+# Copyright 2014 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+require_pcre_
+require_timeout_
+require_en_utf8_locale_
+require_compiled_in_MB_support
+
+printf 'a\201b\r' > in || framework_failure_
+
+fail=0
+
+LC_ALL=en_US.utf8 timeout 3 grep -P 'a.?..b' in
+test $? = 1 || fail_ "libpcre's match function appears to infloop"
+
+Exit $fail
-- 
1.9.2.459.g68773ac
