Preparing for the release of grep-2.27, I was surprised to find inputs
that triggered an infinite loop.

With yesterday's pre-release snapshot
(http://meyering.net/grep/grep-2.26.37-03b4.tar.xz), this would not
terminate:

  echo cx | LC_ALL=C src/grep -E 'c\b[x ]'

I traced it to a recent change in gnulib's dfa.c and then banged my
head against that code until I understood enough to fix it. I've
attached the gnulib bug fix and the grep patch that adds a test case
and pulls in the gnulib fix.

I'll make a new snapshot later today.
From ca3ca77fc9e497ff5048a9f1ee0d1acc8e050f51 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyer...@fb.com>
Date: Tue, 29 Nov 2016 10:45:46 -0800
Subject: [PATCH] dfa: avoid new infinite loop

This would infloop: echo cx | LC_ALL=C grep -E 'c\b[x ]'
* lib/dfa.c (dfastate): When constructing a new state table, we could
initially declare that we had found a match, and later find that
constraints eliminate that possibility, yet continue to use the
now stale "matched" indicator.  That would lead to an infinite loop.
The solution is to update "matched" when necessary.
Introduced by commit v0.1-983-g403adf1.
---
 ChangeLog | 11 +++++++++++
 lib/dfa.c |  6 ++++++
 2 files changed, 17 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index fd062ae..3b18083 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2016-11-29  Jim Meyering  <meyer...@fb.com>
+
+       dfa: avoid new infinite loop
+       This would infloop: echo cx | LC_ALL=C grep -E 'c\b[x ]'
+       * lib/dfa.c (dfastate): When constructing a new state table, we could
+       initially declare that we had found a match, and later find that
+       constraints eliminate that possibility, yet continue to use the
+       now stale "matched" indicator.  That would lead to an infinite loop.
+       The solution is to update "matched" when necessary.
+       Introduced by commit v0.1-983-g403adf1.
+
 2016-11-27  Norihiro Tanaka <nori...@kcn.ne.jp>

         dfa: avoid match middle in multibyte character
diff --git a/lib/dfa.c b/lib/dfa.c
index 673ef95..0412b2c 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2609,6 +2609,12 @@ dfastate (state_num s, struct dfa *d, unsigned char uc, 
state_num trans[])
             continue;
           if (j == CHARCLASS_WORDS)
             continue;
+
+          /* If we have reset the bit that made us declare "matched", reset
+             that indicator, too.  This is required to avoid an infinite loop
+             with this command: echo cx | LC_ALL=C grep -E 'c\b[x ]'  */
+          if (!tstbit (uc, matches))
+            matched = false;
         }

 #ifdef DEBUG
-- 
2.9.3

From ae3f57c99c3c7b89cc15448a3365e71795a1a21b Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyer...@fb.com>
Date: Tue, 29 Nov 2016 10:55:30 -0800
Subject: [PATCH] grep: fix DFA-induced infloop

* gnulib: Update to latest, for the DFA infloop fix.
* tests/dfa-infloop: New test, to trigger an infinite loop
in the DFA matcher.
* tests/Makefile.am (TESTS): Add it.
---
 gnulib            |  2 +-
 tests/Makefile.am |  1 +
 tests/dfa-infloop | 12 ++++++++++++
 3 files changed, 14 insertions(+), 1 deletion(-)
 create mode 100755 tests/dfa-infloop

diff --git a/gnulib b/gnulib
index 9cba42f..ca3ca77 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 9cba42f87e1e88ac746e2341c51e78f9f640fefa
+Subproject commit ca3ca77fc9e497ff5048a9f1ee0d1acc8e050f51
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 442e85a..3ded7a7 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -85,6 +85,7 @@ TESTS =                                               \
   count-newline                                        \
   dfa-coverage                                 \
   dfa-heap-overrun                             \
+  dfa-infloop                                  \
   dfaexec-multibyte                            \
   empty                                                \
   empty-line                                   \
diff --git a/tests/dfa-infloop b/tests/dfa-infloop
new file mode 100755
index 0000000..e35eef5
--- /dev/null
+++ b/tests/dfa-infloop
@@ -0,0 +1,12 @@
+#!/bin/sh
+# This would infloop for some unreleased versions between 2.26 and 2.27
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+require_timeout_
+
+fail=0
+
+echo cx > in || framework_failure_
+returns_ 1 timeout 10 env LC_ALL=C grep -E 'c\b[x ]' in || fail=1
+
+Exit $fail
-- 
2.9.3

Reply via email to