On Fri, May 30, 2014 at 8:58 AM, Jim Meyering <[email protected]> wrote:
> On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <[email protected]> wrote:
>> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <[email protected]> wrote:
>>> With grep 2.18, the -m option would cause grep to stop reading input
>>> after printing the requested number of matching lines.  With version
>>> 2.19, grep reads the entire input before exiting.  Interestingly, grep
>>> does not read the entire input if the -c or -C0 options are added in
>>> addition to -m, and also when using -l or -q instead of -m.  I believe
>>> this is caused by commit 5122195.
>>
>> Thanks a lot for the report.  Just in time.
>> I confirm that it's a bug introduced in 2.19.
>> To test, run "seq 1000000 > million", then
>>  "strace -e read grep 0 million" first using grep-2.18
>> (shows just a few read syscalls), and then with 2.19,
>> which shows grep reading the entire million-line file.
>
> Correction: to reproduce, you'll have to insert -m1 in that grep command.
>
>> Here's an incomplete patch.  Obviously there's a lot more
>> to be added, including NEWS and a nontrivial test. This
>> was introduced by commit v2.18-140-g6f07900

This bears some explanation.  I've attached a more complete patch
(albeit still hastily composed, so I'll wait a few hours,
in case there's feedback)

Prior to grep-2.19, with --max-count=N, this first disjunct would
be true after the Nth match, because pending would be 0:

          if ((!outleft && !pending) || (nlines && done_on_match))
            goto finish_grep;

However, a seemingly unrelated change affected how "pending" is set:

      pending = out_quiet ? 0 : out_after;

We used to ensure that "out_after" was non-negative, because
default_context was always non-negative:

      if (out_after < 0)
        out_after = default_context;

But the recent context-related change invalidated that assumption:

      -  default_context = 0;
      +  default_context = -1;

Here's the patch:
From dae857d8d787c986761c37673b31a1aa0c7e74c2 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Fri, 30 May 2014 09:19:33 -0700
Subject: [PATCH] grep: fix --max-count=N (-m N) to stop reading after Nth
 match

With --max-count=N (-m N), grep is supposed to stop reading input
after it has found the Nth match.  However, a recent context-
related change made it so grep would always read to end of file.
* src/grep.c (prtext): Don't let a negative "out_after" value
make "pending" line count negative.
* tests/max-count-overread: New test, for this.
* tests/Makefile.am (TESTS): Add it.
* NEWS (Bug fixes): Mention it.
This bug was introduced by commit v2.18-140-g6f07900.
Reported by Marc Aldorasi in http://bugs.gnu.org/17640.
---
 NEWS                     |  3 +++
 src/grep.c               |  2 +-
 tests/Makefile.am        |  1 +
 tests/max-count-overread | 12 ++++++++++++
 4 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100755 tests/max-count-overread

diff --git a/NEWS b/NEWS
index 0caad22..0e9b2cb 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,9 @@ GNU grep NEWS                                    -*- outline -*-

 ** Bug fixes

+  grep --max-count=N FILE would no longer stop reading after the Nth match.
+  [bug introduced in grep-2.19]
+
   A command like echo aa|grep -E 'a(b$|c$)' would mistakenly
   report the input as a matched line.
   [bug introduced in grep-2.19]
diff --git a/src/grep.c b/src/grep.c
index acc08c7..7c0f8a8 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -1043,7 +1043,7 @@ prtext (char const *beg, char const *lim)
     }

   after_last_match = bufoffset - (buflim - p);
-  pending = out_quiet ? 0 : out_after;
+  pending = out_quiet ? 0 : MAX (0, out_after);
   used = true;
   outleft -= n;
 }
diff --git a/tests/Makefile.am b/tests/Makefile.am
index a38f074..31e2a81 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -73,6 +73,7 @@ TESTS =                                               \
   invalid-multibyte-infloop                    \
   khadafy                                      \
   long-line-vs-2GiB-read                       \
+  max-count-overread                           \
   max-count-vs-context                         \
   mb-non-UTF8-performance                      \
   multibyte-white-space                                \
diff --git a/tests/max-count-overread b/tests/max-count-overread
new file mode 100755
index 0000000..56bc331
--- /dev/null
+++ b/tests/max-count-overread
@@ -0,0 +1,12 @@
+#!/bin/sh
+# Ensure that -m1 stops reading after the first match.
+# In grep-2.19, yes x|grep -m1 x would infloop.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+require_timeout_
+
+fail=0
+
+yes x | timeout 3 grep -m1 x > out || fail=1
+
+Exit $fail
-- 
2.0.0.rc3

Reply via email to