On Fri, May 30, 2014 at 8:58 AM, Jim Meyering <[email protected]> wrote:
> On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <[email protected]> wrote:
>> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <[email protected]> wrote:
>>> With grep 2.18, the -m option would cause grep to stop reading input
>>> after printing the requested number of matching lines. With version
>>> 2.19, grep reads the entire input before exiting. Interestingly, grep
>>> does not read the entire input if the -c or -C0 options are added in
>>> addition to -m, and also when using -l or -q instead of -m. I believe
>>> this is caused by commit 5122195.
>>
>> Thanks a lot for the report. Just in time.
>> I confirm that it's a bug introduced in 2.19.
>> To test, run "seq 1000000 > million", then
>> "strace -e read grep 0 million" first using grep-2.18
>> (shows just a few read syscalls), and then with 2.19,
>> which shows grep reading the entire million-line file.
>
> Correction: to reproduce, you'll have to insert -m1 in that grep command.
>
>> Here's an incomplete patch. Obviously there's a lot more
>> to be added, including NEWS and a nontrivial test. This
>> was introduced by commit v2.18-140-g6f07900
This bears some explanation. I've attached a more complete patch
(albeit still hastily composed, so I'll wait a few hours,
in case there's feedback)
Prior to grep-2.19, with --max-count=N, this first disjunct would
be true after the Nth match, because pending would be 0:
if ((!outleft && !pending) || (nlines && done_on_match))
goto finish_grep;
However, a seemingly unrelated change affected how "pending" is set:
pending = out_quiet ? 0 : out_after;
We used to ensure that "out_after" was non-negative, because
default_context was always non-negative:
if (out_after < 0)
out_after = default_context;
But the recent context-related change invalidated that assumption:
- default_context = 0;
+ default_context = -1;
Here's the patch:
From dae857d8d787c986761c37673b31a1aa0c7e74c2 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Fri, 30 May 2014 09:19:33 -0700
Subject: [PATCH] grep: fix --max-count=N (-m N) to stop reading after Nth
match
With --max-count=N (-m N), grep is supposed to stop reading input
after it has found the Nth match. However, a recent context-
related change made it so grep would always read to end of file.
* src/grep.c (prtext): Don't let a negative "out_after" value
make "pending" line count negative.
* tests/max-count-overread: New test, for this.
* tests/Makefile.am (TESTS): Add it.
* NEWS (Bug fixes): Mention it.
This bug was introduced by commit v2.18-140-g6f07900.
Reported by Marc Aldorasi in http://bugs.gnu.org/17640.
---
NEWS | 3 +++
src/grep.c | 2 +-
tests/Makefile.am | 1 +
tests/max-count-overread | 12 ++++++++++++
4 files changed, 17 insertions(+), 1 deletion(-)
create mode 100755 tests/max-count-overread
diff --git a/NEWS b/NEWS
index 0caad22..0e9b2cb 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,9 @@ GNU grep NEWS -*- outline -*-
** Bug fixes
+ grep --max-count=N FILE would no longer stop reading after the Nth match.
+ [bug introduced in grep-2.19]
+
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly
report the input as a matched line.
[bug introduced in grep-2.19]
diff --git a/src/grep.c b/src/grep.c
index acc08c7..7c0f8a8 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -1043,7 +1043,7 @@ prtext (char const *beg, char const *lim)
}
after_last_match = bufoffset - (buflim - p);
- pending = out_quiet ? 0 : out_after;
+ pending = out_quiet ? 0 : MAX (0, out_after);
used = true;
outleft -= n;
}
diff --git a/tests/Makefile.am b/tests/Makefile.am
index a38f074..31e2a81 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -73,6 +73,7 @@ TESTS = \
invalid-multibyte-infloop \
khadafy \
long-line-vs-2GiB-read \
+ max-count-overread \
max-count-vs-context \
mb-non-UTF8-performance \
multibyte-white-space \
diff --git a/tests/max-count-overread b/tests/max-count-overread
new file mode 100755
index 0000000..56bc331
--- /dev/null
+++ b/tests/max-count-overread
@@ -0,0 +1,12 @@
+#!/bin/sh
+# Ensure that -m1 stops reading after the first match.
+# In grep-2.19, yes x|grep -m1 x would infloop.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+require_timeout_
+
+fail=0
+
+yes x | timeout 3 grep -m1 x > out || fail=1
+
+Exit $fail
--
2.0.0.rc3