CVSROOT:	/webcvs/grep
Module name:	grep
Changes by:	Jim Meyering <meyering>	20/01/02 18:18:45
Index: grep.html =================================================================== RCS file: /webcvs/grep/grep/manual/grep.html,v retrieving revision 1.29 retrieving revision 1.30 diff -u -b -r1.29 -r1.30 --- grep.html 30 Dec 2018 06:24:21 -0000 1.29 +++ grep.html 2 Jan 2020 23:18:43 -0000 1.30 @@ -2,7 +2,7 @@ <html> <!-- This manual is for grep, a pattern matching engine. -Copyright (C) 1999-2002, 2005, 2008-2018 Free Software Foundation, +Copyright (C) 1999-2002, 2005, 2008-2020 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document @@ -14,10 +14,10 @@ <!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ --> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> -<title>GNU Grep 3.3</title> +<title>GNU Grep 3.4</title> -<meta name="description" content="GNU Grep 3.3"> -<meta name="keywords" content="GNU Grep 3.3"> +<meta name="description" content="GNU Grep 3.4"> +<meta name="keywords" content="GNU Grep 3.4"> <meta name="resource-type" content="document"> <meta name="distribution" content="global"> <meta name="Generator" content="makeinfo"> @@ -58,7 +58,7 @@ </head> <body lang="en"> -<h1 class="settitle" align="center">GNU Grep 3.3</h1> +<h1 class="settitle" align="center">GNU Grep 3.4</h1> @@ -101,7 +101,7 @@ <li><a name="toc-Performance-1" href="#Performance">5 Performance</a></li> <li><a name="toc-Reporting-bugs" href="#Reporting-Bugs">6 Reporting bugs</a> <ul class="no-bullet"> - <li><a name="toc-Known-Bugs" href="#Known-Bugs">6.1 Known Bugs</a></li> + <li><a name="toc-Known-Bugs-1" href="#Known-Bugs">6.1 Known Bugs</a></li> </ul></li> <li><a name="toc-Copying-1" href="#Copying">7 Copying</a> <ul class="no-bullet"> @@ -123,11 +123,11 @@ <p><code>grep</code> prints lines that contain a match for one or more patterns. </p> -<p>This manual is for version 3.3 of GNU Grep. +<p>This manual is for version 3.4 of GNU Grep. 
</p> <p>This manual is for <code>grep</code>, a pattern matching engine. </p> -<p>Copyright © 1999-2002, 2005, 2008-2018 Free Software Foundation, +<p>Copyright © 1999–2002, 2005, 2008–2020 Free Software Foundation, Inc. </p> <blockquote> @@ -201,12 +201,12 @@ <pre class="example">grep [<var>option</var>...] [<var>patterns</var>] [<var>file</var>...] </pre></div> - <p>There can be zero or more <var>option</var> arguments, and zero or more <var>file</var> arguments. The <var>patterns</var> argument contains one or more patterns separated by newlines, and is omitted when patterns are given via the ‘<samp>-e <var>patterns</var></samp>’ or ‘<samp>-f <var>file</var></samp>’ -options. +options. Typically <var>patterns</var> should be quoted when +<code>grep</code> is used in a shell command. </p> <table class="menu" border="0" cellspacing="0"> <tr><td align="left" valign="top">• <a href="#Command_002dline-Options" accesskey="1">Command-line Options</a>:</td><td> </td><td align="left" valign="top">Short and long names, grouped by category. @@ -307,6 +307,8 @@ <var>patterns</var> separate each pattern from the next. If this option is used multiple times or is combined with the <samp>-f</samp> (<samp>--file</samp>) option, search for all patterns given. +Typically <var>patterns</var> should be quoted when <code>grep</code> is used +in a shell command. (<samp>-e</samp> is specified by POSIX.) </p> </dd> @@ -329,7 +331,8 @@ <a name="index-_002dy"></a> <a name="index-_002d_002dignore_002dcase"></a> <a name="index-case-insensitive-search"></a> -<p>Ignore case distinctions, so that characters that differ only in case +<p>Ignore case distinctions in patterns and input data, +so that characters that differ only in case match each other. Although this is straightforward when letters differ in case only via lowercase-uppercase pairs, the behavior is unspecified in other situations. For example, uppercase “S” has an @@ -346,6 +349,14 @@ (<samp>-i</samp> is specified by POSIX.) 
</p> </dd> +<dt><samp>--no-ignore-case</samp></dt> +<dd><a name="index-_002d_002dno_002dignore_002dcase"></a> +<p>Do not ignore case distinctions in patterns and input data. This is +the default. This option is useful for passing to shell scripts that +already use <samp>-i</samp>, in order to cancel its effects because the +two options override each other. +</p> +</dd> <dt><samp>-v</samp></dt> <dt><samp>--invert-match</samp></dt> <dd><a name="index-_002dv"></a> @@ -472,7 +483,7 @@ For example, the following shell script makes use of it: </p> <div class="example"> -<pre class="example">while grep -m 1 PATTERN +<pre class="example">while grep -m 1 'PATTERN' do echo xxxx done < FILE @@ -484,7 +495,7 @@ <div class="example"> <pre class="example"># This probably will not work. cat FILE | -while grep -m 1 PATTERN +while grep -m 1 'PATTERN' do echo xxxx done @@ -594,12 +605,12 @@ <dd><a name="index-_002d_002dlabel"></a> <a name="index-changing-name-of-standard-input"></a> <p>Display input actually coming from standard input -as input coming from file <var>LABEL</var>. This is -especially useful when implementing tools like -<code>zgrep</code>; e.g.: +as input coming from file <var>LABEL</var>. +This can be useful for commands that transform a file’s contents +before searching; e.g.: </p> <div class="example"> -<pre class="example">gzip -cd foo.gz | grep --label=foo -H something +<pre class="example">gzip -cd foo.gz | grep --label=foo -H 'some pattern' </pre></div> </dd> @@ -840,10 +851,11 @@ <a name="index-searching-directory-trees"></a> <p>Skip any command-line file with a name suffix that matches the pattern <var>glob</var>, using wildcard matching; a name suffix is either the whole -name, or any suffix starting after a ‘<samp>/</samp>’ and before a -non-‘<samp>/</samp>’. When searching recursively, skip any subfile whose base +name, or a trailing part that starts with a non-slash character +immediately after a slash (‘<samp>/</samp>’) in the name. 
+When searching recursively, skip any subfile whose base name matches <var>glob</var>; the base name is the part after the last -‘<samp>/</samp>’. A pattern can use +slash. A pattern can use ‘<samp>*</samp>’, ‘<samp>?</samp>’, and ‘<samp>[</samp>’...‘<samp>]</samp>’ as wildcards, and <code>\</code> to quote a wildcard or backslash character literally. </p> @@ -1809,6 +1821,8 @@ <samp>-e</samp> or from a file (‘<samp>-f <var>file</var></samp>’), back-references are local to each expression. </p> +<p>See <a href="#Known-Bugs">Known Bugs</a>, for some known problems with back-references. +</p> <hr> <a name="Basic-vs-Extended"></a> <div class="header"> @@ -1863,7 +1877,27 @@ The <samp>-i</samp> option causes <code>grep</code> to ignore case, causing it to match the line ‘<samp>Hello, world!</samp>’, which it would not otherwise match. -See <a href="#Invoking">Invoking</a>, for more details about +</p> +<p>Here is a more complex example session, +showing the location and contents of any line +containing ‘<samp>f</samp>’ and ending in ‘<samp>.c</samp>’, +within all files in the current directory whose names +contain ‘<samp>g</samp>’ and end in ‘<samp>.h</samp>’. +The <samp>-n</samp> option outputs line numbers, the <samp>--</samp> argument +treats any later arguments starting with ‘<samp>-</samp>’ as file names not +options, and the empty file <samp>/dev/null</samp> causes file names to be output +even if only one file name happens to be of the form ‘<samp>*g*.h</samp>’. +</p> +<div class="example"> +<pre class="example">$ <kbd>grep -n -- 'f.*\.c$' *g*.h /dev/null</kbd> +argmatch.h:1:/* definitions and prototypes for argmatch.c +</pre></div> + +<p>The only line that contains a match is line 1 of <samp>argmatch.h</samp>. +Note that the regular expression syntax used in the pattern differs +from the globbing syntax that the shell uses to match file names. +</p> +<p>See <a href="#Invoking">Invoking</a>, for more details about how to invoke <code>grep</code>. 
</p> <a name="index-using-grep_002c-Q_0026A"></a> @@ -1874,10 +1908,10 @@ <li> How can I list just the names of matching files? <div class="example"> -<pre class="example">grep -l 'main' *.c +<pre class="example">grep -l 'main' test-*.c </pre></div> -<p>lists the names of all C files in the current directory whose contents +<p>lists names of ‘<samp>test-*.c</samp>’ files in the current directory whose contents mention ‘<samp>main</samp>’. </p> </li><li> How do I search directories recursively? @@ -1889,42 +1923,51 @@ <p>searches for ‘<samp>hello</samp>’ in all files under the <samp>/home/gigi</samp> directory. For more control over which files are searched, -use <code>find</code>, <code>grep</code>, and <code>xargs</code>. +use <code>find</code> and <code>grep</code>. For example, the following command searches only C files: </p> <div class="example"> -<pre class="example">find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello' +<pre class="example">find /home/gigi -name '*.c' ! -type d \ + -exec grep -H 'hello' '{}' + </pre></div> <p>This differs from the command: </p> <div class="example"> -<pre class="example">grep -H 'hello' *.c +<pre class="example">grep -H 'hello' /home/gigi/*.c </pre></div> -<p>which merely looks for ‘<samp>hello</samp>’ in all files in the current -directory whose names end in ‘<samp>.c</samp>’. -The ‘<samp>find ...</samp>’ command line above is more similar to the command: +<p>which merely looks for ‘<samp>hello</samp>’ in non-hidden C files in +<samp>/home/gigi</samp> whose names end in ‘<samp>.c</samp>’. +The <code>find</code> command line above is more similar to the command: </p> <div class="example"> -<pre class="example">grep -rH --include='*.c' 'hello' /home/gigi +<pre class="example">grep -r --include='*.c' 'hello' /home/gigi </pre></div> -</li><li> What if a pattern has a leading ‘<samp>-</samp>’? +</li><li> What if a pattern or file has a leading ‘<samp>-</samp>’? 
<div class="example"> -<pre class="example">grep -e '--cut here--' * +<pre class="example">grep -- '--cut here--' * </pre></div> <p>searches for all lines matching ‘<samp>--cut here--</samp>’. -Without <samp>-e</samp>, +Without <samp>--</samp>, <code>grep</code> would attempt to parse ‘<samp>--cut here--</samp>’ as a list of -options. +options, and there would be similar problems with any file names +beginning with ‘<samp>-</samp>’. </p> +<p>Alternatively, you can prevent misinterpretation of leading ‘<samp>-</samp>’ +by using <samp>-e</samp> for patterns and leading ‘<samp>./</samp>’ for files: +</p> +<div class="example"> +<pre class="example">grep -e '--cut here--' ./* +</pre></div> + </li><li> Suppose I want to search for a whole word, not a part of a word? <div class="example"> -<pre class="example">grep -w 'hello' * +<pre class="example">grep -w 'hello' test*.log </pre></div> <p>searches only for instances of ‘<samp>hello</samp>’ that are entire words; @@ -1934,7 +1977,7 @@ For example: </p> <div class="example"> -<pre class="example">grep 'hello\>' * +<pre class="example">grep 'hello\>' test*.log </pre></div> <p>searches only for words ending in ‘<samp>hello</samp>’, so it matches the word @@ -1943,7 +1986,7 @@ </li><li> How do I output context around the matching lines? <div class="example"> -<pre class="example">grep -C 2 'hello' * +<pre class="example">grep -C 2 'hello' test*.log </pre></div> <p>prints two lines of context around each matching line. @@ -2017,7 +2060,7 @@ <p>The <code>grep</code> command searches for lines that contain strings that match a pattern. Every line contains the empty string, so an empty pattern causes <code>grep</code> to find a match on each line. It -is not the only such pattern: ‘<samp>^</samp>’, ‘<samp>$</samp>’, ‘<samp>.*</samp>’, and many +is not the only such pattern: ‘<samp>^</samp>’, ‘<samp>$</samp>’, and many other patterns cause <code>grep</code> to match every line. 
</p> <p>To match empty lines, use the pattern ‘<samp>^$</samp>’. To match blank @@ -2032,30 +2075,6 @@ <pre class="example">cat /etc/passwd | grep 'alain' - /etc/motd </pre></div> -</li><li> <a name="index-palindromes"></a> -How to express palindromes in a regular expression? - -<p>It can be done by using back-references; -for example, -a palindrome of 4 characters can be written with a BRE: -</p> -<div class="example"> -<pre class="example">grep -w -e '\(.\)\(.\).\2\1' file -</pre></div> - -<p>It matches the word “radar” or “civic.” -</p> -<p>Guglielmo Bondioni proposed a single RE -that finds all palindromes up to 19 characters long -using 9 subexpressions<!-- /@w --> and 9 <span class="nolinebreak">back-references</span><!-- /@w -->: -</p> -<div class="smallexample"> -<pre class="smallexample">grep -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file -</pre></div> - -<p>Note this is done by using GNU ERE extensions; -it might not be portable to other implementations of <code>grep</code>. -</p> </li><li> Why is this back-reference failing? <div class="example"> @@ -2208,7 +2227,18 @@ If you find a bug not listed there, please email it to <a href="mailto:bug-grep@gnu.org">bug-grep@gnu.org</a> to create a new bug report. </p> +<table class="menu" border="0" cellspacing="0"> +<tr><td align="left" valign="top">• <a href="#Known-Bugs" accesskey="1">Known Bugs</a>:</td><td> </td><td align="left" valign="top"> +</td></tr> +</table> + +<hr> <a name="Known-Bugs"></a> +<div class="header"> +<p> +Up: <a href="#Reporting-Bugs" accesskey="u" rel="up">Reporting Bugs</a> [<a href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="#Index" title="Index" rel="index">Index</a>]</p> +</div> +<a name="Known-Bugs-1"></a> <h3 class="section">6.1 Known Bugs</h3> <a name="index-Bugs_002c-known"></a> @@ -2218,7 +2248,17 @@ obscure regular expressions require exponential time and space, and may cause <code>grep</code> to run out of memory.
</p> -<p>Back-references are very slow, and may require exponential time. +<p>Back-references can greatly slow down matching, as they can generate +exponentially many matching possibilities that can consume both time +and memory to explore. Also, the POSIX specification for +back-references is at times unclear. Furthermore, many regular +expression implementations have back-reference bugs that can cause +programs to return incorrect answers or even crash, and fixing these +bugs has often been low-priority—for example, as of 2019 the GNU C +library bug database contained back-reference bugs 52, 10844, 11053, +and 25322, with little sign of forthcoming fixes. Luckily, +back-references are rarely useful and it should be little trouble to +avoid them in practical applications. </p> <hr> @@ -2852,6 +2892,7 @@ <tr><td></td><td valign="top"><a href="#index-_002d_002dline_002dregexp"><code>--line-regexp</code></a>:</td><td> </td><td valign="top"><a href="#Matching-Control">Matching Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-_002d_002dmax_002dcount"><code>--max-count</code></a>:</td><td> </td><td valign="top"><a href="#General-Output-Control">General Output Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-_002d_002dno_002dfilename"><code>--no-filename</code></a>:</td><td> </td><td valign="top"><a href="#Output-Line-Prefix-Control">Output Line Prefix Control</a></td></tr> +<tr><td></td><td valign="top"><a href="#index-_002d_002dno_002dignore_002dcase"><code>--no-ignore-case</code></a>:</td><td> </td><td valign="top"><a href="#Matching-Control">Matching Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-_002d_002dno_002dmessages"><code>--no-messages</code></a>:</td><td> </td><td valign="top"><a href="#General-Output-Control">General Output Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-_002d_002dnull"><code>--null</code></a>:</td><td> </td><td valign="top"><a 
href="#Output-Line-Prefix-Control">Output Line Prefix Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-_002d_002dnull_002ddata"><code>--null-data</code></a>:</td><td> </td><td valign="top"><a href="#Other-Options">Other Options</a></td></tr> @@ -2943,7 +2984,7 @@ <tr><td></td><td valign="top"><a href="#index-braces_002c-second-argument-omitted">braces, second argument omitted</a>:</td><td> </td><td valign="top"><a href="#Fundamental-Structure">Fundamental Structure</a></td></tr> <tr><td></td><td valign="top"><a href="#index-braces_002c-two-arguments">braces, two arguments</a>:</td><td> </td><td valign="top"><a href="#Fundamental-Structure">Fundamental Structure</a></td></tr> <tr><td></td><td valign="top"><a href="#index-bracket-expression">bracket expression</a>:</td><td> </td><td valign="top"><a href="#Character-Classes-and-Bracket-Expressions">Character Classes and Bracket Expressions</a></td></tr> -<tr><td></td><td valign="top"><a href="#index-Bugs_002c-known">Bugs, known</a>:</td><td> </td><td valign="top"><a href="#Reporting-Bugs">Reporting Bugs</a></td></tr> +<tr><td></td><td valign="top"><a href="#index-Bugs_002c-known">Bugs, known</a>:</td><td> </td><td valign="top"><a href="#Known-Bugs">Known Bugs</a></td></tr> <tr><td></td><td valign="top"><a href="#index-bugs_002c-reporting">bugs, reporting</a>:</td><td> </td><td valign="top"><a href="#Reporting-Bugs">Reporting Bugs</a></td></tr> <tr><td></td><td valign="top"><a href="#index-byte-offset">byte offset</a>:</td><td> </td><td valign="top"><a href="#Output-Line-Prefix-Control">Output Line Prefix Control</a></td></tr> <tr><td colspan="4"> <hr></td></tr> @@ -3069,7 +3110,6 @@ <tr><td></td><td valign="top"><a href="#index-option-delimiter">option delimiter</a>:</td><td> </td><td valign="top"><a href="#Other-Options">Other Options</a></td></tr> <tr><td colspan="4"> <hr></td></tr> <tr><th><a name="Index_cp_letter-P">P</a></th><td></td><td></td></tr> -<tr><td></td><td valign="top"><a 
href="#index-palindromes">palindromes</a>:</td><td> </td><td valign="top"><a href="#Usage">Usage</a></td></tr> <tr><td></td><td valign="top"><a href="#index-patterns-from-file">patterns from file</a>:</td><td> </td><td valign="top"><a href="#Matching-Control">Matching Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-patterns-option">patterns option</a>:</td><td> </td><td valign="top"><a href="#Matching-Control">Matching Control</a></td></tr> <tr><td></td><td valign="top"><a href="#index-performance">performance</a>:</td><td> </td><td valign="top"><a href="#Performance">Performance</a></td></tr>
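The revised FAQ entry in this diff, "What if a pattern or file has a leading '-'?", describes two ways to keep grep from reading a pattern or file name as an option. The following is a minimal shell sketch of both and is not part of the commit. The scratch directory and the file names -notes.txt and plain.txt are hypothetical, chosen only so that one file name begins with a hyphen.

```shell
#!/bin/sh
# Scratch directory and file names are hypothetical, chosen only for this sketch.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# One file whose name starts with "-", and one ordinary file.
printf '%s\n' 'before' '--cut here--' 'after' > ./-notes.txt
printf '%s\n' 'no marker here' > plain.txt

# "--" ends option parsing, so the pattern and "-notes.txt" are both taken literally.
with_dashes=$(grep -- '--cut here--' *)

# Alternative: -e introduces the pattern, and "./" keeps names from starting with "-".
with_e=$(grep -e '--cut here--' ./*)

printf '%s\n' "$with_dashes" "$with_e"

# Remove the scratch directory again.
cd / && rm -rf -- "$dir"
```

Because two file names are passed in each call, grep prefixes the matching line with its file name, so the two forms differ only in whether that prefix carries the leading ./.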
