CVSROOT: /webcvs/grep Module name: grep Changes by: Jim Meyering <meyering> 25/04/11 13:06:09
Index: html_node/grep-Programs.html =================================================================== RCS file: /webcvs/grep/grep/manual/html_node/grep-Programs.html,v retrieving revision 1.35 retrieving revision 1.36 diff -u -b -r1.35 -r1.36 --- html_node/grep-Programs.html 13 May 2023 09:23:53 -0000 1.35 +++ html_node/grep-Programs.html 11 Apr 2025 17:06:08 -0000 1.36 @@ -1,11 +1,11 @@ <!DOCTYPE html> <html> -<!-- Created by GNU Texinfo 7.0dev, https://www.gnu.org/software/texinfo/ --> +<!-- Created by GNU Texinfo 7.1.1, https://www.gnu.org/software/texinfo/ --> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <!-- This manual is for grep, a pattern matching engine. -Copyright © 1999-2002, 2005, 2008-2023 Free Software Foundation, +Copyright © 1999-2002, 2005, 2008-2025 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document @@ -14,10 +14,10 @@ Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License". --> -<title>grep Programs (GNU Grep 3.11)</title> +<title>grep Programs (GNU Grep 3.12)</title> -<meta name="description" content="grep Programs (GNU Grep 3.11)"> -<meta name="keywords" content="grep Programs (GNU Grep 3.11)"> +<meta name="description" content="grep Programs (GNU Grep 3.12)"> +<meta name="keywords" content="grep Programs (GNU Grep 3.12)"> <meta name="resource-type" content="document"> <meta name="distribution" content="global"> <meta name="Generator" content="makeinfo"> @@ -63,54 +63,77 @@ controlled by the following options. 
</p> <dl class="table"> -<dt><samp class="option">-G</samp></dt> -<dt><samp class="option">--basic-regexp</samp></dt> -<dd><a class="index-entry-id" id="index-_002dG"></a> -<a class="index-entry-id" id="index-_002d_002dbasic_002dregexp"></a> +<dt><a class="index-entry-id" id="index-_002d_002dbasic_002dregexp"></a> <a class="index-entry-id" id="index-matching-basic-regular-expressions"></a> -<p>Interpret patterns as basic regular expressions (BREs). +<a id="index-_002dG"></a><span><samp class="option">-G</samp><a class="copiable-link" href="#index-_002dG"> ¶</a></span></dt> +<dt><samp class="option">--basic-regexp</samp></dt> +<dd><p>Interpret patterns as basic regular expressions (BREs). This is the default. </p> </dd> -<dt><samp class="option">-E</samp></dt> -<dt><samp class="option">--extended-regexp</samp></dt> -<dd><a class="index-entry-id" id="index-_002dE"></a> -<a class="index-entry-id" id="index-_002d_002dextended_002dregexp"></a> +<dt><a class="index-entry-id" id="index-_002d_002dextended_002dregexp"></a> <a class="index-entry-id" id="index-matching-extended-regular-expressions"></a> -<p>Interpret patterns as extended regular expressions (EREs). +<a id="index-_002dE"></a><span><samp class="option">-E</samp><a class="copiable-link" href="#index-_002dE"> ¶</a></span></dt> +<dt><samp class="option">--extended-regexp</samp></dt> +<dd><p>Interpret patterns as extended regular expressions (EREs). (<samp class="option">-E</samp> is specified by POSIX.) </p> </dd> -<dt><samp class="option">-F</samp></dt> -<dt><samp class="option">--fixed-strings</samp></dt> -<dd><a class="index-entry-id" id="index-_002dF"></a> -<a class="index-entry-id" id="index-_002d_002dfixed_002dstrings"></a> +<dt><a class="index-entry-id" id="index-_002d_002dfixed_002dstrings"></a> <a class="index-entry-id" id="index-matching-fixed-strings"></a> -<p>Interpret patterns as fixed strings, not regular expressions. 
+<a id="index-_002dF"></a><span><samp class="option">-F</samp><a class="copiable-link" href="#index-_002dF"> ¶</a></span></dt> +<dt><samp class="option">--fixed-strings</samp></dt> +<dd><p>Interpret patterns as fixed strings, not regular expressions. (<samp class="option">-F</samp> is specified by POSIX.) </p> </dd> -<dt><samp class="option">-P</samp></dt> -<dt><samp class="option">--perl-regexp</samp></dt> -<dd><a class="index-entry-id" id="index-_002dP"></a> -<a class="index-entry-id" id="index-_002d_002dperl_002dregexp"></a> +<dt><a class="index-entry-id" id="index-_002d_002dperl_002dregexp"></a> <a class="index-entry-id" id="index-matching-Perl_002dcompatible-regular-expressions"></a> -<p>Interpret patterns as Perl-compatible regular expressions (PCREs). -PCRE support is here to stay, but consider this option experimental when -combined with the <samp class="option">-z</samp> (<samp class="option">--null-data</samp>) option, and note that -‘<samp class="samp">grep -P</samp>’ may warn of unimplemented features. -See <a class="xref" href="Other-Options.html">Other Options</a>. +<a id="index-_002dP"></a><span><samp class="option">-P</samp><a class="copiable-link" href="#index-_002dP"> ¶</a></span></dt> +<dt><samp class="option">--perl-regexp</samp></dt> +<dd><p>Interpret patterns as Perl-compatible regular expressions (PCREs). </p> <p>For documentation, refer to <a class="url" href="https://www.pcre.org/">https://www.pcre.org/</a>, with these caveats: </p><ul class="itemize mark-bullet"> -<li>‘<samp class="samp">\d</samp>’ matches only the ten ASCII digits -(and ‘<samp class="samp">\D</samp>’ matches the complement), regardless of locale. -Use ‘<samp class="samp">\p{Nd}</samp>’ to also match non-ASCII digits. -(The behavior of ‘<samp class="samp">\d</samp>’ and ‘<samp class="samp">\D</samp>’ is unspecified after -in-regexp directives like ‘<samp class="samp">(?aD)</samp>’.) 
+<li>In a UTF-8 locale, Perl treats data as UTF-8 only under certain +conditions, e.g., if <code class="command">perl</code> is invoked with the <samp class="option">-C</samp> +option or the <code class="env">PERL_UNICODE</code> environment variable set appropriately. +Similarly, <code class="command">pcre2grep</code> treats data as UTF-8 only if +invoked with <samp class="option">-u</samp> or <samp class="option">-U</samp>. +In contrast, in a UTF-8 locale <code class="command">grep</code> and <code class="command">git grep</code> +always treat data as UTF-8. + +</li><li>In Perl and <code class="command">git grep -P</code>, ‘<samp class="samp">\d</samp>’ matches all Unicode digits, +even if they are not ASCII. +For example, ‘<samp class="samp">\d</samp>’ matches +“٣” +(U+0663 ARABIC-INDIC DIGIT THREE). +In contrast, in ‘<samp class="samp">grep -P</samp>’, ‘<samp class="samp">\d</samp>’ matches only +the ten ASCII digits, regardless of locale. +In <code class="command">pcre2grep</code>, ‘<samp class="samp">\d</samp>’ ordinarily behaves like Perl and +<code class="command">git grep -P</code>, but when given the <samp class="option">--posix-digit</samp> option +it behaves like ‘<samp class="samp">grep -P</samp>’. +(On all platforms, ‘<samp class="samp">\D</samp>’ matches the complement of ‘<samp class="samp">\d</samp>’.) + +</li><li>The pattern ‘<samp class="samp">[[:digit:]]</samp>’ matches all Unicode digits +in Perl, ‘<samp class="samp">grep -P</samp>’, <code class="command">git grep -P</code>, and <code class="command">pcre2grep</code>, +so you can use it +to get the effect of Perl’s ‘<samp class="samp">\d</samp>’ on all these platforms. 
+In other words, in Perl and <code class="command">git grep -P</code>, +‘<samp class="samp">\d</samp>’ is equivalent to ‘<samp class="samp">[[:digit:]]</samp>’, +whereas in ‘<samp class="samp">grep -P</samp>’, ‘<samp class="samp">\d</samp>’ is equivalent to ‘<samp class="samp">[0-9]</samp>’, +and <code class="command">pcre2grep</code> ordinarily follows Perl but +when given <samp class="option">--posix-digit</samp> it follows ‘<samp class="samp">grep -P</samp>’. + +<p>(On all these platforms, ‘<samp class="samp">[[:digit:]]</samp>’ is equivalent to ‘<samp class="samp">\p{Nd}</samp>’ +and to ‘<samp class="samp">\p{General_Category: Decimal_Number}</samp>’.) +</p> +</li><li>If <code class="command">grep</code> is built with PCRE2 version 10.43 (2024) or later, +‘<samp class="samp">(?aD)</samp>’ causes ‘<samp class="samp">\d</samp>’ to behave like ‘<samp class="samp">[0-9]</samp>’ and +‘<samp class="samp">(?-aD)</samp>’ causes it to behave like ‘<samp class="samp">[[:digit:]]</samp>’. </li><li>Although PCRE tracks the syntax and semantics of Perl’s regular -expressions, the match is not always exact. For example, Perl +expressions, the match is not always exact. Perl evolves and a Perl installation may predate or postdate the PCRE2 installation on the same host, or their Unicode versions may differ, or Perl and PCRE2 may disagree about an obscure construct.
