CVSROOT:        /webcvs/grep
Module name:    grep
Changes by:     Jim Meyering <meyering> 25/04/11 13:06:09

Index: html_node/grep-Programs.html
===================================================================
RCS file: /webcvs/grep/grep/manual/html_node/grep-Programs.html,v
retrieving revision 1.35
retrieving revision 1.36
diff -u -b -r1.35 -r1.36
--- html_node/grep-Programs.html        13 May 2023 09:23:53 -0000      1.35
+++ html_node/grep-Programs.html        11 Apr 2025 17:06:08 -0000      1.36
@@ -1,11 +1,11 @@
 <!DOCTYPE html>
 <html>
-<!-- Created by GNU Texinfo 7.0dev, https://www.gnu.org/software/texinfo/ -->
+<!-- Created by GNU Texinfo 7.1.1, https://www.gnu.org/software/texinfo/ -->
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 <!-- This manual is for grep, a pattern matching engine.
 
-Copyright © 1999-2002, 2005, 2008-2023 Free Software Foundation,
+Copyright © 1999-2002, 2005, 2008-2025 Free Software Foundation,
 Inc.
 
 Permission is granted to copy, distribute and/or modify this document
@@ -14,10 +14,10 @@
 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
 Texts.  A copy of the license is included in the section entitled
 "GNU Free Documentation License". -->
-<title>grep Programs (GNU Grep 3.11)</title>
+<title>grep Programs (GNU Grep 3.12)</title>
 
-<meta name="description" content="grep Programs (GNU Grep 3.11)">
-<meta name="keywords" content="grep Programs (GNU Grep 3.11)">
+<meta name="description" content="grep Programs (GNU Grep 3.12)">
+<meta name="keywords" content="grep Programs (GNU Grep 3.12)">
 <meta name="resource-type" content="document">
 <meta name="distribution" content="global">
 <meta name="Generator" content="makeinfo">
@@ -63,54 +63,77 @@
 controlled by the following options.
 </p>
 <dl class="table">
-<dt><samp class="option">-G</samp></dt>
-<dt><samp class="option">--basic-regexp</samp></dt>
-<dd><a class="index-entry-id" id="index-_002dG"></a>
-<a class="index-entry-id" id="index-_002d_002dbasic_002dregexp"></a>
+<dt><a class="index-entry-id" id="index-_002d_002dbasic_002dregexp"></a>
 <a class="index-entry-id" id="index-matching-basic-regular-expressions"></a>
-<p>Interpret patterns as basic regular expressions (BREs).
+<a id="index-_002dG"></a><span><samp class="option">-G</samp><a 
class="copiable-link" href="#index-_002dG"> &para;</a></span></dt>
+<dt><samp class="option">--basic-regexp</samp></dt>
+<dd><p>Interpret patterns as basic regular expressions (BREs).
 This is the default.
 </p>
 </dd>
-<dt><samp class="option">-E</samp></dt>
-<dt><samp class="option">--extended-regexp</samp></dt>
-<dd><a class="index-entry-id" id="index-_002dE"></a>
-<a class="index-entry-id" id="index-_002d_002dextended_002dregexp"></a>
+<dt><a class="index-entry-id" id="index-_002d_002dextended_002dregexp"></a>
 <a class="index-entry-id" id="index-matching-extended-regular-expressions"></a>
-<p>Interpret patterns as extended regular expressions (EREs).
+<a id="index-_002dE"></a><span><samp class="option">-E</samp><a 
class="copiable-link" href="#index-_002dE"> &para;</a></span></dt>
+<dt><samp class="option">--extended-regexp</samp></dt>
+<dd><p>Interpret patterns as extended regular expressions (EREs).
 (<samp class="option">-E</samp> is specified by POSIX.)
 </p>
 </dd>
-<dt><samp class="option">-F</samp></dt>
-<dt><samp class="option">--fixed-strings</samp></dt>
-<dd><a class="index-entry-id" id="index-_002dF"></a>
-<a class="index-entry-id" id="index-_002d_002dfixed_002dstrings"></a>
+<dt><a class="index-entry-id" id="index-_002d_002dfixed_002dstrings"></a>
 <a class="index-entry-id" id="index-matching-fixed-strings"></a>
-<p>Interpret patterns as fixed strings, not regular expressions.
+<a id="index-_002dF"></a><span><samp class="option">-F</samp><a 
class="copiable-link" href="#index-_002dF"> &para;</a></span></dt>
+<dt><samp class="option">--fixed-strings</samp></dt>
+<dd><p>Interpret patterns as fixed strings, not regular expressions.
 (<samp class="option">-F</samp> is specified by POSIX.)
 </p>
 </dd>
-<dt><samp class="option">-P</samp></dt>
-<dt><samp class="option">--perl-regexp</samp></dt>
-<dd><a class="index-entry-id" id="index-_002dP"></a>
-<a class="index-entry-id" id="index-_002d_002dperl_002dregexp"></a>
+<dt><a class="index-entry-id" id="index-_002d_002dperl_002dregexp"></a>
 <a class="index-entry-id" 
id="index-matching-Perl_002dcompatible-regular-expressions"></a>
-<p>Interpret patterns as Perl-compatible regular expressions (PCREs).
-PCRE support is here to stay, but consider this option experimental when
-combined with the <samp class="option">-z</samp> (<samp 
class="option">--null-data</samp>) option, and note that
-&lsquo;<samp class="samp">grep&nbsp;-P</samp>&rsquo; may warn of unimplemented 
features.
-See <a class="xref" href="Other-Options.html">Other Options</a>.
+<a id="index-_002dP"></a><span><samp class="option">-P</samp><a 
class="copiable-link" href="#index-_002dP"> &para;</a></span></dt>
+<dt><samp class="option">--perl-regexp</samp></dt>
+<dd><p>Interpret patterns as Perl-compatible regular expressions (PCREs).
 </p>
 <p>For documentation, refer to <a class="url" 
href="https://www.pcre.org/">https://www.pcre.org/</a>, with these caveats:
 </p><ul class="itemize mark-bullet">
-<li>&lsquo;<samp class="samp">\d</samp>&rsquo; matches only the ten ASCII 
digits
-(and &lsquo;<samp class="samp">\D</samp>&rsquo; matches the complement), 
regardless of locale.
-Use &lsquo;<samp class="samp">\p{Nd}</samp>&rsquo; to also match non-ASCII 
digits.
-(The behavior of &lsquo;<samp class="samp">\d</samp>&rsquo; and &lsquo;<samp 
class="samp">\D</samp>&rsquo; is unspecified after
-in-regexp directives like &lsquo;<samp class="samp">(?aD)</samp>&rsquo;.)
+<li>In a UTF-8 locale, Perl treats data as UTF-8 only under certain
+conditions, e.g., if <code class="command">perl</code> is invoked with the 
<samp class="option">-C</samp>
+option or the <code class="env">PERL_UNICODE</code> environment variable set 
appropriately.
+Similarly, <code class="command">pcre2grep</code> treats data as UTF-8 only if
+invoked with <samp class="option">-u</samp> or <samp class="option">-U</samp>.
+In contrast, in a UTF-8 locale <code class="command">grep</code> and <code 
class="command">git grep</code>
+always treat data as UTF-8.
+
+</li><li>In Perl and <code class="command">git grep -P</code>, &lsquo;<samp 
class="samp">\d</samp>&rsquo; matches all Unicode digits,
+even if they are not ASCII.
+For example, &lsquo;<samp class="samp">\d</samp>&rsquo; matches
+&ldquo;٣&rdquo;
+(U+0663 ARABIC-INDIC DIGIT THREE).
+In contrast, in &lsquo;<samp class="samp">grep -P</samp>&rsquo;, &lsquo;<samp 
class="samp">\d</samp>&rsquo; matches only
+the ten ASCII digits, regardless of locale.
+In <code class="command">pcre2grep</code>, &lsquo;<samp 
class="samp">\d</samp>&rsquo; ordinarily behaves like Perl and
+<code class="command">git grep -P</code>, but when given the <samp 
class="option">--posix-digit</samp> option
+it behaves like &lsquo;<samp class="samp">grep -P</samp>&rsquo;.
+(On all platforms, &lsquo;<samp class="samp">\D</samp>&rsquo; matches the 
complement of &lsquo;<samp class="samp">\d</samp>&rsquo;.)
+
+</li><li>The pattern &lsquo;<samp class="samp">[[:digit:]]</samp>&rsquo; 
matches all Unicode digits
+in Perl, &lsquo;<samp class="samp">grep -P</samp>&rsquo;, <code 
class="command">git grep -P</code>, and <code class="command">pcre2grep</code>,
+so you can use it
+to get the effect of Perl&rsquo;s &lsquo;<samp class="samp">\d</samp>&rsquo; 
on all these platforms.
+In other words, in Perl and <code class="command">git grep -P</code>,
+&lsquo;<samp class="samp">\d</samp>&rsquo; is equivalent to &lsquo;<samp 
class="samp">[[:digit:]]</samp>&rsquo;,
+whereas in &lsquo;<samp class="samp">grep -P</samp>&rsquo;, &lsquo;<samp 
class="samp">\d</samp>&rsquo; is equivalent to &lsquo;<samp 
class="samp">[0-9]</samp>&rsquo;,
+and <code class="command">pcre2grep</code> ordinarily follows Perl but
+when given <samp class="option">--posix-digit</samp> it follows &lsquo;<samp 
class="samp">grep -P</samp>&rsquo;.
+
+<p>(On all these platforms, &lsquo;<samp 
class="samp">[[:digit:]]</samp>&rsquo; is equivalent to &lsquo;<samp 
class="samp">\p{Nd}</samp>&rsquo;
+and to &lsquo;<samp class="samp">\p{General_Category: 
Decimal_Number}</samp>&rsquo;.)
+</p>
+</li><li>If <code class="command">grep</code> is built with PCRE2 version 
10.43 (2024) or later,
+&lsquo;<samp class="samp">(?aD)</samp>&rsquo; causes &lsquo;<samp 
class="samp">\d</samp>&rsquo; to behave like &lsquo;<samp 
class="samp">[0-9]</samp>&rsquo; and
+&lsquo;<samp class="samp">(?-aD)</samp>&rsquo; causes it to behave like 
&lsquo;<samp class="samp">[[:digit:]]</samp>&rsquo;.
 
 </li><li>Although PCRE tracks the syntax and semantics of Perl&rsquo;s regular
-expressions, the match is not always exact.  For example, Perl
+expressions, the match is not always exact.  Perl
 evolves and a Perl installation may predate or postdate the PCRE2
 installation on the same host, or their Unicode versions may differ,
 or Perl and PCRE2 may disagree about an obscure construct.
