[Rd] should sub(perl=TRUE) also handle \E in replacement, to complement \U and \L?

William Dunlap Mon, 13 Apr 2009 11:58:38 -0700

Currently sub(perl=TRUE) lets you put \U and \L in the
replacement argument so that the subpatterns that follow
(the \\<digit> backreferences) are converted to upper
or lower case, respectively.  Perl also has a \E operator
that ends these case conversions, so that later subpatterns
retain whatever case they had in the original text.
For symmetry's sake I think it would be nice if R supported that
as well.  E.g., capitalizing the first and last letters of every
word, while leaving the case of the interior letters alone, could be
done with:


> gsub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", "useRs may fly into JFK or laGuardia", perl=TRUE)
[1] "UseRS MaY FlY IntO JFK OR LaGuardiA"
> sub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", "useRs may fly into JFK or laGuardia", perl=TRUE)
[1] "UseRS may fly into JFK or laGuardia"

A question regarding this came up in r-help today.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

Index: src/library/base/man/grep.Rd
===================================================================
--- src/library/base/man/grep.Rd        (revision 48319)
+++ src/library/base/man/grep.Rd        (working copy)
@@ -73,7 +73,7 @@
     \code{"\\9"} to parenthesized subexpressions of \code{pattern}.  For
     \code{perl = TRUE} only, it can also contain \code{"\\U"} or
     \code{"\\L"} to convert the rest of the replacement to upper or
-    lower case.
+    lower case, or \code{"\\E"} to end such case conversion.
   }
 }
 \details{


Index: src/main/pcre.c
===================================================================
--- src/main/pcre.c     (revision 48319)
+++ src/main/pcre.c     (working copy)
@@ -90,6 +90,9 @@
            } else if (p[1] == 'L') {
                p++; n -= 2;
                upper = FALSE; lower = TRUE;
+           } else if (p[1] == 'E') { /* end case modification */
+               p++; n -= 2;
+               upper = FALSE; lower = FALSE;
            } else if (p[1] == 0) {
                /* can't escape the final '\0' */
                n--;
@@ -168,6 +171,9 @@
            } else if (p[1] == 'L') {
                p += 2;
                upper = FALSE; lower = TRUE;
+           } else if (p[1] == 'E') { /* end case modification */
+               p += 2;
+               upper = FALSE; lower = FALSE;
            } else if (p[1] == 0) {
                p += 1;
            } else {
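
The case-state handling that the pcre.c hunks above extend can be sketched as a small standalone C function. This is only an illustration, not R's actual code: the name expand_replacement, its signature, and the fixed groups[] array are hypothetical, and it assumes (following the description above) that case conversion applies only to back-referenced text, not to literal characters in the replacement.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of R): expand the template `repl` into
   `out`, substituting groups[1..9] for \1..\9 and honouring \U, \L, \E.
   Output is truncated to fit `outsize` bytes, including the final NUL. */
static void expand_replacement(const char *repl, const char *groups[10],
                               char *out, size_t outsize)
{
    int upper = 0, lower = 0;   /* current case-conversion state */
    size_t k = 0;
    const char *p = repl;

    if (outsize == 0) return;
    while (*p && k + 1 < outsize) {
        if (*p != '\\') {
            out[k++] = *p++;                /* literal character */
        } else if (p[1] >= '1' && p[1] <= '9') {
            const char *g = groups[p[1] - '0'];
            for (; g && *g && k + 1 < outsize; g++) {
                unsigned char c = (unsigned char) *g;
                out[k++] = upper ? (char) toupper(c)
                         : lower ? (char) tolower(c) : (char) c;
            }
            p += 2;
        } else if (p[1] == 'U') {           /* start upper-casing */
            p += 2; upper = 1; lower = 0;
        } else if (p[1] == 'L') {           /* start lower-casing */
            p += 2; upper = 0; lower = 1;
        } else if (p[1] == 'E') {           /* end case modification */
            p += 2; upper = 0; lower = 0;
        } else if (p[1] == '\0') {          /* trailing backslash: drop it */
            p += 1;
        } else {                            /* other escape: copy the char */
            p += 1; out[k++] = *p++;
        }
    }
    out[k] = '\0';
}
```

Called with the template "\\U\\1\\E\\2\\U\\3" (as a C string literal) and the groups "u", "seR" and "s", this sketch produces "UseRS", matching the first word of the sub() example above.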

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
