On 11/04/2024 12:57 p.m., Dave Dixon wrote:
Backslashes in regular expressions in R are maddening, but they make sense.

R string handling interprets your replacement string "\\" as just one
backslash. Your string is received by gsub as "\" - that is, just the
control backslash, NOT the character backslash. gsub is expecting to see
\0, \1, \2, or some other control starting with backslash.

If you want gsub to replace with a backslash character, you have to send
it as "\\". In order to get two backslash characters in an R string, you
have to double them ALL: "\\\\".
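
One way to see the two layers is to count characters at each stage. This is only a minimal sketch in base R; the variable names (one, two, res) are made up for illustration.

```r
# The R literal "\\" is a single backslash character.
one <- "\\"
nchar(one)        # 1

# The R literal "\\\\" is two backslash characters. As a replacement,
# gsub reads that pair as one escaped (literal) backslash.
two <- "\\\\"
nchar(two)        # 2

res <- gsub("a", two, "abc")
nchar(res)        # 3: one backslash followed by "bc"
writeLines(res)   # shows the actual characters, without R's escaping
```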

You can use "\\" if the pattern is declared as "fixed", via

  sub("a", "\\", "abcdef", fixed = TRUE)

or

  stringr::str_replace("abcdef", stringr::fixed("a"), "\\")

My first question was whether there is a sub-like function with a way to
declare the pattern as a regexp, but the replacement as fixed. Thanks for
your answer to my second question.

Duncan Murdoch


The string that is output is an R string: the backslashes are escaped
with a backslash, so "\\\\" really means two backslashes.
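
The difference between the printed form and the actual characters can be checked directly. A small sketch, assuming the same call as in the original question:

```r
x <- gsub("a|b", "\\\\", "abcdef")

print(x)        # escaped display: each backslash is shown doubled
writeLines(x)   # raw characters: two backslashes, then cdef
nchar(x)        # 6
```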

There are lots of special characters in the search string, but only one
in the replacement string: backslash.

My favorite resource on this topic is
https://www.regular-expressions.info/replacecharacters.html


On 4/11/24 10:35, Duncan Murdoch wrote:
I noticed this issue in stringr::str_replace, but it also affects
sub() in base R.

If the pattern in a call to one of these needs to be a regular
expression, then backslashes in the replacement text are treated
specially.

For example,

   gsub("a|b", "\\", "abcdef")

gives "cdef", not "\\\\cdef" as I wanted.  To get the latter, I need to
escape the replacement backslashes, e.g.

   gsub("a|b", "\\\\", "abcdef")

which gives "\\\\cdef".

I have two questions:

1.  Is there a variant on sub or str_replace which allows the pattern
to be declared as a regular expression, but the replacement to be
declared as fixed?

2.  To get what I want, I can double the backslashes in the
replacement text.  This would do that:

    replacement <- gsub("\\\\", "\\\\\\\\", replacement)

Are there any other special characters to worry about besides
backslashes?
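
For illustration, the workaround in question 2 could be wrapped up roughly like this. It is only a sketch: escape_replacement() and gsub_literal() are hypothetical names, not functions in base R or stringr, and it assumes that backslash is the only special character in the replacement.

```r
# Hypothetical helper: double every backslash so that the regex engine
# reads each one as a literal backslash in the replacement.
escape_replacement <- function(replacement) {
  gsub("\\\\", "\\\\\\\\", replacement)
}

# Hypothetical wrapper: regular-expression pattern, literal replacement.
gsub_literal <- function(pattern, replacement, x, ...) {
  gsub(pattern, escape_replacement(replacement), x, ...)
}

out <- gsub_literal("a|b", "\\", "abcdef")
writeLines(out)    # two backslashes, then cdef

# A replacement that looks like a backreference is kept as-is.
out2 <- gsub_literal("(a)", "\\1", "abc")
writeLines(out2)   # backslash, 1, then bc
```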

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
