Backslashes in regular expressions in R are maddening, but they do make sense.

R string handling interprets your replacement string "\\" as just one backslash. Your string is received by gsub as "\" - that is, just the control backslash, NOT the character backslash. gsub expects that backslash to introduce something: a backreference such as \1 or \2, or some other escaped character.
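A small sketch of that first layer, assuming base R's default regex engine (the variable names are just for illustration). It shows that the literal "\\" holds a single character, and what a backslash in the replacement normally does:

```r
# The R literal "\\" is a single character: one backslash.
one_backslash <- "\\"
n_chars <- nchar(one_backslash)

# In the replacement, a backslash followed by a digit 1-9 is a
# backreference to the corresponding parenthesized group in the pattern.
wrapped <- gsub("(a|b)", "<\\1>", "abcdef")
writeLines(wrapped)  # <a><b>cdef
```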

If you want gsub to replace with a backslash character, gsub has to receive two backslashes: \\. In order to get two backslash characters into an R string literal, you have to double them ALL: "\\\\".

The string that is output is an R string: the backslashes are escaped with a backslash, so "\\\\" really means two backslashes.
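To see the difference between the printed (escaped) form and the actual characters, something like this should help. It is only a sketch using base R; `result` is an arbitrary name:

```r
# Four backslashes in the literal -> two in the string -> gsub inserts one.
result <- gsub("a|b", "\\\\", "abcdef")

print(result)       # escaped form, as you would type it in R source
writeLines(result)  # the raw characters actually in the string
n <- nchar(result)  # counts the raw characters
```

print() shows "\\\\cdef", while writeLines() shows two backslashes followed by cdef, six characters in all.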

There are lots of special characters in the search string, but only one in the replacement string: backslash.
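Given that, a replacement can be made literal by doubling its backslashes before handing it to gsub. Here is a minimal sketch for base R's sub/gsub; the helper name escape_replacement is made up for this example, and I have not checked it against stringr:

```r
# Hypothetical helper: double every backslash so that sub()/gsub()
# treat the replacement text literally.
escape_replacement <- function(replacement) {
  # fixed = TRUE: neither the pattern nor the replacement is interpreted
  # here, so "\\" is one backslash and "\\\\" is two.
  gsub("\\", "\\\\", replacement, fixed = TRUE)
}

# A lone backslash now survives as a backslash.
out1 <- gsub("a|b", escape_replacement("\\"), "abcdef")

# Backslash-1 is inserted as text instead of acting as a backreference.
out2 <- gsub("(a)", escape_replacement("\\1"), "abc")

writeLines(c(out1, out2))
```

Using fixed = TRUE inside the helper avoids the eight-backslash version, because the escaping step itself then has no special characters to worry about.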

My favorite resource on this topic is https://www.regular-expressions.info/replacecharacters.html


On 4/11/24 10:35, Duncan Murdoch wrote:
I noticed this issue in stringr::str_replace, but it also affects sub() in base R.

If the pattern in a call to one of these needs to be a regular expression, then backslashes in the replacement text are treated specially.

For example,

  gsub("a|b", "\\", "abcdef")

gives "cdef", not "\\\\cdef" as I wanted.  To get the latter, I need to escape the replacement backslashes, e.g.

  gsub("a|b", "\\\\", "abcdef")

which gives "\\\\cdef".

I have two questions:

1.  Is there a variant on sub or str_replace which allows the pattern to be declared as a regular expression, but the replacement to be declared as fixed?

2.  To get what I want, I can double the backslashes in the replacement text.  This would do that:

   replacement <- gsub("\\\\", "\\\\\\\\", replacement)

Are there any other special characters to worry about besides backslashes?

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
