Note that my previous strategy can be expressed slightly more clearly as:

x <- c("STRING 01. Remainder of the string",
       "STR ING 01. Remainder of the string",
       "STRIN G 01. Remainder of the string",
       "STR IN G 01. Remainder of the string") ## more spaces in this last example entry
rx <- "([^[:digit:]]+)([[:digit:]]+.+)"

> gsub(" ","",gsub(rx,"\\1",x))
[1] "STRING" "STRING" "STRING" "STRING"
> gsub(rx,"\\2",x)
[1] "01. Remainder of the string" "01. Remainder of the string"
[3] "01. Remainder of the string" "01. Remainder of the string"

Bert Gunter

On Tue, Jul 28, 2020 at 2:53 PM Bert Gunter <bgunter.4...@gmail.com> wrote:

> 1. Thanks for the nice reprex.
> 2. However, I thought there was still a bit of ambiguity. I interpreted
> your specification to mean: "any number of spaces could occur in the
> beginning alphabetic part of the strings before one or more digits occur
> followed by a '.' (a period) and then more stuff after."
> 3. My strategy was simply to split the strings into the first part
> consisting of the alphabetic characters and spaces and the second part with
> the numbers and everything else. Then I just removed the spaces in the
> first part. You can then concatenate them together again (using paste())
> however you like. Thus
>
> > x
> [1] "STRING 01. Remainder of the string" "STR ING 01. Remainder of the string"
> [3] "STRIN G 01. Remainder of the string"
>
> > p1 <- gsub(" ","",gsub("([^[:digit:]]+)[[:digit:]]+\\..*$","\\1",x))
> > p2 <- gsub("[^[:digit:]]+([[:digit:]]+\\..*$)","\\1",x)
> > p1
> [1] "STRING" "STRING" "STRING"
> > p2
> [1] "01. Remainder of the string" "01. Remainder of the string"
> [3] "01. Remainder of the string"
>
> I look forward to better approaches using basic regex's (no additional
> packages), however.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, Jul 28, 2020 at 1:20 PM Dennis Fisher <fis...@plessthan.com> wrote:
>
>> R 4.0.2
>> OS X
>>
>> Colleagues
>>
>> I have strings that contain a space in an unexpected location. The
>> intended string is:
>> "STRING 01. Remainder of the string"
>> However, variants are:
>> "STR ING 01. Remainder of the string"
>> "STRIN G 01. Remainder of the string"
>>
>> I would like a general approach to deleting a space, but only if it
>> appears before the period. Any suggestions on a regular expression for
>> this?
>>
>> Dennis
>>
>> Dennis Fisher MD
>> P < (The "P Less Than" Company)
>> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
>> www.PLessThan.com <http://www.plessthan.com/>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
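
The thread mentions rejoining the two pieces with paste() but never shows that step. The following is a minimal sketch of how it might look, reusing the rx pattern and the x vector from the top of this message. The name `fixed` is hypothetical, and sub() is swapped in for gsub() on the assumption that each string matches the pattern exactly once. It also assumes every entry contains at least one digit.

```r
# Example strings from the thread, including the variant with extra spaces.
x <- c("STRING 01. Remainder of the string",
       "STR ING 01. Remainder of the string",
       "STRIN G 01. Remainder of the string",
       "STR IN G 01. Remainder of the string")

# Group 1: everything before the first digit.
# Group 2: the digits and everything after them.
rx <- "([^[:digit:]]+)([[:digit:]]+.+)"

p1 <- gsub(" ", "", sub(rx, "\\1", x))  # leading part with all spaces removed
p2 <- sub(rx, "\\2", x)                 # numeric part and remainder, unchanged

# Rejoin with a single space, which is paste()'s default separator.
fixed <- paste(p1, p2)  # 'fixed' is a hypothetical name, not from the thread
print(fixed)
```

One caveat: a string with no digits does not match rx, so sub() returns it unchanged for both p1 and p2, and the pasted result would be wrong for that entry. Filtering with grepl(rx, x) first would be one way to guard against that.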