Hi Michael,

Your strings were long, so I made a smaller example. Sarah made a good point: you want to be using gsub(), not sub(). But when I run your code, I do not think it works correctly even for a single instance. Try this on for size; you were 99% there:
## simplified cases
form1 <- c('product + action * mean + CTA + help + mean * product')
form2 <- c('product+action*mean+CTA+help+mean*product')

## what I believe your desired output is
'product + CTA + help'
'product+CTA+help'

gsub("\\s\\+\\s[[:alnum:]]*\\s\\*\\s[[:alnum:]]*", "", form1)
gsub("\\+[[:alnum:]]*\\*[[:alnum:]]*", "", form2)

## your code (using gsub() instead of sub())
gsub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form1)

######## Running on r57586 Windows x64 ########
> gsub("\\s\\+\\s[[:alnum:]]*\\s\\*\\s[[:alnum:]]*", "", form1)
[1] "product + CTA + help"
> gsub("\\+[[:alnum:]]*\\*[[:alnum:]]*", "", form2)
[1] "product+CTA+help"
>
> ## your code (using gsub() instead of sub())
> gsub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form1)
[1] "product ean + CTA + help roduct"

Hope this helps,

Josh

On Tue, Nov 15, 2011 at 9:18 AM, Michael Griffiths <griffi...@upstreamsystems.com> wrote:
> Good afternoon list,
>
> I have the following character strings; one with spaces between the maths
> operators and variable names, and one without said spaces.
>
> form<-c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL + benefit + benefit / benefit1 + product + action * mean + CTA + help + mean * product')
> form<-c('~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action*mean+CTA+help+mean*product')
>
> I would like to remove the following target strings, either:
>
> 1. '+ Intro * LEGAL' which is '+ space name space * space name'
> 2. '+Intro*LEGAL' which is '+ nospace name nospace * nospace name'
>
> Having delved into a variety of sites (e.g.
> http://www.zytrax.com/tech/web/regex.htm#search) investigating regular
> expressions I now have a basic grasp, but I am having difficulties removing
> ALL of the instances of 1. or 2.
>
> The code below removes just a SINGLE instance of the target string, but I
> was expecting it to remove all instances as I have \\*.[[:alnum:]].
> I did try \\*.[[:alnum:]]*, but this did not work.
>
> form<-sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form)
>
> I am obviously still not understanding something. If the list could offer
> some guidance I would be most grateful.
>
> Regards
>
> Mike Griffiths
>
> --
>
> Michael Griffiths, Ph.D
> Statistician
>
> Upstream Systems
>
> 8th Floor
> Portland House
> Bressenden Place
> SW1E 5BH
>
> Tel +44 (0) 20 7869 5147
> Fax +44 207 290 1321
> Mob +44 789 4944 145
>
> www.upstreamsystems.com
>
> griffi...@upstreamsystems.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
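
P.S. A note on why the original pattern leaves "ean" and "roduct" behind, offered as a sketch rather than a definitive fix. The trailing .[[:alnum:]] consumes one arbitrary character (here the space after the *) and then exactly one alphanumeric, so only the first letter of the second name is removed. The example below assumes the goal is to drop every '+ name * name' term whether or not it is spaced. The helper name drop_interactions and the choice of \\s* with [[:alnum:]]+ are my own assumptions, not something from the thread.

```r
## simplified cases from the thread
form1 <- 'product + action * mean + CTA + help + mean * product'
form2 <- 'product+action*mean+CTA+help+mean*product'

## hypothetical helper (not from the thread): one pattern for both layouts.
## \\s* allows optional whitespace around the operators;
## [[:alnum:]]+ requires at least one character and consumes the whole name.
drop_interactions <- function(x) {
  gsub("\\s*\\+\\s*[[:alnum:]]+\\s*\\*\\s*[[:alnum:]]+", "", x)
}

res1 <- drop_interactions(form1)
res2 <- drop_interactions(form2)
print(res1)
print(res2)
```

Two caveats: [[:alnum:]] does not match '.' or '_', so variable names containing those characters would need a wider character class, and a 'name * name' term with no leading '+' (for example at the very start of the formula) is left untouched.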