Re: [R] reg expr that retains only bracketed text from strings
Hi Nevil,
In case you are still having trouble with this, I wrote something in R
that should do what you want:

mystrings<-c("ABC","A(B)C","AB[C]","BC","{AB}C")

get_enclosed<-function(x,left=c("(","[","<","{"),right=c(")","]",">","}")) {
 # one result per input string; "" when no bracket pair is found
 newx<-rep("",length(x))
 for(li in seq_along(left)) {
  for(xi in seq_along(x)) {
   # fixed=TRUE treats the bracket as a literal character, not a regex
   lp<-regexpr(left[li],x[xi],fixed=TRUE)
   rp<-regexpr(right[li],x[xi],fixed=TRUE)
   # keep the text strictly between the two brackets
   if(lp > 0 && rp > 0) newx[xi]<-substr(x[xi],lp+1,rp-1)
  }
 }
 return(newx)
}

get_enclosed(mystrings)

Jim

On Thu, Jun 13, 2019 at 12:32 AM William Dunlap via R-help wrote:
>
> strcapture() can help here.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] reg expr that retains only bracketed text from strings
strcapture() can help here.

> mystrings<-c("ABC","A(B)C","AB(C)")
> strcapture("^[^{]*(\\([^(]*\\)).*$", mystrings, proto=data.frame(InParen=""))
  InParen
1    <NA>
2     (B)
3     (C)

Classic regular expressions don't do so well with nested parentheses.
Perhaps a perl-style RE could do that.

> strcapture("^[^{]*(\\([^(]*\\)).*$", proto=data.frame(InParen=""), x=c("()", "a(s(d)f)g"))
  InParen
1      ()
2   (d)f)

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Jun 11, 2019 at 10:46 PM nevil amos wrote:
> # my desired output:
> #[1] "" "(B)" "(C)"
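On the remark above that a perl-style RE might cope with nested parentheses: the following is only a minimal sketch, assuming R's PCRE engine (perl=TRUE) and its recursion construct (?R). The helper name balanced() is made up for illustration and is not part of this thread or of any package.

```r
# Hypothetical helper: return every balanced "( ... )" group in each string.
# (?R) re-enters the whole pattern, so inner pairs are consumed as a unit;
# (?> ... ) is an atomic group that keeps backtracking under control.
balanced <- function(x) {
  pattern <- "\\((?>[^()]|(?R))*\\)"
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

res <- balanced(c("ABC", "()", "a(s(d)f)g"))
res
# [[1]]
# character(0)
#
# [[2]]
# [1] "()"
#
# [[3]]
# [1] "(s(d)f)"
```

Unlike the POSIX pattern above, which stops at "(d)f)", this returns the outermost balanced group. Strings with unbalanced brackets may still give surprising matches, so treat it as a starting point only.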
Re: [R] reg expr that retains only bracketed text from strings
On Wed, 12 Jun 2019 15:45:04 +1000
nevil amos wrote:

> # my desired desired output:
> #[1] "" "(B)" "(C)"

(function(s) regmatches(
	s, gregexpr('\\([^)]+\\)', s)
))(c("ABC","A(B)C","AB(C)"))
# [[1]]
# character(0)
#
# [[2]]
# [1] "(B)"
#
# [[3]]
# [1] "(C)"

This matches every substring that starts with a (, continues with one
or more non-) characters, and ends with a ). If a string contains
several such substrings, all of them are returned.

-- 
Best regards,
Ivan
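The regmatches()/gregexpr() call above returns a list, whereas the original question asked for a character vector with "" for strings without brackets. One possible way to bridge the two is sketched below, under the assumption that only the first match per string is wanted. The name first_or_empty is hypothetical.

```r
mystrings <- c("ABC", "A(B)C", "AB(C)")

# Same pattern as above: "(", one or more non-")" characters, then ")".
m <- regmatches(mystrings, gregexpr("\\([^)]+\\)", mystrings))

# Collapse the list to one string per input: the first match if there is
# one, otherwise the empty string. (Assumption: later matches are dropped.)
first_or_empty <- vapply(
  m,
  function(v) if (length(v) > 0) v[1] else "",
  character(1)
)

first_or_empty
# [1] ""    "(B)" "(C)"
```

Replacing "" with NA_character_ in the else branch gives NA for non-matching strings instead, which the original question also allowed.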
Re: [R] reg expr that retains only bracketed text from strings
Hi Nevil,
Here's one way to do it. (No doubt some regular-expression gurus will have
more concise ways to get the job done.)

# drop everything before the last "("
a1 <- sub(".*\\(","\\(",mystrings)
# drop everything after the first remaining ")"
a2 <- sub("\\).*","\\)",a1)
# strings without any "(" become ""
a2[grep("\\(",a2,invert=TRUE)] <- ""
a2

HTH,
Eric

On Wed, Jun 12, 2019 at 8:46 AM nevil amos wrote:
> # my desired output:
> #[1] "" "(B)" "(C)"
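As one possible more compact variant of the steps above, the sketch below uses a single regexpr() call and fills in matches only where they occur. It assumes that a string with an opening bracket but no closing one should also yield "". The function name extract_first and its pattern argument are hypothetical.

```r
# Hypothetical helper: first "( ... )" group per string, "" when none.
# The default pattern requires both brackets and no brackets in between.
extract_first <- function(x, pattern = "\\([^()]*\\)") {
  out <- rep("", length(x))
  m <- regexpr(pattern, x)
  hit <- m > 0                     # regexpr() returns -1 where nothing matched
  out[hit] <- regmatches(x, m)     # regmatches() returns matched elements only
  out
}

extract_first(c("ABC", "A(B)C", "AB(C)", "A(BC"))
# returns c("", "(B)", "(C)", "")
```

Because the result vector is pre-filled with "", non-matching strings never need to be blanked out afterwards.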
[R] reg expr that retains only bracketed text from strings
Hi

I am trying to extract only the text contained in brackets from a vector of
strings. Not all of the strings contain closed bracketed text; those should
return an empty string or NA.

This is what I have at the moment:

mystrings<-c("ABC","A(B)C","AB(C)")

substring(mystrings, regexpr("\\(|\\)", mystrings))

#this returns the whole string if there are no brackets.
[1] "ABC" "(B)C" "(C)"

# my desired output:
#[1] "" "(B)" "(C)"

Many thanks for any suggestions
Nevil Amos