Re: [R] Extracting first number after * in a character vector
Elegant I don't know, but I think the appended does the trick.

-- Mike

> foo <- c(" 1 X[0,SMITH] * 0 0 1 ",
+ " 2 X[0,JOHNSON] * 0 0 1 ",
+ " 3 X[0,WILLIAMS] * 1 0 1 ",
+ " 4 X[0,JONES] * 0 0 1 ",
+ [TRUNCATED]
> as.numeric(gsub("^[^*]+[*][^0-9]+([01]).*$", "\\1", foo))
[1] 0 0 1 0 0 0 0 0 0
>

On Mon, Jan 23, 2017 at 1:27 PM, Jim Lemon wrote:
> Hi Abhinaba,
> I'm sure that someone will post a terrifyingly elegant regular
> expression that does this, but:
>
> ardat<-
> c([1] " 1 X[0,SMITH] * 0 0 1 ",
> ...
> numpoststar<-function(x) {
>  xsplit<-unlist(strsplit(x,""))
>  starpos<-which(xsplit=="*")
>  # watch out for a missing asterisk, they cause an infinite loop
>  if(length(starpos)) {
>   digits<-c("0","1","2","3","4","5","6","7","8","9")
>   while(!any(digits %in% xsplit[starpos])) starpos<-starpos+1
>   return(as.numeric(xsplit[starpos]))
>  }
>  return(NA)
> }
>
> for(i in 1:length(ardat)) print(numpoststar(ardat[i]))
>
> The observant will wonder why I didn't use sapply. Because for some
> reason it returned the original strings rather than the numbers. I
> dunno.
>
> Jim
>
> On Mon, Jan 23, 2017 at 11:29 PM, Abhinaba Roy wrote:
>> Hi,
>>
>> How do I extract the first number after '*' in a vector?
>>
>> The vector is given below
>>
>>> dput(out[1:10])
>> c(" 1 X[0,SMITH] * 0 0 1 ",
>> " 2 X[0,JOHNSON] * 0 0 1 ",
>> " 3 X[0,WILLIAMS]", "* 1 0 1 ",
>> " 4 X[0,JONES] * 0 0 1 ",
>> " 5 X[0,BROWN] * 0 0 1 ",
>> " 6 X[0,DAVIS] * 0 0 1 ",
>> " 7 X[0,MILLER] * 0 0 1 ",
>> " 8 X[0,WILSON] * 0 0 1 ",
>> " 9 X[0,MOORE] * 0 0 1 "
>> )
>>
>> I want a vector with the first number after the asterisk.
>>
>> So the output would give me, a vector (0,0,1,0,0,0,0,0,0,0)
>>
>> How can I do it in R?
>>
>> Best,
>> Abhinaba
>>
>> [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
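The pattern in this reply captures a single 0 or 1, which fits the posted data. If the first number after the asterisk could run to several digits, one possible generalization is sketched below. The vector foo_demo and the value 27 in it are made up for illustration and do not come from the original post.

```r
# Hypothetical sample data; the 27 is invented to show a multi-digit case.
foo_demo <- c(" 1 X[0,SMITH] * 0 0 1 ",
              " 3 X[0,WILLIAMS] * 1 0 1 ",
              " 4 X[0,JONES] * 27 0 1 ")

# Skip everything up to the first asterisk, skip any non-digits after it,
# then capture the first run of digits and drop the rest of the string.
first_num <- as.numeric(sub("^[^*]+[*][^0-9]*([0-9]+).*$", "\\1", foo_demo))
first_num
```

Like the original, this assumes every element has at least one character before the asterisk and a digit somewhere after it; elements that do not match come back unchanged from sub() and turn into NA (with a warning) under as.numeric().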
Re: [R] Extracting first number after * in a character vector
Hi Abhinaba,
I'm sure that someone will post a terrifyingly elegant regular
expression that does this, but:

ardat<-
c([1] " 1 X[0,SMITH] * 0 0 1 ",
...
numpoststar<-function(x) {
 xsplit<-unlist(strsplit(x,""))
 starpos<-which(xsplit=="*")
 # watch out for a missing asterisk, they cause an infinite loop
 if(length(starpos)) {
  digits<-c("0","1","2","3","4","5","6","7","8","9")
  while(!any(digits %in% xsplit[starpos])) starpos<-starpos+1
  return(as.numeric(xsplit[starpos]))
 }
 return(NA)
}

for(i in 1:length(ardat)) print(numpoststar(ardat[i]))

The observant will wonder why I didn't use sapply. Because for some
reason it returned the original strings rather than the numbers. I
dunno.

Jim

On Mon, Jan 23, 2017 at 11:29 PM, Abhinaba Roy wrote:
> Hi,
>
> How do I extract the first number after '*' in a vector?
>
> The vector is given below
>
>> dput(out[1:10])
> c(" 1 X[0,SMITH] * 0 0 1 ",
> " 2 X[0,JOHNSON] * 0 0 1 ",
> " 3 X[0,WILLIAMS]", "* 1 0 1 ",
> " 4 X[0,JONES] * 0 0 1 ",
> " 5 X[0,BROWN] * 0 0 1 ",
> " 6 X[0,DAVIS] * 0 0 1 ",
> " 7 X[0,MILLER] * 0 0 1 ",
> " 8 X[0,WILSON] * 0 0 1 ",
> " 9 X[0,MOORE] * 0 0 1 "
> )
>
> I want a vector with the first number after the asterisk.
>
> So the output would give me, a vector (0,0,1,0,0,0,0,0,0,0)
>
> How can I do it in R?
>
> Best,
> Abhinaba
>
> [[alternative HTML version deleted]]
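A likely explanation for the sapply puzzle (not confirmed anywhere in the thread): when its input is a character vector, sapply() uses the input strings as names for the result, so the printed output shows the original strings above the numbers. The sketch below illustrates this with a hypothetical stand-in for numpoststar(); the function first_digit_after_star and the vector ardat_demo are invented for the example.

```r
# Hypothetical stand-in for numpoststar(): first digit after the first "*",
# or NA if there is no asterisk or no digit after it.
first_digit_after_star <- function(s) {
  chars <- strsplit(s, "")[[1]]
  star <- which(chars == "*")
  if (length(star) == 0) return(NA_real_)
  after <- chars[seq_along(chars) > star[1]]
  digit <- after[after %in% as.character(0:9)]
  if (length(digit) == 0) return(NA_real_)
  as.numeric(digit[1])
}

# Hypothetical sample data in the shape shown in the post.
ardat_demo <- c(" 1 X[0,SMITH] * 0 0 1 ", " 3 X[0,WILLIAMS] * 1 0 1 ")

# Default behaviour: the numbers are there, but named by the input strings.
named <- sapply(ardat_demo, first_digit_after_star)
names(named)

# Dropping the names gives the bare numeric vector.
plain <- sapply(ardat_demo, first_digit_after_star, USE.NAMES = FALSE)
plain
```

Calling unname() on the named result would give the same bare vector.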
Re: [R] Extracting first number after * in a character vector
On 23.01.2017 13:29, Abhinaba Roy wrote:
> Hi,
>
> How do I extract the first number after '*' in a vector?
>
> The vector is given below
>
> dput(out[1:10])
> c(" 1 X[0,SMITH] * 0 0 1 ",
> " 2 X[0,JOHNSON] * 0 0 1 ",
> " 3 X[0,WILLIAMS]", "* 1 0 1 ",
> " 4 X[0,JONES] * 0 0 1 ",
> " 5 X[0,BROWN] * 0 0 1 ",
> " 6 X[0,DAVIS] * 0 0 1 ",
> " 7 X[0,MILLER] * 0 0 1 ",
> " 8 X[0,WILSON] * 0 0 1 ",
> " 9 X[0,MOORE] * 0 0 1 "
> )
>
> I want a vector with the first number after the asterisk.
>
> So the output would give me, a vector (0,0,1,0,0,0,0,0,0,0)
>
> How can I do it in R?

You know that your vector (called x below) contains an element without
an asterisk?

If that happened by accident, use

gsub(".+\\* *([[:digit:]]+).*", "\\1", x)

and if it could happen to have elements without an asterisk or without a
number that follows, you can set these results to NA in a separate step.

Best,
Uwe Ligges

> Best,
> Abhinaba
>
> [[alternative HTML version deleted]]
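The "separate step" for NA is not spelled out in the reply. One way it might look is sketched here: test which elements match the pattern first, then fill only those positions. The names x_demo, pattern, hit and first_num are assumptions for this example, and the sample data (including the 12) is invented.

```r
# Hypothetical sample data: one normal element, the two halves of the
# split WILLIAMS entry from the question, and an invented multi-digit case.
x_demo <- c(" 1 X[0,SMITH] * 0 0 1 ",
            " 3 X[0,WILLIAMS]",
            "* 1 0 1 ",
            " 4 X[0,JONES] * 12 0 1 ")

# Same pattern as in the reply above.
pattern <- ".+\\* *([[:digit:]]+).*"

# Elements that do not match would be returned unchanged by gsub(),
# so find the matching ones first and leave the rest as NA.
hit <- grepl(pattern, x_demo)
first_num <- rep(NA_real_, length(x_demo))
first_num[hit] <- as.numeric(sub(pattern, "\\1", x_demo[hit]))
first_num
```

Note that the leading ".+" requires at least one character before the asterisk, so the element "* 1 0 1 " counts as a non-match here and becomes NA as well.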