On Fri, 2007-08-17 at 14:40 -0700, Daniel Lakeland wrote:
> On Fri, Aug 17, 2007 at 05:32:54PM -0400, Dale Steele wrote:
> > I'm trying to create two variables (dka and newonset) from the
> > following composite character variable diagnosis:
> >
> > diagnosis <- c("hypoglycemia","diabetes" ,"newonset&dka", "newonset",
> >                "diabetes", "dka&GI", "diabetes&GI", "newonset", "dka")
> >
> > I can extract the indices for dka and newonset using the following....
> >
> > > grep("dka", diagnosis)
> > [1] 3 6 9
> > > grep("newonset", diagnosis)
> > [1] 3 4 8
> >
> > How do I create
> >
> > dka = c(0,0,1,0,0,1,0,0,1)
> > newonset = c(0,0,1,1,0,0,0,1,0)
>
> dka <- seq(0, 0, length.out=NROW(diagnosis))
> dka[grep("dka",diagnosis)] <- 1
>
> similar for newonset

Or alternatively:

> (regexpr("dka", diagnosis) > 0) * 1
[1] 0 0 1 0 0 1 0 0 1

> (regexpr("newonset", diagnosis) > 0) * 1
[1] 0 0 1 1 0 0 0 1 0

Unlike grep(), which returns only the indices of the matching elements,
regexpr() returns a vector the same length as the target vector, with -1
for elements that do not match.

HTH,

Marc Schwartz

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
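
For anyone wanting to reuse this, the two approaches in this thread can be put side by side in one self-contained script. This is only a sketch: has_term() is a hypothetical helper name introduced here for illustration (it is not part of base R), and fixed = TRUE is an added assumption that the search terms are literal strings rather than regular expressions.

```r
# Sample data from the original question
diagnosis <- c("hypoglycemia", "diabetes", "newonset&dka", "newonset",
               "diabetes", "dka&GI", "diabetes&GI", "newonset", "dka")

# has_term() is a hypothetical helper, not a base R function.
# regexpr() returns -1 where there is no match, so "> 0" flags the matches.
# fixed = TRUE (an assumption) treats 'term' as a literal string.
has_term <- function(term, x) {
  as.integer(regexpr(term, x, fixed = TRUE) > 0)
}

dka      <- has_term("dka", diagnosis)
newonset <- has_term("newonset", diagnosis)

# Index-assignment approach, for comparison:
# start from all zeros, then set the positions reported by grep() to 1
dka2 <- rep(0L, length(diagnosis))
dka2[grep("dka", diagnosis)] <- 1L

# Check that both approaches give the same indicator vector
identical(dka, dka2)
```

The regexpr() form avoids the separate initialization step, which may be convenient when several indicator variables have to be built from the same column.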