William et al:

Thanks! I think I have a somewhat more complicated issue due to the type of string I'm using -- the split is " | " (space pipe space) -- how do I code that based on your sub code below? Using " | *" doesn't seem to be working. Thanks!

--j

William Dunlap wrote:
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Greenberg
Sent: Thursday, October 22, 2009 7:35 PM
To: r-help
Subject: [R] splitting a vector of strings...

Quick question -- if I have a vector of strings that I'd like to split into two new vectors based on a substring that is inside of each string, what is the most efficient way to do this? The substring that I want to split on is multiple characters, if that matters, and it is contained in every element of the character vector.

strsplit and sub can both be used for this.  If you know
the string will be split into 2 parts then 2 calls to sub
with slightly different patterns will do it.  strsplit requires
less fiddling with the pattern and is handier when the number
of parts is variable or large.  strsplit's output often needs to
be rearranged for convenient use.

E.g., I made 100,000 strings with a 'qaz' in their middles with
  x<-paste("X",sample(1e5),sep="")
  y<-sub("X","Y",x)
  xy<-paste(x,y,sep="qaz")
and split them by the 'qaz' in two ways:
  system.time(ret1<-list(x=sub("qaz.*","",xy),y=sub(".*qaz","",xy)))
  #   user  system elapsed
  #   0.22    0.00    0.21
  system.time({tmp<-strsplit(xy,"qaz");ret2<-list(x=unlist(lapply(tmp,`[`,1)),y=unlist(lapply(tmp,`[`,2)))})
  #   user  system elapsed
  #   2.42    0.00    2.20
  identical(ret1,ret2)
  #[1] TRUE
  identical(ret1$x,x) && identical(ret1$y,y)
  #[1] TRUE
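
On the follow-up question about a " | " (space pipe space) separator: in a regular expression an unescaped "|" means alternation, so a pattern such as " | *" is most likely being read as "a space, or zero or more spaces" rather than as a literal pipe. A minimal sketch of two ways around that, using made-up sample strings (the actual data is not shown in this thread), is to escape the pipe in the sub patterns or to pass fixed = TRUE to strsplit:

```r
# Hypothetical sample data; the real strings are not shown in the thread.
xy <- c("alpha | beta", "gamma | delta")

# Option 1: two calls to sub, with the pipe escaped so it is matched literally.
left  <- sub(" \\| .*", "", xy)   # remove the first separator and everything after it
right <- sub(".* \\| ", "", xy)   # remove everything up to and including the last separator

# Option 2: strsplit with fixed = TRUE, which treats " | " as a plain string.
parts  <- strsplit(xy, " | ", fixed = TRUE)
left2  <- vapply(parts, `[`, character(1), 1)   # first piece of each element
right2 <- vapply(parts, `[`, character(1), 2)   # second piece of each element
```

Both options assume each string contains the separator exactly once. With more than one separator per string, the two sub calls keep only the text before the first and after the last occurrence, while strsplit returns every piece.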

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
--j

--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

