William et al:
Thanks! I think I have a somewhat more complicated issue due to the
type of string I'm using -- the split is " | " (space pipe space) -- how
do I code that based on your sub code below? Using " | *" doesn't seem
to be working. Thanks!
--j
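A note on the pattern, offered as a minimal sketch rather than a definitive answer: in a regular expression `|` means alternation, so " | *" is read as "a single space, or zero or more spaces" and not as a literal pipe. Escaping the pipe (" \\| ") in the sub calls, or passing fixed=TRUE to strsplit, should make it match literally. The vector `s` below is made-up example data, not from the thread.

```r
# Hypothetical example data: two fields separated by " | "
s <- c("alpha | beta", "gamma | delta")

# Escape the pipe so the regex engine treats it as a literal character
left  <- sub(" \\| .*", "", s)   # drop the separator and everything after it
right <- sub(".* \\| ", "", s)   # drop everything up to and including the separator

# Alternative: strsplit with fixed = TRUE, which disables regex interpretation
parts  <- strsplit(s, " | ", fixed = TRUE)
left2  <- vapply(parts, `[`, character(1), 1)
right2 <- vapply(parts, `[`, character(1), 2)
```

Both approaches assume the separator occurs exactly once per string; with repeated separators the greedy `.*` in the second sub call keeps only the text after the last one.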
William Dunlap wrote:
-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Greenberg
Sent: Thursday, October 22, 2009 7:35 PM
To: r-help
Subject: [R] splitting a vector of strings...
Quick question -- if I have a vector of strings that I'd like to split
into two new vectors based on a substring that is inside of each string,
what is the most efficient way to do this? The substring that I want to
split on is multiple characters, if that matters, and it is contained in
every element of the character vector.
strsplit and sub can both be used for this. If you know
the string will be split into 2 parts then 2 calls to sub
with slightly different patterns will do it. strsplit requires
less fiddling with the pattern and is handier when the number
of parts is variable or large. strsplit's output often needs to
be rearranged for convenient use.
E.g., I made 100,000 strings with a 'qaz' in their middles with
x<-paste("X",sample(1e5),sep="")
y<-sub("X","Y",x)
xy<-paste(x,y,sep="qaz")
and split them by the 'qaz' in two ways:
system.time(ret1<-list(x=sub("qaz.*","",xy),y=sub(".*qaz","",xy)))
# user system elapsed
# 0.22 0.00 0.21
system.time({tmp<-strsplit(xy,"qaz");ret2<-list(x=unlist(lapply(tmp,`[`,1)),y=unlist(lapply(tmp,`[`,2)))})
# user system elapsed
# 2.42 0.00 2.20
identical(ret1,ret2)
#[1] TRUE
identical(ret1$x,x) && identical(ret1$y,y)
#[1] TRUE
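As a small sketch of the rearranging step mentioned above (separate from the timing comparison): when every string splits into the same number of parts, the list returned by strsplit can be bound row-wise into a character matrix with one column per part. The vector `z` is made-up example data, and the equal-parts condition is an assumption; rbind would recycle values if the pieces had different lengths.

```r
# Hypothetical example data: every element contains the separator "qaz" exactly once
z <- c("X1qazY1", "X2qazY2", "X3qazY3")

# strsplit returns a list with one character vector per input string
pieces <- strsplit(z, "qaz", fixed = TRUE)

# Bind the pieces row-wise into a matrix, then pull out the columns
m <- do.call(rbind, pieces)
ret <- list(x = m[, 1], y = m[, 2])
```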
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
--j
--
Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.