> On Oct 23, 2015, at 2:17 PM, Jun Shen <[email protected]> wrote:
> 
> Dear list,
> 
> Say I have a vector that has two different types of string
> 
> test <- c('aaa.bb.cc','aaa.dd')
> 
> I want to extract the first part of the string (aaa) as a name and save the
> rest of the string as another name.
> 
> I was thinking something like
> 
> sub('(.*)\\.(.*)','\\1',test) but doesn't give me what I want.
> 
> 
> Appreciate any comments. Thanks.
> 
> Jun


How about something like this, which presumes that the first component (the 
part before the first period) contains only letters:

> gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test) 
[1] "aaa|bb.cc" "aaa|dd"   

or, equivalently here, since the anchored pattern can match at most once per 
string:

> sub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test) 
[1] "aaa|bb.cc" "aaa|dd"   


The above captures the two components, before and after the first '.', and 
puts a "|" between them as a delimiter, which strsplit() can then split on 
(this assumes that "|" does not otherwise occur in the strings):


> strsplit(gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test), split = "\\|") 
[[1]]
[1] "aaa"   "bb.cc"

[[2]]
[1] "aaa" "dd" 
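
If the goal is two separate vectors rather than a list, one possible sketch 
along the same lines is two sub() calls. The names 'first' and 'rest' are 
only illustrative, and this assumes that every string contains at least one 
period:

```r
test <- c('aaa.bb.cc', 'aaa.dd')

# Drop everything from the first period onward
first <- sub("\\..*$", "", test)

# Drop everything up to and including the first period
rest <- sub("^[^.]*\\.", "", test)

first  # "aaa" "aaa"
rest   # "bb.cc" "dd"
```

A string without any period would be returned unchanged by both calls, so 
that case would need separate handling.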


See ?regex
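
As to why the original attempt falls short: the first (.*) is greedy, so it 
runs up to the last period rather than the first. A small sketch of the 
difference follows, with a lazy quantifier as one possible alternative. The 
names 'greedy_first' and 'lazy_first' are only illustrative, and perl = TRUE 
is my choice here, not something the original post used:

```r
test <- c('aaa.bb.cc', 'aaa.dd')

# Greedy: the first group takes as much as it can, up to the last period
greedy_first <- sub('(.*)\\.(.*)', '\\1', test)

# Lazy: '.*?' stops at the first period instead
lazy_first <- sub('^(.*?)\\.(.*)$', '\\1', test, perl = TRUE)

greedy_first  # "aaa.bb" "aaa"
lazy_first    # "aaa" "aaa"
```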

Regards,

Marc Schwartz

______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
