> On Oct 23, 2015, at 2:17 PM, Jun Shen <[email protected]> wrote:
>
> Dear list,
>
> Say I have a vector that has two different types of string
>
> test <- c('aaa.bb.cc','aaa.dd')
>
> I want to extract the first part of the string (aaa) as a name and save the
> rest of the string as another name.
>
> I was thinking something like
>
> sub('(.*)\\.(.*)','\\1',test) but doesn't give me what I want.
>
>
> Appreciate any comments. Thanks.
>
> Jun
How about something like this, which presumes that the part before the first
period consists only of letters:
> gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
[1] "aaa|bb.cc" "aaa|dd"
or, equivalently here, since the anchored pattern can match only once per string:
> sub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test)
[1] "aaa|bb.cc" "aaa|dd"
Both calls capture the two components, before and after the first '.', and join
them with a "|" separator (assuming "|" does not otherwise occur in the
strings), which strsplit() can then split on:
> strsplit(gsub("^([[:alpha:]]+)\\.(.*)$", "\\1|\\2", test), split = "\\|")
[[1]]
[1] "aaa" "bb.cc"
[[2]]
[1] "aaa" "dd"
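
If the intermediate "|" separator is not needed, the two pieces can also be
pulled out directly with two sub() calls. This is only a sketch, assuming the
first part never contains a period; the names first and rest are made up for
illustration:

```r
test <- c("aaa.bb.cc", "aaa.dd")

# Everything before the first period: [^.]* cannot cross a period
first <- sub("^([^.]*)\\..*$", "\\1", test)

# Everything after the first period: drop the leading part and its period
rest <- sub("^[^.]*\\.", "", test)

print(first)  # "aaa" "aaa"
print(rest)   # "bb.cc" "dd"
```

Unlike the [[:alpha:]]+ version, this does not require the first part to be
letters only, which may or may not be what you want.
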
See ?regex
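
For what it's worth, the original attempt fails because (.*) is greedy: it
grabs as much as it can, so the \\. ends up matching the last period rather
than the first. A small sketch of the difference, using a lazy quantifier with
perl = TRUE (my choice for illustration, not something from the original post):

```r
test <- c("aaa.bb.cc", "aaa.dd")

# Greedy: the first group runs up to the LAST period
greedy <- sub("(.*)\\.(.*)", "\\1", test)

# Lazy (.*?): the first group stops at the FIRST period
lazy_first <- sub("^(.*?)\\.(.*)$", "\\1", test, perl = TRUE)
lazy_rest  <- sub("^(.*?)\\.(.*)$", "\\2", test, perl = TRUE)

print(greedy)      # "aaa.bb" "aaa"
print(lazy_first)  # "aaa" "aaa"
print(lazy_rest)   # "bb.cc" "dd"
```
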
Regards,
Marc Schwartz