On Wed, Jul 14, 2010 at 2:21 PM, karena <[email protected]> wrote:
>
> Hi,
>
> I have a data.frame as following:
> var1 var2
> 1 ab_c_(ok)
> 2 okf789(db)_c
> 3 jojfiod(90).gt
> 4 "ij"_(78)__op
> 5 (iojfodjfo)_ab
>
> what I want is to create a new variable called "var3". the value of var3 is
> the content in the Parentheses. so var3 would be:
> var3
> ok
> db
> 90
> 78
> iojfodjfo
>
Here are several alternatives. The gsub solution matches everything
up to and including the ( as well as everything from the ) onward, and
replaces each with nothing. The strsplit solution splits each string
into fields (everything before the (, everything within the (), and
everything after the )) and then picks off the second. The strapply
solution matches everything from ( to ) and returns everything between
them. All three assume one pair of parentheses per string. The code
below works whether DF$var2 is factor or character, but if you know
it's character you can drop the as.character in #2 and #3.
# 1
gsub(".*[(]|[)].*", "", DF$var2)
# 2
sapply(strsplit(as.character(DF$var2), "[()]"), "[", 2)
# 3
library(gsubfn)
strapply(as.character(DF$var2), "[(](.*)[)]", simplify = TRUE)
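
For anyone who wants to try this directly, here is a minimal reproducible sketch. The data.frame construction is my reconstruction of the data shown in the question (it was not posted as code), and storing var2 as a factor is an assumption made only to exercise the factor case. It uses just the base R solutions #1 and #2, so gsubfn is not needed to run it.

```r
# Assumed reconstruction of the poster's data; var2 is deliberately a factor.
DF <- data.frame(
  var1 = 1:5,
  var2 = c("ab_c_(ok)", "okf789(db)_c", "jojfiod(90).gt",
           "\"ij\"_(78)__op", "(iojfodjfo)_ab"),
  stringsAsFactors = TRUE
)

# 1: delete everything up to "(" and everything from ")" onward.
DF$var3 <- gsub(".*[(]|[)].*", "", DF$var2)

# 2: split on "(" or ")" and keep the second piece of each string.
var3_split <- sapply(strsplit(as.character(DF$var2), "[()]"), "[", 2)

# Both approaches should give the same character vector.
stopifnot(identical(DF$var3, var3_split))
print(DF)
```

If a string could contain more than one pair of parentheses, the three solutions would no longer agree with each other, so it is worth checking the data for that first.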
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.