
Re: [R] Subsetting a data frame

2011-12-05 Thread jim holtman

Does this do what you want?

> db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1,
+ 2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 = c(1.1, 28,
+ 9, 1.2)), .Names = c("ind", "test1", "test2", "test3"), class =
+ "data.frame", row.names = c(NA,
+ -4L))
>
> terms_include <- c("1","2","3")
> terms_exclude <- c("1.1","1.2","1.3")
>
> f.match <- function(obj, inc, exc){
+ pat <- paste("^(", paste(inc, collapse = "|"), ")", sep = '')
+ patex <- paste(exc, collapse = "|")
+ isMatch <- apply(obj, 1, function(x) any(grepl(pat, x)))
+ notMatch <- !apply(obj, 1, function(x) any(grepl(patex, x)))
+ obj[isMatch & notMatch,]
+ }
>
> db
   ind test1 test2 test3
1 ind1   1.0    56   1.1
2 ind2   2.0    27  28.0
3 ind3   1.3    58   9.0
4 ind4   3.0     2   1.2
> f.match(db, terms_include, terms_exclude)
   ind test1 test2 test3
2 ind2     2    27    28
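
One caveat with the regular-expression version: in patex the dot in "1.1" is a regex wildcard, so a value such as 101 would also count as an exclusion, and apply() on a mixed data frame works on strings that can be padded with spaces. A minimal sketch of a literal prefix match that sidesteps both is below. It assumes R >= 3.3.0 (for startsWith()), that the ind column should not be searched, and that exclusion terms should also match as prefixes; any.prefix and f.prefix are hypothetical names, not part of the code above.

```r
db <- data.frame(ind = c("ind1", "ind2", "ind3", "ind4"),
                 test1 = c(1, 2, 1.3, 3),
                 test2 = c(56L, 27L, 58L, 2L),
                 test3 = c(1.1, 28, 9, 1.2),
                 stringsAsFactors = FALSE)
terms_include <- c("1", "2", "3")
terms_exclude <- c("1.1", "1.2", "1.3")

# TRUE for each row in which at least one cell starts with one of 'terms'
# (hypothetical helper; terms are treated as literal text, not as regexes)
any.prefix <- function(cells, terms) {
  hit <- matrix(FALSE, nrow(cells), ncol(cells))
  for (term in terms) hit <- hit | startsWith(as.vector(cells), term)
  rowSums(hit, na.rm = TRUE) > 0
}

# Keep rows that match an include term and no exclude term
f.prefix <- function(obj, inc, exc, cols = setdiff(names(obj), "ind")) {
  # as.character() per column gives "2" and "1.3", with no padding
  cells <- do.call(cbind, lapply(obj[cols], as.character))
  obj[any.prefix(cells, inc) & !any.prefix(cells, exc), ]
}

res <- f.prefix(db, terms_include, terms_exclude)
```

For the example data this keeps only the ind2 row, the same row that f.match returns.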
>




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Subsetting a data frame

2011-12-05 Thread natalie.vanzuydam

Hi R users,

I really need help with subsetting data frames:

I have a large database of medical records and I want to be able to match
patterns from a list of search terms.

I've used this simplified data frame in a previous example:


db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1, 
2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 = c(1.1, 28, 
9, 1.2)), .Names = c("ind", "test1", "test2", "test3"), class =
"data.frame", row.names = c(NA, 
-4L)) 

terms_include <- c("1","2","3") 
terms_exclude <- c("1.1","1.2","1.3") 


So in this example I want to keep every row that contains a term from
terms_include, as long as no term from terms_exclude occurs in the same
row of the data frame.

Previously I was given this function, which works very well if you want to
match exactly:


f <- function(x)  !any(x %in% terms_exclude) && any(x %in% terms_include) 
db[apply(db[, -1], 1, f), ] 

   ind test1 test2 test3
2 ind2     2    27  28.0
4 ind4     3     2   1.2


I would like to know if there is a way to write a similar function that
looks for matches that start with the query string, as in
grepl("^pattern", x).

I started writing a loop, but I am not sure how to get it to return the
data frame or matrix:


for (i in 1:length(terms_include)){
db_new <- apply(db,2, grepl,pattern=i)
}

Running this loop gives me:

db_new <- structure(c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE), .Dim = c(4L, 
4L), .Dimnames = list(NULL, c("ind", "test1", "test2", "test3"
)))

So the above searches for the pattern anywhere in each string of the data
frame instead of just at the beginning of the string.

How would I look for the terms to include, but not return a row of the
data frame if it also contains one of the terms to exclude, while using
partial matching?
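
Two details of the loop above are worth noting: pattern=i passes the loop index (1, 2, 3) rather than the search term itself, and db_new is overwritten on every pass, so only the result for the last pattern survives. A hedged sketch of the same loop that anchors each term with "^" and accumulates the matches follows. It assumes the terms contain no regex metacharacters and that only the test columns are searched; cols and rows_with_hit are illustrative names.

```r
db <- data.frame(ind = c("ind1", "ind2", "ind3", "ind4"),
                 test1 = c(1, 2, 1.3, 3),
                 test2 = c(56L, 27L, 58L, 2L),
                 test3 = c(1.1, 28, 9, 1.2),
                 stringsAsFactors = FALSE)
terms_include <- c("1", "2", "3")

cols <- names(db)[-1]   # search the test columns only (an assumption)
db_new <- matrix(FALSE, nrow(db), length(cols), dimnames = list(NULL, cols))

for (term in terms_include) {
  pat <- paste0("^", term)   # anchor the term at the start of the string
  # sapply() returns one logical column per data frame column;
  # OR-ing keeps the hits found for earlier terms
  db_new <- db_new | sapply(db[cols], function(col) grepl(pat, as.character(col)))
}

# Rows in which at least one cell starts with one of the terms
rows_with_hit <- rowSums(db_new) > 0
```

db_new is then a logical matrix with one column per searched column, and db[rows_with_hit, ] returns the matching rows as a data frame.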

I hope that this makes sense.

Many thanks,
Natalie

-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: 
http://r.789695.n4.nabble.com/Subsetting-a-data-frame-tp4160127p4160127.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Subsetting a data frame with multiple values and exclusions.

2011-10-06 Thread natalie.vanzuydam

Thanks.  Such a short and sweet answer that does what it should.

-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: 
http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3877472.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Subsetting a data frame with multiple values and exclusions.

2011-10-05 Thread Dennis Murphy

Hi:

Is this what you're after?

f <- function(x)  !any(x %in% terms_exclude) && any(x %in% terms_include)
db[apply(db[, -1], 1, f), ]

   ind test1 test2 test3
2 ind2     2    27  28.0
4 ind4     3     2   1.2

HTH,
Dennis


[R] Subsetting a data frame with multiple values and exclusions.

2011-10-05 Thread natalie.vanzuydam

Hi all,

I realise that the convention is to provide a working example of my problem
but the data are of a sensitive nature so I'm not able to do that in this
case.

I need to query a database for multiple search terms:

db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1, 
2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 = c(1.1, 28, 
9, 1.2)), .Names = c("ind", "test1", "test2", "test3"), class =
"data.frame", row.names = c(NA, 
-4L))

terms_include <- c("1","2","3")
terms_exclude <- c("1.1","1.2","1.3")

So I need to write a loop in which each value in terms_include is searched
for over the entire data frame.  I thought of using apply with grepl and
subset?  At the same time, if a value from terms_include occurs in the same
row as a value from terms_exclude, then that row must be excluded from the
output data frame.

I'm not sure where to even begin.  I've only worked very basically with
subset.  The final database is much larger and the number of search terms is
many more than are presented here so I would really need to be able to loop
over the data frame successively to return a final df with my searched
values in at least one of the columns.

Your help and assistance is much appreciated,
Natalie



-
Natalie Van Zuydam

PhD Student
University of Dundee
nvanzuy...@dundee.ac.uk
--
View this message in context: 
http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3874967.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Subsetting a data frame by dropping correlated variables

2011-04-27 Thread Juliet Hannah

The 'findCorrelation' function in the caret package may be helpful.
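
For reference, caret's findCorrelation() takes a correlation matrix and a cutoff and returns the columns it suggests removing. For readers without caret, a rough base-R sketch of the same idea is below. It is not caret's algorithm: it simply drops the later column of every pair whose absolute pairwise correlation exceeds the cutoff, which can drop more columns than strictly necessary. drop_correlated and the toy data are hypothetical.

```r
# Hypothetical helper: drop the later column of each highly correlated pair
drop_correlated <- function(df, cutoff = 0.8) {
  num <- df[vapply(df, is.numeric, logical(1))]   # correlations need numeric columns
  cm <- abs(cor(num, use = "pairwise.complete.obs", method = "pearson"))
  cm[upper.tri(cm, diag = TRUE)] <- 0             # look at each pair once
  # a column is dropped if it is correlated with any earlier column
  drop <- colnames(cm)[apply(cm, 1, function(r) any(r > cutoff, na.rm = TRUE))]
  df[setdiff(names(df), drop)]
}

# Toy data: b is a linear function of a; c is only weakly related and has an NA
dfQ <- data.frame(id = letters[1:10],
                  a = 1:10,
                  b = 2 * (1:10) + 1,
                  c = c(3, 1, 4, 1, 5, NA, 2, 6, 5, 3))
dfQuc <- drop_correlated(dfQ)
```

Non-numeric columns such as id are left in place, and the function can be applied to a list of data frames with lapply().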



[R] Subsetting a data frame by dropping correlated variables

2011-04-19 Thread Rita Carreira


Hello R Users!
I have a data frame that has many variables, some with missing observations, 
and some that are correlated with each other. I would like to subset the data 
by dropping one of the variables that is correlated with another variable that
I will keep in the data frame. Alternatively, I could also drop both the
variables that are correlated with each other. Worry not! I am not deleting 
data, I am just finding a subset of the data that I can use to impute some 
missing observations. 
I have tried the following statement 
dfQuc <- dfQ[ , sapply(dfQ, function(x) cor(dfQ, use = "pairwise.complete.obs", 
method ="pearson")<0.8)]
but it gives me the following error:
Error in `[.data.frame`(dfQ, , sapply(dfQ, function(x) cor(dfQ, use = 
"pairwise.complete.obs",  : 
  undefined columns selected
Since I have several dozen data frames, it is impractical for me to manually 
inspect the correlation matrices and select which variables to drop, so I am 
trying to have R make the selection for me. Does anyone have any idea on how
to accomplish this?
Thank you very much!
Rita

"If you think education is expensive, try ignorance." --Derek Bok


  

Re: [R] subsetting a data frame

2008-09-06 Thread Jorge Ivan Velez

Hi Joseph,
Try this:

# Data set
DF=read.table(textConnection("V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
b e 2:2:6
f c 5:5:0"),header=TRUE)
closeAllConnections()

target=10
 DF[sapply(strsplit(as.character(DF$V3), ":"), function(x)
sum(as.numeric(x))== target), ]
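
An alternative sketch splits V3 once into a numeric matrix, so that different conditions (sum equal to 10, all values <= 10) can be tested without repeating the strsplit() step. It assumes every V3 entry holds exactly three colon-separated numbers and uses the text argument of read.table() from current versions of R; parts, sum_is_10 and all_le_10 are illustrative names.

```r
DF <- read.table(text = "V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
b e 2:2:6
f c 5:5:0", header = TRUE, stringsAsFactors = FALSE)

# One row per record, one numeric column each for x, y and z
parts <- do.call(rbind, lapply(strsplit(DF$V3, ":", fixed = TRUE), as.numeric))

sum_is_10 <- DF[rowSums(parts) == 10, ]                # x + y + z == 10
all_le_10 <- DF[rowSums(parts <= 10) == ncol(parts), ] # x, y and z all <= 10
```

Note that 1:0:9 also sums to 10, so with this data sum_is_10 contains three rows.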


HTH,

Jorge




Re: [R] subsetting a data frame

2008-09-06 Thread stephen sefick

Sorry I didn't read the problem carefully.




-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis


Re: [R] subsetting a data frame

2008-09-06 Thread stephen sefick

#something like this?
V1 <- c(1:10)
V2 <- c(0,5,7,8,1,6,5,13,7,0)
V3 <- c(9,5,6,8,1,7,5,33,88,0)
z <- cbind(V1,V2,V3)
row.sums <- rowSums(z)
d <- cbind(z, row.sums)
subset(d, row.sums==10)




-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis


Re: [R] subsetting a data frame

2008-09-06 Thread joseph

Hi Jorge
I got the rows where V3 looks like 10:10:10; the sum there is 30, not 10.
I want the rows where the sum is 10, for example 5:5:0 and 2:2:6.

thanks 
Joseph

----- Original Message -----
From: Jorge Ivan Velez <[EMAIL PROTECTED]>
To: joseph <[EMAIL PROTECTED]>
Sent: Saturday, September 6, 2008 10:43:09 AM
Subject: Re: [R] subsetting a data frame

Dear Joseph,

Try

 DF[sapply(strsplit(as.character(DF$V3), ":"),
   function(i) all(as.numeric(i) == 10)), ]

HTH,

Jorge


Re: [R] subsetting a data frame

2008-09-06 Thread joseph

Hello
How can I change the function to get the rows with the sum (x+y+z) = 10?
Thank you very much
Joseph


Re: [R] subsetting a data frame

2008-09-03 Thread Marc Schwartz

on 09/03/2008 05:06 PM joseph wrote:
> I have a data frame that looks like this:
> V1 V2 V3
> a b 0:1:12
> d f 1:2:1
> c d 1:0:9
> where V3 is in the form x:y:z
> Can someone show me how to subset the rows where the values of x, y and z <=
> 10:
> V1 V2 V3
> d f 1:2:1
> c d 1:0:9
> Thanks
> Joseph

How about this:

> DF[sapply(strsplit(as.character(DF$V3), ":"),
function(i) all(as.numeric(i) <= 10)), ]
  V1 V2    V3
2  d  f 1:2:1
3  c  d 1:0:9

Basically, use strsplit() to break apart 'V3':

> strsplit(as.character(DF$V3), ":")
[[1]]
[1] "0"  "1"  "12"

[[2]]
[1] "1" "2" "1"

[[3]]
[1] "1" "0" "9"

Then use sapply() to crawl the list, converting the elements to numerics
and doing the value comparison:

> sapply(strsplit(as.character(DF$V3), ":"),
 function(i) all(as.numeric(i) <= 10))
[1] FALSE  TRUE  TRUE

The above then returns the logical vector to subset the rows of 'DF'.
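
As a side note, the return type of sapply() depends on its input (an empty input yields an empty list rather than a logical vector). A small sketch of the same steps with vapply(), which always returns a logical vector, is below. The data frame is rebuilt from the three rows quoted above; keep and res are illustrative names.

```r
DF <- data.frame(V1 = c("a", "d", "c"),
                 V2 = c("b", "f", "d"),
                 V3 = c("0:1:12", "1:2:1", "1:0:9"),
                 stringsAsFactors = FALSE)

# Split V3 on the literal ":" and test each triple; logical(1) declares
# that every call must return exactly one TRUE/FALSE
keep <- vapply(strsplit(as.character(DF$V3), ":", fixed = TRUE),
               function(i) all(as.numeric(i) <= 10),
               logical(1))

res <- DF[keep, ]   # rows where x, y and z are all <= 10
```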

HTH,

Marc Schwartz


[R] subsetting a data frame

2008-09-03 Thread joseph

I have a data frame that looks like this:
V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
where V3 is in the form x:y:z
Can someone show me how to subset the rows where the values of x, y and z
are all <= 10:
V1 V2 V3
d f 1:2:1
c d 1:0:9
Thanks
Joseph



  

Re: [R] subsetting a data frame using string matching

2008-01-21 Thread Richard . Cotton

> a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
> b = c(1:6)
> example = data.frame("Title" = a, "Vals" = b)
> 
> 
> > example
>   Title Vals
> 1 Alpha    1
> 2  Beta    2
> 3 Gamma    3
> 4 Beeta    4
> 5 Alpha    5
> 6  beta    6
> > 
> 
> I would like to be able to get a new data frame from this data frame
> containing only rows that match a certain string. In this case it
> could for instance be the string "eta". I have tried various ways of
> using agrep and grep, but so far I have not found anything that
> worked.

Sounds like you were nearly there.

rows.to.keep <- grep("eta", example$Title)
subdata <- example[rows.to.keep,]
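
A closely related sketch uses grepl(), which returns a logical vector instead of row indices. That makes it easy to negate the match or to combine it with other conditions, whereas example[-grep(pattern, ...), ] returns zero rows when nothing matches. The fixed and ignore.case arguments shown here are standard grepl() options; has.eta, others and starts.b are illustrative names.

```r
example <- data.frame(Title = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta"),
                      Vals = 1:6,
                      stringsAsFactors = FALSE)

# fixed = TRUE treats "eta" as literal text rather than a regular expression
has.eta <- grepl("eta", example$Title, fixed = TRUE)

subdata <- example[has.eta, ]    # rows whose Title contains "eta"
others  <- example[!has.eta, ]   # the complement, safe even with no matches

# Case-insensitive match anchored at the start of the string
starts.b <- example[grepl("^b", example$Title, ignore.case = TRUE), ]
```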

Regards,
Richie.

Mathematical Sciences Unit
HSL

"Statistics are like a lamp-post to a drunken man - more for leaning on 
than illumination."
David Brent, The Office.




Re: [R] subsetting a data frame using string matching

2008-01-21 Thread Chuck Cleland

On 1/21/2008 5:18 AM, Karin Lagesen wrote:
> Example data frame: 
> 
> 
> a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
> b = c(1:6)
> example = data.frame("Title" = a, "Vals" = b)
> 
> 
>> example
>   Title Vals
> 1 Alpha    1
> 2  Beta    2
> 3 Gamma    3
> 4 Beeta    4
> 5 Alpha    5
> 6  beta    6
> 
> I would like to be able to get a new data frame from this data frame
> containing only rows that match a certain string. In this case it
> could for instance be the string "eta". I have tried various ways of
> using agrep and grep, but so far I have not found anything that
> worked.
> 
> Thank you in advance for your help!

a <- c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
b <- c(1:6)
df <- data.frame(Title = a, Vals = b)
df[grep("eta", df$Title),]
  Title Vals
2  Beta    2
4 Beeta    4
6  beta    6

> Karin

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


[R] subsetting a data frame using string matching

2008-01-21 Thread Karin Lagesen


Example data frame: 


a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
b = c(1:6)
example = data.frame("Title" = a, "Vals" = b)


> example
  Title Vals
1 Alpha    1
2  Beta    2
3 Gamma    3
4 Beeta    4
5 Alpha    5
6  beta    6
> 

I would like to be able to get a new data frame from this data frame
containing only rows that match a certain string. In this case it
could for instance be the string "eta". I have tried various ways of
using agrep and grep, but so far I have not found anything that
worked.

Thank you in advance for your help!

Karin
-- 
Karin Lagesen, PhD student
[EMAIL PROTECTED]
http://folk.uio.no/karinlag
