Jeff08 wrote:
Sample Data.Frame format

Name is Returns.nodup

            X       id ticker      date_ adjClose totret RankStk
427225 427225 00174410    AHS 2001-11-13    21.66    100    1235


"id" uniquely defines a row


What I am trying to do is filter out ids that have fewer than 1500 data
points (by date)

First, I used

total<-by(Returns.nodup, Returns.nodup$id,nrow)

which subsetted by ID and calculated the number of data points for each ID

Now I am trying to figure out a way to use this to filter out the original
data.frame (Returns.nodup)

I have tried using the following, but it is VERY slow:

z<-unlist(lapply(1:length(y), function(i) which(a$id==y[i]) ))
Returns.filtered<-Returns.nodup[z,]

Is there a faster way to do this?


Most likely, yes. But without a reproducible example, it's difficult to think about the problem. Can you please give us one?

If not, my guess is that you can cobble something together using ?table and ?"%in%".
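
For what it's worth, here is a minimal sketch of the table / %in% idea. It is only a guess at what you need: the toy data frame and the min.obs threshold are made up for illustration (your real cutoff would be 1500), and I am assuming you want to keep the ids with at least that many rows.

```r
# Toy stand-in for Returns.nodup (made-up values; only the id column matters).
Returns.nodup <- data.frame(
  id       = c("A", "A", "A", "B", "C", "C"),
  adjClose = c(21.66, 21.80, 22.10, 10.05, 5.50, 5.75),
  stringsAsFactors = FALSE
)

min.obs <- 2  # hypothetical threshold; the question uses 1500

# Count the rows for every id in a single pass.
counts <- table(Returns.nodup$id)

# The ids that have at least min.obs rows.
keep.ids <- names(counts)[counts >= min.obs]

# One vectorised membership test instead of one which() call per id.
Returns.filtered <- Returns.nodup[Returns.nodup$id %in% keep.ids, ]
```

The lapply/which version rescans the whole id column once per id, whereas %in% does a single lookup over the column, which is why I would expect this to be much faster on a large data frame. Row order is preserved because the subsetting uses a logical index.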

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
