Wadud, Zia wrote:
> Hi
> I have a panel dataset with a large number of groups and a differing number
> of observations for each group. I want to randomly select, say, 20% of
> the groups or 200 groups, along with all observations from the
> selected groups (with the corresponding data).
> I guess it is possible to generate a random sample from the group ids
> and then match that with the entire dataset to get the intended
> dataset, but it sounds cumbersome. Is there an easier way to
> do this? I checked the package 'sampling' and the command 'sample', but they
> can't do exactly this.
> I was wondering if someone on this list would be able to share his/her
> knowledge?

  How about something like this?

df <- data.frame(GROUP = rep(1:5, c(2,3,4,2,2)), Y = runif(13))

# Sample Two of the Five Groups

subset(df, GROUP %in% with(df, sample(unique(GROUP), 2)))
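
  To take a fraction of the groups rather than a fixed count, the same idea
  can be sketched as below. This is only a minimal illustration: the toy
  data, the 20% fraction, the seed, and the names ids, n_keep, keep and
  panel_sample are assumptions made for the example, not part of the
  original question's data.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

# Toy panel: 10 groups with differing numbers of observations
df <- data.frame(GROUP = rep(1:10, times = c(2, 3, 4, 2, 2, 5, 1, 3, 2, 4)))
df$Y <- runif(nrow(df))

# One entry per group
ids <- unique(df$GROUP)

# Number of groups to keep: 20% of all groups, but at least one
n_keep <- max(1, round(0.20 * length(ids)))

# Sample positions rather than values, so that a single numeric id
# is never interpreted by sample() as the range 1:id
keep <- ids[sample.int(length(ids), n_keep)]

# All rows (with their data) belonging to the selected groups
panel_sample <- df[df$GROUP %in% keep, ]
```

  For a fixed count such as 200 groups, n_keep can simply be set to 200,
  provided the data contain at least that many groups.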

> Thanks in advance,
> Zia
> **********************************************************
> Zia Wadud
> PhD Student
> Centre for Transport Studies
> Department of Civil and Environmental Engineering
> Imperial College London
> London SW7 2AZ
> Tel +44 (0) 207 594 6055
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
