Re: [R] Fwd: rarefy a matrix of counts

Brian Frappier Wed, 11 Oct 2006 11:32:19 -0700

I tried all of the approaches below.

The problem with:


> x <- data.frame(matrix(NA,100,3))
> for (i in 2:ncol(DF)) x[,i-1] <- sample(rep(DF[,1], DF[,i]),100)
> if you want result in data frame
> or
> x<-vector("list", 3)
> for (i in 2:ncol(DF)) x[[i-1]] <- sample(rep(DF[,1], DF[,i]),100)

is that this code still samples the rows, not the elements: it returns 100
or 300 in the matrix cells instead of "red", or a matrix of counts by color
(object type) like:
       x1   x2   x3
red    32    5   60
gr     68   95   40
sum   100  100  100

It looks like Tony is right: sampling without replacement requires listing
all of the elements to be sampled.  But, the code Petr provided

x1 <- sample(c(rep("red",400),rep("green", 100),rep("black",300)),100)

did give me a clue of how to quickly make such a list using the 'rep'
command.  I will for-loop a rep statement over my original matrix to create
a list of elements for each sample.
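
For the archive, here is a minimal sketch of that idea, not tested code from
anyone in this thread. It assumes the counts sit in a numeric matrix with the
object types as row names; the names counts and rarefy_column are made up for
this example, and the numbers are the candy counts quoted further down. Each
column is expanded with rep() into one entry per object, 100 entries are
drawn without replacement, and the draw is tabulated back into counts by
type.

```r
# Counts of each candy colour (rows) in each sample (columns).
# matrix() fills column by column, so each line below is one sample.
counts <- matrix(c( 400, 100,  300,
                    300,   0, 1000,
                   2500, 200,  500),
                 nrow = 3,
                 dimnames = list(c("red", "green", "black"),
                                 c("sample1", "sample2", "sample3")))

# Hypothetical helper: rarefy one column of counts down to `size` objects.
rarefy_column <- function(col_counts, size) {
  # One entry per object: the row index of its type, repeated by its count.
  population <- rep(seq_along(col_counts), times = col_counts)
  # Draw `size` positions without replacement. Indexing via sample.int()
  # avoids the special case where sample() is given a single number.
  drawn <- population[sample.int(length(population), size)]
  # Count the draws per type, keeping zeros for types that were not drawn.
  out <- tabulate(drawn, nbins = length(col_counts))
  names(out) <- names(col_counts)
  out
}

set.seed(1)  # only so that a rerun gives the same draw
rarefied <- apply(counts, 2, rarefy_column, size = 100)
rarefied
colSums(rarefied)  # every column sums to 100
```

To rarefy several times, the apply() line could be wrapped in replicate().
The enumerated population is rebuilt on every call, which should be cheap
for about 80 samples of roughly 500 counts each.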

Thanks Petr and Tony for your help!

On 10/11/06, Tony Plate <[EMAIL PROTECTED]> wrote:
>
> Here's a way using apply(), and the prob= argument of sample():
>
> > df <- data.frame(sample1=c(red=400,green=100,black=300),
> sample2=c(300,0,1000), sample3=c(2500,200,500))
> > df
>        sample1 sample2 sample3
> red       400     300    2500
> green     100       0     200
> black     300    1000     500
> > set.seed(1)
> > apply(df, 2, function(counts) sample(seq(along=counts), rep=T,
> size=7, prob=counts))
>       sample1 sample2 sample3
> [1,]       1       3       1
> [2,]       1       3       1
> [3,]       3       3       1
> [4,]       2       3       2
> [5,]       1       3       1
> [6,]       2       3       1
> [7,]       2       3       3
> >
>
> Note that this does sampling WITH replacement.
> AFAIK, sampling without replacement requires enumerating the entire
> population to be sampled from.  I.e., you cannot do
> > sample(1:3, prob=1:3, rep=F, size=4)
> instead of
> > sample(c(1,2,2,3,3,3), rep=F, size=4)
>
> -- Tony Plate
>
> From reading ?sample, I was a little unclear on whether sampling
> without replacement could work
>
> Petr Pikal wrote:
> > Hi
> >
> > a little bit different story. But
> >
> > x1 <- sample(c(rep("red",400),rep("green", 100),
> > rep("black",300)),100)
> >
> > is maybe close. With data frame (if it is not big)
> >
> >
> >>DF
> >
> >   color sample1 sample2 sample3
> > 1   red     400     300    2500
> > 2 green     100       0     200
> > 3 black     300    1000     500
> >
> > x <- data.frame(matrix(NA,100,3))
> > for (i in 2:ncol(DF)) x[,i-1] <- sample(rep(DF[,1], DF[,i]),100)
> > if you want result in data frame
> > or
> > x<-vector("list", 3)
> > for (i in 2:ncol(DF)) x[[i-1]] <- sample(rep(DF[,1], DF[,i]),100)
> >
> > if you want it in list. Maybe somebody is clever enough to discard
> > for loop but you said you have 80 columns which shall be no problem.
> >
> > HTH
> > Petr
> >
> >
> >
> >
> >
> >
> >
> > On 11 Oct 2006 at 10:11, Brian Frappier wrote:
> >
> > Date sent:            Wed, 11 Oct 2006 10:11:33 -0400
> > From:                 "Brian Frappier" <[EMAIL PROTECTED]>
> > To:                   "Petr Pikal" <[EMAIL PROTECTED]>
> > Subject:              Fwd: [R] rarefy a matrix of counts
> >
> >
> >>---------- Forwarded message ----------
> >>From: Brian Frappier <[EMAIL PROTECTED]>
> >>Date: Oct 11, 2006 10:10 AM
> >>Subject: Re: [R] rarefy a matrix of counts
> >>To: [email protected]
> >>
> >>Hi Petr,
> >>
> >>Thanks for your response.  I have data that looks like the following:
> >>
> >>               sample 1   sample 2   sample 3  ....
> >>red candy         400        300       2500
> >>green candy       100          0        200
> >>black candy       300       1000        500
> >>
> >>I don't want to randomly select either the samples (columns) or the
> >>"candy" types (rows), which sample, as you state, would allow me to do.
> >>Instead, I want to randomly sample 100 candies from each sample and
> >>retain info on their associated type.  I could make a list of all the
> >>candies in each sample:
> >>
> >>sample 1
> >>red
> >>red
> >>red
> >>red
> >>green
> >>green
> >>black
> >>red
> >>black
> >>...
> >>
> >>and then randomly sample those rows.  Repeat for each sample.  But, I
> >>am not sure how to do that without a lot of loops, and am wondering if
> >>there is an easier way in R.  Thanks!  I should have laid this out in
> >>the first email...sorry.
> >>
> >>
> >>On 10/11/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> >>
> >>>Hi
> >>>
> >>>I am not experienced in Matlab and from your explanation I do not
> >>>understand what exactly you want. It seems that you want to randomly
> >>>choose a sample of 100 rows from your matrix, which can be achieved by
> >>>sample.
> >>>
> >>>DF<-data.frame(rnorm(100), 1:100, 101:200, 201:300)
> >>>DF[sample(1:100, 10),]
> >>>
> >>>If you want to do this several times, you need to save your result
> >>>and then it depends on what you want to do next. One suitable form
> >>>is a list of matrices, another is an array, and you can use a for loop
> >>>to fill it.
> >>>
> >>>HTH
> >>>Petr
> >>>
> >>>
> >>>On 10 Oct 2006 at 17:40, Brian Frappier wrote:
> >>>
> >>>Date sent:              Tue, 10 Oct 2006 17:40:47 -0400
> >>>From:                   "Brian Frappier" <[EMAIL PROTECTED]>
> >>>To:                     [email protected]
> >>>Subject:                [R] rarefy a matrix of counts
> >>>
> >>>
> >>>>Hi all,
> >>>>
> >>>>I have a matrix of counts for objects (rows) by samples (columns).
> >>>> I aimed for about 500 counts in each sample (I have about 80
> >>>>samples) and would now like to rarefy these down to 100 counts in
> >>>>each sample using simple random sampling without replacement.  I
> >>>>plan on rarefying several times for each sample.  I could do the
> >>>>tedious looping task of making a list of all objects (with its
> >>>>associated identifier) in each sample and then use the wonderful
> >>>>"sampling" package to select a sub-sample of 100 for each sample
> >>>>and thereby get a logical vector of inclusions.  I would then
> >>>>regroup the resulting logical vector into a vector of counts by
> >>>>object, rinse and repeat several times for each sample.
> >>>>
> >>>>Alternately, using the same list, I could create a random index of
> >>>>integers between 1 and the number of objects for a sample (without
> >>>>repeats) and then select those objects from the list.  Again,
> >>>>rinse and repeat several times for each sample.
> >>>>
> >>>>Is there a way to directly rarefy a matrix of counts without
> >>>>having to create a list of objects first?  I am trying to switch
> >>>>to R from Matlab and am trying to pick up good programming habits
> >>>>from the start.
> >>>>
> >>>>Much appreciation!
> >>>>
> >>>> [[alternative HTML version deleted]]
> >>>>
> >>>>______________________________________________
> >>>>[email protected] mailing list
> >>>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>PLEASE do read the posting guide
> >>>>http://www.R-project.org/posting-guide.html and provide commented,
> >>>>minimal, self-contained, reproducible code.
> >>>
> >>>Petr Pikal
> >>>[EMAIL PROTECTED]
> >>>
> >>>
> >>
> >
> > Petr Pikal
> > [EMAIL PROTECTED]
> >
> >
>
>
