Re: [R] rarefy a matrix of counts

Brian Frappier Fri, 13 Oct 2006 07:42:46 -0700

Thank you, Alex!  That's exactly what I was looking to do.  I'm going to
remove the loops and use your apply function approach.  Best regards and
much thanks,  brian


On 10/13/06, Alex Brown <[EMAIL PROTECTED]> wrote:
>
> I thought at first that you could use a weighted sample (the sample
> function), but you can't, since it doesn't take proper account of
> replacement if you try that.
>
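To illustrate the point about weighted sampling, here is a minimal sketch. The toy vector `x` reuses the small red/blue example from later in this message, and the variable names `weighted`, `exact` and `res` are only illustrative. The `prob` weights attach to the type names, not to the individual candies, so a weighted draw either runs with replacement or runs out of items after one draw per type.

```r
# Toy counts, same shape as the small example used further down
x <- c(red = 2, blue = 3)

# Weighted draw over the type names: this samples WITH replacement,
# so nothing stops it from returning more than 2 "red"
weighted <- sample(names(x), size = 5, replace = TRUE, prob = x)
table(weighted)

# Expanding to individuals and drawing without replacement can never
# exceed the available counts; drawing all 5 returns 2 red and 3 blue
exact <- sample(rep(names(x), x), size = 5)
table(exact)

# A weighted draw without replacement has only 2 items (the names) to
# draw from, so asking for 3 is an error
res <- try(sample(names(x), size = 3, replace = FALSE, prob = x), silent = TRUE)
inherits(res, "try-error")
```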
> You can use the list approach, but through the power of R, you don't
> need a lot of loops to do it...
>
> I can't speak for the efficiency of this approach in terms of CPU cycles.
>
> In short:
>
> apply(z2,2,function(x)sample(rep(names(x),x),100))
>
> In long:
>
> #let's load the data:
>
> z = scan(,"",sep="\n")
>                 sample.1         sample.2         sample.3
> red.candy       400                 300               2500
> green.candy    100                    0                  200
> black.candy     300                1000                500
>
> #and turn into a table
>
>   z2 = read.table(textConnection(z), header=TRUE, row.names=1)
>
> # let's create a function to expand a sample column into individuals:
>
> expand <- function(x) rep(names(x), x)
>
> # test it on a smaller set:
>
> ex <- expand( c( red = 2, blue = 3) )
>
> ex
> [1] "red"  "red"  "blue" "blue" "blue"
>
> # and sample 2 things from that:
>
> sample( ex, 2 )
>
> # combine the two
>
> samplex <- function( x, size ) sample(expand(x), size )
>
> samplex( c( red = 2, blue = 3), size = 2 )
>
> # ok, now we use the apply function to apply this to each column
>
> apply(z2, 2, samplex, size = 2 )
>
> # you wanted 100?
>
> apply(z2, 2, samplex, size = 100 )
>
> # all done.
>
> # You should note that if there are fewer than 100 (samplenumber)
> # candies in any given sample, this function will fail.
> # e.g.:
>
> apply(z2, 2, samplex, size = 2000 )
>
> Error in sample(length(x), size, replace, prob) :
>         cannot take a sample larger than the population when 'replace = FALSE'
>
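One way to guard against that error is sketched below, assuming that samples with too few candies should simply be left out (the thread does not say what should happen to them). The table `z2` is retyped inline so the block runs on its own, and `size` and `ok` are illustrative names.

```r
# Rebuild the example table from the thread so this block is self-contained
z2 <- data.frame(
  sample.1 = c(400, 100, 300),
  sample.2 = c(300, 0, 1000),
  sample.3 = c(2500, 200, 500),
  row.names = c("red.candy", "green.candy", "black.candy")
)

# Same idea as samplex above: expand counts to individuals, then sample
samplex <- function(x, size) sample(rep(names(x), x), size)

size <- 1000

# Flag the columns that hold at least `size` candies
ok <- colSums(z2) >= size
ok

# Sample only the columns that are large enough; drop = FALSE keeps the
# table two-dimensional even if a single column survives
res <- apply(z2[, ok, drop = FALSE], 2, samplex, size = size)
dim(res)
```

Here sample.1 holds only 800 candies, so it is skipped and the other two columns are sampled.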
> -Alex
>
> On 11 Oct 2006, at 15:10, Brian Frappier wrote:
>
> > Hi Petr,
> >
> > Thanks for your response.  I have data that looks like the following:
> >
> >                sample 1         sample 2         sample 3  ....
> > red candy        400                 300               2500
> > green candy    100                    0                  200
> > black candy     300                1000                500
> >
> > I don't want to randomly select either the samples (columns) or the
> > "candy" types (rows), which sample, as you state, would allow me to
> > do.  Instead, I want to randomly sample 100 candies from each sample
> > and retain info on their associated type.  I could make a list of all
> > the candies in each sample:
> >
> > sample 1
> > red
> > red
> > red
> > red
> > green
> > green
> > black
> > red
> > black
> > ...
> >
> > and then randomly sample those rows.  Repeat for each sample.  But I
> > am not sure how to do that without a lot of loops, and am wondering
> > if there is an easier way in R.  Thanks!  I should have laid this out
> > in the first email...sorry.
> >
> >
> > On 10/11/06, Petr Pikal <[EMAIL PROTECTED]> wrote:
> >>
> >> Hi
> >>
> >> I am not experienced in Matlab, and from your explanation I do not
> >> understand what exactly you want. It seems that you want to randomly
> >> choose a sample of 100 rows from your matrix, which can be achieved
> >> with sample.
> >>
> >> DF<-data.frame(rnorm(100), 1:100, 101:200, 201:300)
> >> DF[sample(1:100, 10),]
> >>
> >> If you want to do this several times, you need to save your result,
> >> and then it depends on what you want to do next. One suitable form is
> >> a list of matrices, another is an array, and you can use a for loop
> >> to fill it.
> >>
> >> HTH
> >> Petr
> >>
> >>
> >> On 10 Oct 2006 at 17:40, Brian Frappier wrote:
> >>
> >> Date sent:              Tue, 10 Oct 2006 17:40:47 -0400
> >> From:                   "Brian Frappier" <[EMAIL PROTECTED]>
> >> To:                     [email protected]
> >> Subject:                [R] rarefy a matrix of counts
> >>
> >>> Hi all,
> >>>
> >>> I have a matrix of counts for objects (rows) by samples (columns).  I
> >>> aimed for about 500 counts in each sample (I have about 80 samples)
> >>> and would now like to rarefy these down to 100 counts in each sample
> >>> using simple random sampling without replacement.  I plan on
> >>> rarefying several times for each sample.  I could do the tedious
> >>> looping task of making a list of all objects (with its associated
> >>> identifier) in each sample and then use the wonderful "sampling"
> >>> package to select a sub-sample of 100 for each sample and thereby get
> >>> a logical vector of inclusions.  I would then regroup the resulting
> >>> logical vector into a vector of counts by object, rinse and repeat
> >>> several times for each sample.
> >>>
> >>> Alternately, using the same list, I could create a random index of
> >>> integers between 1 and the number of objects for a sample (without
> >>> repeats) and then select those objects from the list.  Again, rinse
> >>> and repeat several times for each sample.
> >>>
> >>> Is there a way to directly rarefy a matrix of counts without having
> >>> to create a list of objects first?  I am trying to switch to R from
> >>> Matlab and am trying to pick up good programming habits from the
> >>> start.
> >>>
> >>> Much appreciation!
> >>>
> >>>  [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> [email protected] mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html and provide commented,
> >>> minimal, self-contained, reproducible code.
> >>
> >> Petr Pikal
> >> [EMAIL PROTECTED]
> >>
> >>
> >
>
>
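The original question asked for rarefied counts rather than a list of individuals, so one possible follow-up to the apply approach quoted above is to tabulate each subsample back into a count matrix. This is a sketch, not something posted in the thread: `rarefy_col` and `rarefy_counts` are hypothetical helper names, the matrix is the candy example retyped inline, and `replicate` is one way to handle the "several times" part.

```r
# The candy example as a count matrix (filled column by column)
counts <- matrix(
  c(400, 100, 300,
    300,   0, 1000,
    2500, 200, 500),
  nrow = 3,
  dimnames = list(c("red.candy", "green.candy", "black.candy"),
                  c("sample.1", "sample.2", "sample.3"))
)

# Rarefy one column: expand to individuals, draw `size` without replacement,
# then count per type (the factor levels keep types that end up with 0)
rarefy_col <- function(x, size) {
  drawn <- sample(rep(names(x), x), size)
  as.vector(table(factor(drawn, levels = names(x))))
}

# Apply that to every column and restore the row and column names
rarefy_counts <- function(m, size) {
  out <- apply(m, 2, rarefy_col, size = size)
  dimnames(out) <- dimnames(m)
  out
}

r1 <- rarefy_counts(counts, 100)
r1
colSums(r1)  # every column now sums to 100

# Repeat the rarefaction 5 times; the result is a list of matrices
reps <- replicate(5, rarefy_counts(counts, 100), simplify = FALSE)
length(reps)
```

As with the quoted version, this fails if any column holds fewer than `size` individuals.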
