Andy, Jane,

If speed is an issue, or you are working with larger problems than the
example Andy used, we can use other tools in R to get the same answer
as Andy's spp.cooc() function much more efficiently, using matrix
multiplication.

Here's Andy's example and my version with some timings:

## Set a random seed
set.seed(123)

## dummy data
sppXsite <- matrix(rpois(15,0.5),nrow=3)
colnames(sppXsite) <- paste("spp",1:5,sep="")
rownames(sppXsite) <- paste("site",1:3,sep="")
sppXsite    # here's what it looks like

# now make a function to compute the co-occurrence matrix
spp.cooc <- function(matrx) {
    # first we make a list of all the sites where each spp is found
    site.list <- apply(matrx,2,function(x) which(x > 0))
    # then we see which spp are found at the same sites
    sapply(site.list,function(x1)
            {
                sapply(site.list,function(x2) 1*any(x2 %in% x1))
            })
    # the result is returned in a symmetrical matrix of dimension
    # equal to the number of spp
}

## And my version
spp.cooc2 <- function(mat) {
    ncol <- NCOL(mat)
    res <- matrix(as.numeric((t(mat) %*% mat) > 0), ncol = ncol)
    rownames(res) <- colnames(res) <- colnames(mat)
    return(res)
}

all.equal(spp.cooc(sppXsite), spp.cooc2(sppXsite)) ## TRUE!
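
Why the matrix product gives the same answer: entry [i, j] of
t(mat) %*% mat is the sum over sites of (abundance of spp i) *
(abundance of spp j), so for non-negative counts it is greater than zero
exactly when the two species share at least one site. Here is a minimal
sketch of that idea on a made-up matrix; `toy`, `shared`, `cooc` and
`pa` are names invented for this illustration, and crossprod() is used
as a base R equivalent of t(mat) %*% mat. It assumes no negative
entries, since negative values could cancel in the sums.

```r
## Hypothetical 3-site x 3-species abundance matrix (illustration only)
toy <- matrix(c(1, 2, 0,
                0, 1, 0,
                0, 0, 3),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("site", 1:3), paste0("spp", 1:3)))

## crossprod(toy) is t(toy) %*% toy: sums of products over sites
shared <- crossprod(toy)

## Positive entry <=> the pair of species occurs together somewhere;
## multiplying the logical matrix by 1 gives 0/1 and keeps the dimnames
cooc <- (shared > 0) * 1
cooc

## If the data are first reduced to presence/absence, the same product
## counts the number of sites each pair of species shares
pa <- (toy > 0) * 1
crossprod(pa)
```

In this toy example spp1 and spp2 both occur at site1, so they
co-occur, while spp3 is alone at site3 and co-occurs only with itself.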

## Timings
system.time(replicate(1000, spp.cooc(sppXsite)))
system.time(replicate(1000, spp.cooc2(sppXsite)))

> system.time(replicate(1000, spp.cooc(sppXsite)))
   user  system elapsed 
  0.728   0.004   0.733 
> system.time(replicate(1000, spp.cooc2(sppXsite)))
   user  system elapsed 
  0.067   0.000   0.068

## larger problem
set.seed(123)
sites <- 100
species <- 50
sppXsite.big <- matrix(rpois(sites * species, 0.5), nrow=sites)
colnames(sppXsite.big) <- paste("spp", seq_len(species), sep="")
rownames(sppXsite.big) <- paste("site", seq_len(sites), sep="")

## Timings
## Note the first line below takes ~40 seconds on my fast PC
system.time(replicate(1000, spp.cooc(sppXsite.big)))
system.time(replicate(1000, spp.cooc2(sppXsite.big)))

> ## Timings
> system.time(replicate(1000, spp.cooc(sppXsite.big)))
   user  system elapsed 
 41.049   0.043  41.244 
> system.time(replicate(1000, spp.cooc2(sppXsite.big)))
   user  system elapsed 
  0.423   0.037   0.468

If neither speed nor problem size is an issue, then either function
works well enough.

I don't think we have anything like this in vegan, but I'm happy to be
corrected if we do. If we don't, I'll chat with Jari and see about
adding it to the package.

All the best,

G

On Fri, 2010-08-27 at 12:41 -0400, Andy Rominger wrote:
> Hi Jane,
> 
> I think someone may have asked something similar on the r-sig-eco email list
> (which is a good resource in general:
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology)
> 
> I think the answer may have been that there's a function in the vegan
> package for R (http://cran.r-project.org/web/packages/vegan/index.html).
> 
> But it would be pretty simple to write something up in R.  Here's one way of
> doing it (if I'm correct in my interpretation of a co-occurrence matrix!).
> The actual function (called `spp.cooc') is really only 2 lines long--the
> code just looks longer from making up example data and adding in the
> comments.
> 
> Hope this might do the trick for you!  Note that in its current form you
> would have to give the function a matrix or data.frame of ONLY NUMBERS in
> which species are columns and sites are rows.  This could be changed by
> manipulating the MARGIN argument of the apply command below, i.e., site.list
> <- apply(matrx,1,...)
> 
> Hope this helps--
> Andy
> 
> 
> # make some example data
> sppXsite <- matrix(rpois(15,0.5),nrow=3)
> colnames(sppXsite) <- paste("spp",1:5,sep="")
> rownames(sppXsite) <- paste("site",1:3,sep="")
> sppXsite    # here's what it looks like
> 
> # now make a function to compute the co-occurrence matrix
> spp.cooc <- function(matrx) {
>     # first we make a list of all the sites where each spp is found
>     site.list <- apply(matrx,2,function(x) which(x > 0))
>     # then we see which spp are found at the same sites
>     sapply(site.list,function(x1)
>             {
>                 sapply(site.list,function(x2) 1*any(x2 %in% x1))
>             })
>     # the result is returned in a symmetrical matrix of dimension
>     # equal to the number of spp
> }
> 
> # here's how it works
> co.matrix <- spp.cooc(sppXsite)
> co.matrix
> 
> 
> 
> 
> On Fri, Aug 27, 2010 at 12:46 AM, Jane Shevtsov <jane....@gmail.com> wrote:
> 
> > Is there a fast way to make a species co-occurrence matrix given a
> > site-species matrix or lists of species found at each site? I'm
> > looking for a spreadsheet or database method (preferably OpenOffice)
> > or R function.
> >
> > Thanks,
> > Jane
> >
> > --
> > -------------
> > Jane Shevtsov
> > Ecology Ph.D. candidate, University of Georgia
> > co-founder, <www.worldbeyondborders.org>
> > Check out my blog, Perceiving Wholes <http://perceivingwholes.blogspot.com>
> >
> > "The whole person must have both the humility to nurture the
> > Earth and the pride to go to Mars." --Wyn Wachhorst, The Dream
> > of Spaceflight
> >

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
