Hi Frantz,

The attached code can run the phylogenetic PCA (computes the scores) in few 
seconds on a species tree with more than 5000 tips. Note however that the 
phylogenetic pca is not exactly "removing" the phylo signal. It is a rigid 
rotation of what you obtain with the conventional PCA. You may still have to 
use comparative methods on these axes

Best wishes,

Julien
________________________________
De : R-sig-phylo <r-sig-phylo-boun...@r-project.org> de la part de fran?ois 
RIGAL <frantz.ri...@hotmail.fr>
Envoyé : mercredi 23 mai 2018 10:25
À : r-sig-phylo@r-project.org
Objet : [R-sig-phylo] remove phylogenetic signal from large dataset

Dear colleagues

I am struggling to find a way to remove phylogenetic signal from quantitative 
traits for a very large dataset (more than 5000 species. I already tried to 
implement a pPCA (function phyl.pca in the phytools package) but the function 
ran for more than a week and I decide to abort it. I also checked for the 
phylogenetic eigenvector approach but again I am not sure it makes senses for a 
so large phylogeny. (Too much PCoA axes and problem of selection of those 
axes). Thereofore, I would to ask if someone knows about a method to remove 
phylogeneitic signal for such large dataset, or, simply if it make sense to 
even try to remove phylo signal from such big data.

Many thanks

Frantz
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
pca_score_phylo <- function(x, center=TRUE, tree=NULL){
  require(mvMORPH)
  n <- nrow(x)
  p <- ncol(x)
  
  if(center==TRUE){
    # transform the tree
    if(!is.null(tree)){
      decomp <- pruning(tree)
      Cinv <- decomp$sqrtMat
      one <- matrix(1,ncol=1,nrow=n)
      oneD <- Cinv%*%one
      xmod <- Cinv%*%x
      a <- pseudoinverse(oneD)%*%xmod
      
    }else{
      one <- matrix(1,ncol=1,nrow=n)
      a <- pseudoinverse(one)%*%x
    }
    
    V <- x - one%*%a
  }else{
    V <- x
    a <- NULL
  }
  
    if(!is.null(tree)){
      covariance <- crossprod(xmod - oneD%*%a)/(n-1)
    }else{
      covariance <- crossprod(V)/(n-1)
    }
    
    eig <- eigen(covariance)
  
  S <- V%*%eig$vectors
  results <- list(score=S, values=eig$values, vectors=eig$vectors, 
rank=qr(covariance)$rank, mean=a)
  return(results)
}
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Reply via email to