Re: [R-sig-phylo] Using bind.tip and lapply on a multiPhylo object

2021-06-04 Thread Mike Collyer


Hello Everyone,

Just to stack onto Liam’s “fun with apply family of functions”, I often use 
lapply, which works well when the function applies only to the objects in a 
list, but it often breaks down (for reasons that are difficult to decipher) if 
the function is not predefined, is complex, has a few arguments to vary, or 
might require several steps.  As a way to deal with this kind of issue, I often 
define a new list for lapply, typically a list of indices, and avoid using the 
list I want to work on as the target.  Here is an attempt to show how that can 
be done (but I think Liam’s suggestion is actually better):

n <- length(tree)

# Loop over indices rather than over the trees themselves
newTree <- lapply(1:n, function(j){
  tree.j <- tree[[j]]
  bind.tip(tree.j, tip.label="Cercopithecus_albogularis",
  position=0.59, edge.length = 0.59,
  where=mrca(tree.j)["Cercopithecus_mitis","Cercopithecus_mitis"])
})
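
To see the index pattern on its own, here is a minimal sketch with a toy list 
standing in for the multiPhylo object.  The names objs and offsets are made up 
for illustration and are not part of the original problem; the point is only 
that lapply runs over the indices and each step pulls out element j explicitly.

```r
# Toy stand-in for a multiPhylo object: a named list of numeric vectors
# (objs and offsets are hypothetical names used only for this sketch)
objs <- list(a = c(3, 1, 2), b = c(10, 30), c = 5)

# A second, per-element input that has to stay in step with objs
offsets <- c(100, 200, 300)

n <- length(objs)

# Apply over the indices, not over objs itself
res <- lapply(seq_len(n), function(j) {
  x <- objs[[j]]          # pick out the j-th object explicitly
  sort(x) + offsets[j]    # several steps are fine; j can index other inputs
})

# lapply over indices drops the names, so restore them if needed
names(res) <- names(objs)
```

Because the function only ever sees j, any number of other objects can be 
indexed inside it without lapply having to know about them.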


Another way this could be done is to make multiple lists and use the Map 
function, which is basically a modification of the mapply example Liam used.  
(This example also uses Liam’s suggestion to use fastMRCA.)

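tip.label <- lapply(1:n, function(.) "Cercopithecus_albogularis")
# The new tip is not in the trees yet, so look up the existing tip,
# Cercopithecus_mitis, to find where to attach it
MRCA <- lapply(1:n, function(j) fastMRCA(tree[[j]],
"Cercopithecus_mitis", "Cercopithecus_mitis"))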

newTree <- Map(function(tr, tl, m) bind.tip(tr, tl, 
   position=0.59, edge.length = 0.59,  
   where = m),
   tree,
   tip.label,
   MRCA
  )
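
As a small illustration of how Map relates to mapply, here is a sketch with toy 
inputs (f, xs, labels, and ks are hypothetical names, not from the tree 
example).  Map is mapply with SIMPLIFY = FALSE, and length-one arguments are 
recycled, so a constant such as a tip label would not strictly need its own 
list.

```r
# Hypothetical function of three arguments that vary together
f <- function(x, label, k) paste0(label, ":", sum(x) * k)

# Parallel lists, one element per "tree"
xs     <- list(1:3, 4:6)
labels <- list("first", "second")
ks     <- list(2, 10)

# Map walks the lists in parallel and always returns a list
via_Map <- Map(f, xs, labels, ks)

# The same call written with mapply
via_mapply <- mapply(f, xs, labels, ks, SIMPLIFY = FALSE)

# A length-one argument is recycled across all elements
recycled <- Map(f, xs, "same", ks)
```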

Finally, the do.call function can be helpful when repeating a function for 
which only one or a few of several arguments change over a list.  Building on 
the previous setup, one could do this:

bind.args <- list(tree = tree[[1]], tip.label = "Cercopithecus_albogularis",
  position=0.59, edge.length = 0.59, where = MRCA[[1]],
  interactive = FALSE)

newTree <- lapply(1:n, function(j){
  bind.args$tree <- tree[[j]]
  bind.args$where <- MRCA[[j]]
  do.call(bind.tip, bind.args)
})
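
The do.call pattern can be tried without any trees.  In this sketch (samples 
and base.args are made-up names, and quantile from the stats package stands in 
for bind.tip), the argument list is built once and only the element that 
changes is overwritten inside the function.  The assignment modifies a local 
copy, so the template list outside lapply is left as it was.

```r
# Toy data standing in for the list of trees
samples <- list(c(1, 2, NA, 4), c(10, 20, 30))

# Template argument list; x is a placeholder that gets replaced per element
base.args <- list(x = samples[[1]], probs = 0.5, na.rm = TRUE, names = FALSE)

meds <- lapply(seq_along(samples), function(j) {
  base.args$x <- samples[[j]]      # changes a local copy only
  do.call(quantile, base.args)     # same as quantile(x = ..., probs = 0.5, ...)
})
```

The fixed arguments are written once, which is handy when the call has many of 
them and only one or two change from element to element.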


I also suggest these without verification with real data.

Cheers!
Mike


> On Jun 2, 2021, at 9:11 PM, Liam J. Revell wrote:
> 
> Dear Russell et al.
> 
> Using a for loop is a great idea! Highly underrated in R, IMO. ;)
> 
> However, for future reference, the reason that your code didn't work with 
> lapply is because the list you're 'applying' over (tree) also appears among 
> the arguments!
> 
> If you want to use apply-family functions instead of a for loop (just, say, 
> for fun) then you have two basic options: you can write a custom function; or 
> you can use mapply.
> 
> Here's some (untested) code to do it.
> 
> ## first, using a custom function & lapply:
> foo<-function(tree) bind.tip(tree,
>   tip.label="Cercopithecus_albogularis",
>   position=0.59,edge.length=0.59,
>   where=getMRCA(tree,tip=c("Cercopithecus_mitis",
>   "Cercopithecus_mitis")))
> newtree<-lapply(tree,foo)
> class(newtree)<-"multiPhylo"
> 
> ## now, using mapply:
> newtree<-mapply(bind.tip,tree=tree,where=lapply(tree,getMRCA,
>   tip=c("Cercopithecus_mitis","Cercopithecus_mitis")),
>   MoreArgs=list(tip.label="Cercopithecus_albogularis",
>   position=0.59,edge.length=0.59),SIMPLIFY=FALSE)
> class(newtree)<-"multiPhylo"
> 
> (Code is not guaranteed! I don't have the data file, so I didn't actually 
> test it -- but something like this ought to work.)
> 
> Regardless, I recommend using ape::getMRCA (or phytools::fastMRCA) because 
> otherwise you're computing an N x N matrix in each iteration of your function 
> call just to get one node index.
> 
> Good luck! All the best, Liam
> 
> Liam J. Revell
> University of Massachusetts Boston [Assoc. Prof.]
> Universidad Católica de la Ssma Concepción [Adj. Res.]
> 
> Web & phytools:
> http://faculty.umb.edu/liam.revell/, http://www.phytools.org, 
> http://blog.phytools.org
> 
> Academic Director UMass Boston Chile Abroad:
> https://www.umb.edu/academics/caps/international/biology_chile
> 
> U.S. COVID-19 explorer web application:
> https://covid19-explorer.org/
> 
> On 6/2/2021 8:18 PM, Nathan Upham wrote:
>> Hi Russell:
>> Glad to hear you’re using the VertLife mammal trees — they are built on a 
>> taxonomy of 5,911 species of which only 4,098 are sampled for DNA, so there 
>> is already a ~30% chunk that is placed using taxonomic constraints and 
>> birth-death branch lengths as sampled during the estimation of 28 Bayesian 
>> patch clades.
>> Adding additional species described since the 2015 cutoff of that VertLife 
>> taxonomy makes sense (e.g., up to ~6,500 species on mammaldiversity.org).  
>> However, keep in mind that they will not have birth-death estimated branch 
>> lengths, but rather will more likely be added as a polytomy to a given clade 
>> and then randomly resolved.
>> Given the sample code you provided, the key thing you’ll want to do is run a 
>> *loop* rather than using lapply, so that you can specify a given tree each 
>> time, e.g.:
>> newtrees<-vector("list",length(trees))
>> for(j in 

[R-sig-phylo] Phylogenetically aligned component analysis

2020-10-27 Thread Mike Collyer
Dear Colleagues,

We wish to alert you to a new article, introducing phylogenetically aligned 
component analysis, currently in early release form:

Collyer, M.L. and D.C. Adams. 2020 (in press). Phylogenetically aligned 
component analysis.  Methods in Ecology and Evolution.

If you do not have access to MEE, we have a link to the accepted article and 
supporting information here:
https://www.researchgate.net/publication/344900621_Phylogenetically_Aligned_Component_Analysis

Phylogenetically aligned component analysis (PACA) is an ordination method 
similar to phylogenetic PCA, but rather than finding eigenvectors that are 
evolutionarily independent, it finds vectors that are most associated with 
phylogenetic signal.  PACA provides a tool for visualizing phylogenetic signal 
in multivariate data and can assist in discerning between weak phylogenetic 
signal and strong phylogenetic signal concentrated in only a portion of the 
data dimensions.  In conjunction with PCA, and phylogenetic PCA, it can assist 
in isolating the phylogenetic signal in multivariate data that might be 
obscured by other signals (e.g., allometric, ecological).

We make PACA available to GM users in the RRPP and geomorph R packages, with 
RRPP::ordinate and geomorph::gm.prcomp functions.  These functions allow users 
to align data to either principal or phylogenetically aligned vectors, project 
ancestral states and phylogenetic tree edges into a plot, and evaluate the 
amount of covariance between data and phylogeny, by vector.  Additionally, the 
physignal function in geomorph provides $PACA output, along with the amount of 
cumulative phylogenetic signal, by vector, which can inform if phylogenetic 
signal is especially strong in certain data dimensions.

We recommend installing the latest versions of RRPP and geomorph via Github; 
i.e.,

devtools::install_github("mlcollyer/RRPP", build_vignettes = TRUE)
devtools::install_github("geomorphR/geomorph", ref = "Stable",
  build_vignettes = TRUE)

Happy computing!

Mike Collyer and Dean Adams

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/