Dear Bruno.

 > To estimate species divergence time between pairs of species, I'm using
 > the phylogeny in Smith & Brown 2018 which is rooted and contains
 > 353 185 seed plant species along with 85 679 nodes, branch lengths
 > are dated. I'm applying the R function phytools::fastDist for a list
 > of sampled species to obtain a matrix of patristic distances.

To get this straight, do you want to compute pairwise distances from a 
tree containing over 353 thousand taxa? One problem that immediately 
arises is the memory that would be required to store such a matrix. If 
we assume that each element of the matrix occupies 8 bytes of memory, 
and that we are economical and store only the upper or lower triangle 
of the matrix, we would still need about 500 GB to store it.
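
As a rough check of that figure, here is a minimal sketch of the arithmetic in base R. It assumes 8-byte doubles and storage of one triangle only, and the variable names are just for illustration.

```r
# Number of tips reported for the Smith & Brown (2018) tree.
n_tips <- 353185

# Distinct off-diagonal pairs: n * (n - 1) / 2.
# These are doubles, so there is no integer overflow.
n_pairs <- n_tips * (n_tips - 1) / 2

# Memory for one triangle of the matrix at 8 bytes per distance.
bytes_needed <- n_pairs * 8
gigabytes_needed <- bytes_needed / 1e9

format(n_pairs, big.mark = ",", scientific = FALSE)
round(gigabytes_needed)
```

The full square matrix, which is what cophenetic returns, would need roughly twice as much.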

Maybe it would be sufficient to write each distance to file, so you 
don't have to worry as much about memory (just enough disk space to 
store the file). In that case, you could use fastDist to compute each 
distance, but this would take a while as there are 62,369,645,520 
non-trivial distances between taxa (that is, excluding the distance of 
each taxon to itself and remembering that Dij=Dji). fastDist can indeed 
calculate distances between tips on such a large tree, but (on my 
computer) it takes about 0.5s per tip. At that rate it would take about 
10 years to compute all these distances.
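
A minimal sketch of the write-to-file idea, assuming ape and phytools are installed. The small random tree, the output path, and the helper years_for_all_pairs are hypothetical stand-ins, not part of either package. The total run time scales linearly with the cost per distance, so it is worth timing a handful of fastDist calls on the real tree before committing to the full job.

```r
library(ape)       # rtree() for a toy tree
library(phytools)  # fastDist()

set.seed(1)
tree <- rtree(20)                       # small random stand-in, NOT the real tree
tips <- tree$tip.label
out_file <- tempfile(fileext = ".csv")  # hypothetical output path

# Stream each distance to disk instead of holding a matrix in memory.
con <- file(out_file, open = "w")
writeLines("sp1,sp2,distance", con)
elapsed <- system.time({
  for (i in seq_len(length(tips) - 1)) {
    for (j in (i + 1):length(tips)) {
      d <- fastDist(tree, tips[i], tips[j])
      writeLines(paste(tips[i], tips[j], d, sep = ","), con)
    }
  }
})[["elapsed"]]
close(con)

# Average cost per distance on this toy tree. The cost on a much larger
# tree will differ, so measure it there rather than extrapolating from here.
secs_per_distance <- elapsed / choose(length(tips), 2)

# Hypothetical helper: years needed for all pairs at a given cost per distance.
years_for_all_pairs <- function(secs_per_distance, n_tips) {
  n_pairs <- n_tips * (n_tips - 1) / 2
  secs_per_distance * n_pairs / (60 * 60 * 24 * 365.25)
}
```

Once you have timed fastDist on the real tree, years_for_all_pairs(your_timing, 353185) turns that measurement into a total.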

Perhaps you actually only need the distances among a subset of the taxa 
on your tree, say a few hundred up to a couple of thousand. If that is 
the case, you should be able to use fastDist. However, when all pairwise 
distances among the tips of a tree are required, cophenetic (which 
computes every tip-to-tip distance at once) will be much faster than 
calling fastDist for each pair. fastDist is really only useful when one 
or a small number of distances are needed, because it computes them 
without calculating all patristic distances from the tree.
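
One possible way to get the full matrix for a sampled subset is sketched below. Note that it rests on an assumption that is not stated above: that you first prune the tree to the sampled species with ape::keep.tip, which leaves the distances among the retained tips unchanged. The random tree and the sampled vector are hypothetical stand-ins.

```r
library(ape)       # rtree(), keep.tip(), cophenetic method for "phylo"
library(phytools)  # fastDist()

set.seed(2)
tree <- rtree(200)                     # stand-in for the full dated tree
sampled <- sample(tree$tip.label, 25)  # hypothetical list of sampled species

# Prune to the sampled species, then compute all pairwise distances at once.
pruned <- keep.tip(tree, sampled)
D <- cophenetic(pruned)

# Spot-check one pair against fastDist on the unpruned tree.
d_fast <- fastDist(tree, sampled[1], sampled[2])
all.equal(unname(D[sampled[1], sampled[2]]), unname(d_fast))
```

The rows and columns of D are labelled by species name, so individual distances can be looked up by name.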

 > As I see in Revell's blog "the patristic distance between them is
 > simply the sum of the heights above the root for species i and j minus
 > two times the height above the root of the common ancestor of i & j",
 > is this the same that Fourment and colleagues define as a patristic
 > distance: "A patristic distance is the sum of the lengths of the
 > branches that link two nodes in a tree"?

Yes, these distances are the same. The patristic distance is the sum of 
the edge lengths that connect a pair of taxa, but this value can be 
computed by taking the sum of the total distance from the root to each 
taxon, and then subtracting two times the distance from the root to 
their common ancestor.
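
If you want to convince yourself of this, the two definitions can be compared numerically. This is only a sketch on a small random tree, assuming ape is installed. It uses node.depth.edgelength for the root-to-node distances and mrca for the common ancestors.

```r
library(ape)

set.seed(3)
tree <- rtree(10)  # small random rooted tree with branch lengths
n <- Ntip(tree)

# Distance from the root to every node; tips are nodes 1..n.
heights <- node.depth.edgelength(tree)

# Node number of the most recent common ancestor for each pair of tips.
anc <- mrca(tree)

# Height-based definition: h_i + h_j - 2 * h_mrca(i, j).
D_heights <- outer(heights[1:n], heights[1:n], "+") -
  2 * matrix(heights[anc], n, n)

# Path-based definition: sum of edge lengths connecting each pair of tips.
D_paths <- cophenetic(tree)

all.equal(D_heights, D_paths, check.attributes = FALSE)
```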

 > Furthermore, I'm wondering if the results of the phytools::fastDist is
 > interchangeable with the adephylo::distTips(method="patristic")?

I can't comment on that function, but the distances from fastDist are 
the same as those from cophenetic applied to a "phylo" object (the 
method is supplied by ape).

 > Finally, is the patristic distance the right choice for my purpose or
 > should I use another phylogenetic distance?

I don't know.

All the best, Liam

Liam J. Revell
Associate Professor, University of Massachusetts Boston
Profesor Asistente, Universidad Católica de la Ssma Concepción
web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org

On 2/4/2019 12:41 PM, Bruno Garcia Luize wrote:
> Dear Phylo-community,
> 
> I would like to acknowledge this channel, which is extremely helpful for my
> education regarding phylogenetic research in a broad sense. I can describe
> myself as an ecologist recently introduced to evolution.
> 
> As an enthusiast, moved by my curiosity and desire to better understand
> tropical tree ecology, I'm now trying to include phylogenetic information
> to answer my research questions. Obviously, I'm stuck with a lot of
> doubts.
> 
> I'm asking if the divergence time is correlated with the probability of
> positive or negative co-occurrence of a species pair. I expect species
> pairs that are relatively more divergent to show a greater probability of
> positive co-occurrence, while closely related species pairs show a greater
> probability of negative co-occurrence.
> 
> To estimate species divergence time between pairs of species, I'm using the
> phylogeny in Smith & Brown 2018 (ALLOTB
> https://github.com/FePhyFoFum/big_seed_plant_trees) which is rooted and
> contains 353 185 seed plant species along with 85 679 nodes, branch lengths
> are dated. I'm applying the R function phytools::fastDist for a list of
> sampled species to obtain a matrix of patristic distances.
> 
> I have doubts regarding the values that the function fastDist is returning.
> May I consider those values as the divergence time between species?
> 
> As I see in Revell's blog (
> http://blog.phytools.org/2015/10/new-reasonably-fast-method-to-compute.html)
> "the patristic distance between them is simply the sum of the heights above
> the root for species i and j minus two times the height above the root of
> the common ancestor of i & j", is this the same that Fourment and
> colleagues (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1352388/) define
> as a patristic distance: "A patristic distance is the sum of the lengths of
> the branches that link two nodes in a tree"?
> 
> Furthermore, I'm wondering if the results of phytools::fastDist are
> interchangeable with adephylo::distTips(method="patristic")? Finally,
> is the patristic distance the right choice for my purpose or should I use
> another phylogenetic distance?
> 
> I would like to thank you all in advance. With my best regards.
> 
> 
> 
> Bruno Garcia Luize
> 
> PhD candidate Ecology and Biodiversity – São Paulo State University (Unesp)
> 
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
> 