Thank you, Liam.

It certainly is a lot of computational time. I'm only interested in a
subset of species, say close to 700 species in a regional species pool. I
did the computations using fastDist, and it took almost 7 hours to complete
on a Windows machine with 16 GB of RAM. The function was really useful for
my purpose. Another question that came up: if I prune the larger tree to my
species list and apply ape::cophenetic, will the distances remain the same,
or will the pruning change the tree's edge lengths and consequently the
patristic distances? In any case, I see that fastDist did the work I wanted.
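
The pruning question can be checked on a small simulated tree. The sketch below assumes the ape package is installed; the 50-tip random tree and the 10-tip subset are made up for illustration and stand in for the large phylogeny and the regional species pool. It compares the distances among the retained tips before and after pruning with keep.tip.

```r
library(ape)

set.seed(1)                                # reproducible toy example
full_tree <- rtree(50)                     # random tree with branch lengths
keep <- sample(full_tree$tip.label, 10)    # hypothetical species subset

# Distances among the chosen tips, read from the full tree's matrix
d_full <- cophenetic(full_tree)[keep, keep]

# Distances after pruning the tree down to the chosen tips
pruned <- keep.tip(full_tree, keep)
d_pruned <- cophenetic(pruned)[keep, keep]

# TRUE if pruning left the pairwise distances unchanged
all.equal(d_full, d_pruned)
```

When tips are dropped, ape merges the edges on either side of any node left with a single descendant and adds their lengths, so the path length between two retained tips should not change. If the same holds on the real tree, pruning to the species list first and calling cophenetic once on the pruned tree would return the whole matrix in one step.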

Best regards,

Bruno Garcia Luize

PhD candidate Ecology and Biodiversity – São Paulo State University (Unesp)



On Mon, Feb 4, 2019 at 16:53, Liam Revell <liam.rev...@umb.edu>
wrote:

> Dear Bruno.
>
>  > To estimate divergence time between pairs of species, I'm using
>  > the phylogeny in Smith & Brown 2018, which is rooted and contains
>  > 353 185 seed plant species along with 85 679 nodes; branch lengths
>  > are dated. I'm applying the R function phytools::fastDist to a list
>  > of sampled species to obtain a matrix of patristic distances.
>
> To get this straight, do you want to compute pairwise distances from a
> tree containing over 353 thousand taxa? One problem that immediately
> arises is the memory that would be required to store such a matrix. If
> we assume that each element of the matrix occupies 8 bytes of memory,
> and that we are economical and record only the upper or lower triangle
> of the matrix, we would still need about 500 GB to store the matrix.
>
> Maybe it would be sufficient to write each distance to file, so you
> don't have to worry as much about memory (just enough disk space to
> store the file). In that case, you could use fastDist to compute each
> distance, but this would take a while as there are 62,369,645,520
> non-trivial distances between taxa (that is, excluding the distance of
> each taxon to itself and remembering that Dij=Dji). fastDist can indeed
> calculate distances between tips on such a large tree, but (on my
> computer) it takes about 0.5s per tip. At that rate it would take about
> 10 years to compute all these distances.
>
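
The figures in the two paragraphs above can be reproduced with a few lines of base R. This is only a back-of-the-envelope sketch: the per-distance timing is passed in as a parameter (the name years_needed is made up here), because it depends on the machine and on whether a quoted timing is read as seconds per distance or per tip.

```r
n_tips <- 353185                        # taxa in the tree, as stated above
n_pairs <- n_tips * (n_tips - 1) / 2    # distinct pairs, excluding i == j
bytes_per_value <- 8                    # one double-precision number

# Storage for one triangle of the distance matrix, in gigabytes
storage_gb <- n_pairs * bytes_per_value / 1e9

# Total run time in years for a given cost per distance (in seconds);
# replace the argument with a timing measured on your own machine
years_needed <- function(sec_per_distance) {
  n_pairs * sec_per_distance / (60 * 60 * 24 * 365.25)
}

format(n_pairs, big.mark = ",")
round(storage_gb)
years_needed(0.5)
years_needed(0.005)
```

The pair count matches the 62,369,645,520 quoted above, and the storage comes to roughly 499 GB. The run time is very sensitive to the per-distance cost: taken as 0.5 s per pairwise distance the total is nearer a thousand years, while a total of about ten years corresponds to roughly 0.005 s per distance.
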
> Perhaps you actually only need to compute the distances between a subset
> of the taxa on your tree, like a few hundred up to a couple of thousand.
> If that is the case, you should be able to use fastDist; however, for
> cases in which all distances (not just one or a few) are required then
> cophenetic (which computes all distances between all tips in the tree)
> will be much faster than fastDist. fastDist is really only useful if one
> or a small number of distances are required in which case it can be used
> to compute these without calculating all patristic distances from the tree.
>
>  > As I see in Revell's blog, "the patristic distance between them is
>  > simply the sum of the heights above the root for species i and j minus
>  > two times the height above the root of the common ancestor of i & j".
>  > Is this the same as what Fourment and colleagues define as a patristic
>  > distance: "A patristic distance is the sum of the lengths of the
>  > branches that link two nodes in a tree"?
>
> Yes, these distances are the same. The patristic distance is the sum of
> the edge lengths that connect a pair of taxa, but this value can be
> computed by taking the sum of the total distance from the root to each
> taxon, and then subtracting two times the distance from the root to
> their common ancestor.
>
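
The equivalence of the two definitions can be checked numerically. The sketch below assumes the ape package; the random 20-tip tree and the pair "t3"/"t15" are arbitrary choices for illustration. It computes the distance once from heights above the root and once as the path length reported by cophenetic.

```r
library(ape)

set.seed(2)
tree <- rtree(20)                       # toy tree; tip labels "t1" ... "t20"
sp_i <- "t3"
sp_j <- "t15"                           # arbitrary pair for illustration

# Height above the root for every tip and internal node,
# indexed by node number (tips come first)
h <- node.depth.edgelength(tree)

i <- match(sp_i, tree$tip.label)        # node numbers of the two tips
j <- match(sp_j, tree$tip.label)
anc <- getMRCA(tree, c(sp_i, sp_j))     # node number of their common ancestor

d_heights <- h[i] + h[j] - 2 * h[anc]   # root-height formula
d_path <- cophenetic(tree)[sp_i, sp_j]  # sum of edge lengths along the path

all.equal(d_heights, d_path)
```

One consequence worth noting, assuming the dated tree is ultrametric: both tips then have the same height T, so the formula reduces to 2 * (T - h[anc]), which is twice the age of the common ancestor. Under that assumption the divergence time of a pair would be half its patristic distance.
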
>  > Furthermore, I'm wondering whether the results of phytools::fastDist
>  > are interchangeable with those of
>  > adephylo::distTips(method="patristic").
>
> I can't comment on that function, but the distances from fastDist are
> the same as in ape::cophenetic.
>
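
A quick way to confirm the agreement between the two functions on a toy case, assuming both ape and phytools are installed. The 30-tip random tree and the pair "t5"/"t12" are illustrative only.

```r
library(ape)
library(phytools)

set.seed(3)
tree <- rtree(30)                        # toy tree; tip labels "t1" ... "t30"

d_fast <- fastDist(tree, "t5", "t12")    # one pair, no full matrix computed
d_coph <- cophenetic(tree)["t5", "t12"]  # same pair, read from the full matrix

# as.numeric() drops any names so only the values are compared
all.equal(as.numeric(d_fast), as.numeric(d_coph))
```

The same pattern could be used to compare either function against a third implementation on a small tree before trusting it on a large one.
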
>  > Finally, is the patristic distance the right choice for my purpose, or
>  > should I use another phylogenetic distance?
>
> I don't know.
>
> All the best, Liam
>
> Liam J. Revell
> Associate Professor, University of Massachusetts Boston
> Profesor Asistente, Universidad Católica de la Ssma Concepción
> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>
> On 2/4/2019 12:41 PM, Bruno Garcia Luize wrote:
> > Dear Phylo-community,
> >
> > I would like to acknowledge this channel, which has been extremely
> > helpful for my education in phylogenetic research in a broad sense. I
> > can describe myself as an ecologist recently introduced to evolution.
> >
> > As an enthusiast, moved by curiosity and a desire to better understand
> > tropical tree ecology, I'm now trying to include phylogenetic
> > information to answer my research questions. Naturally, I'm stuck on a
> > lot of questions.
> >
> > I'm asking whether divergence time is correlated with the probability
> > of a species pair showing positive or negative co-occurrence. I expect
> > more divergent species pairs to show a greater probability of positive
> > co-occurrence, and closely related species pairs to show a greater
> > probability of negative co-occurrence.
> >
> > To estimate divergence time between pairs of species, I'm using the
> > phylogeny in Smith & Brown 2018 (ALLOTB,
> > https://github.com/FePhyFoFum/big_seed_plant_trees), which is rooted and
> > contains 353 185 seed plant species along with 85 679 nodes; branch
> > lengths are dated. I'm applying the R function phytools::fastDist to a
> > list of sampled species to obtain a matrix of patristic distances.
> >
> > I have doubts about the values that fastDist is returning. May I
> > consider those values as the divergence time between species?
> >
> > As I see in Revell's blog
> > (http://blog.phytools.org/2015/10/new-reasonably-fast-method-to-compute.html),
> > "the patristic distance between them is simply the sum of the heights
> > above the root for species i and j minus two times the height above the
> > root of the common ancestor of i & j". Is this the same as what Fourment
> > and colleagues (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1352388/)
> > define as a patristic distance: "A patristic distance is the sum of the
> > lengths of the branches that link two nodes in a tree"?
> >
> > Furthermore, I'm wondering whether the results of phytools::fastDist
> > are interchangeable with those of
> > adephylo::distTips(method="patristic"). Finally, is the patristic
> > distance the right choice for my purpose, or should I use another
> > phylogenetic distance?
> >
> > Thank you all in advance. With my best regards,
> >
> >
> >
> > Bruno Garcia Luize
> >
> > PhD candidate Ecology and Biodiversity – São Paulo State University (Unesp)
> >
> >
> > _______________________________________________
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >
>
