Thank you, Liam.

It certainly takes a lot of computational time. I'm only interested in a subset of species, say close to 700 species in a regional species pool. I ran the computation with fastDist and it took almost 7 hours to complete on a Windows machine with 16 GB of RAM. The function was really useful for my purpose.

Another question came up: if I prune the larger tree to my species list and then apply ape::cophenetic, will the distances remain the same, or will pruning change the tree's edge lengths and consequently the patristic distances? In any case, I see that fastDist did the job I wanted.
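The pruning question can be checked empirically on a small tree. The following is only a minimal sketch, assuming the ape package is installed; the 50-tip random tree and the `keep` vector are hypothetical stand-ins for the Smith & Brown tree and the regional species list, and `cophenetic()` here dispatches to ape's method for `phylo` objects.

```r
library(ape)

set.seed(1)                               # reproducible toy example
full_tree <- rtree(50)                    # random 50-tip tree with branch lengths
keep <- sample(full_tree$tip.label, 10)   # hypothetical species list

# Distances among the kept tips, read from the full-tree matrix
d_full <- cophenetic(full_tree)[keep, keep]

# Prune to the species list first, then compute distances on the small tree
pruned_tree <- keep.tip(full_tree, keep)
d_pruned <- cophenetic(pruned_tree)[keep, keep]

# TRUE if pruning left the pairwise distances unchanged (up to rounding)
same <- isTRUE(all.equal(d_full, d_pruned))
print(same)
```

With ape's default settings the comparison should come out TRUE: when tips are dropped, the edges on either side of a node left with a single descendant are merged and their lengths summed, so the path length between any two remaining tips is unchanged. On a very large tree, pruning first is the only one of the two routes that is practical, since the full matrix would not fit in memory.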
Best regards,

Bruno Garcia Luize
PhD candidate, Ecology and Biodiversity – São Paulo State University (Unesp)

On Mon, Feb 4, 2019 at 16:53, Liam Revell <liam.rev...@umb.edu> wrote:

> Dear Bruno.
>
> > To estimate species divergence time between pairs of species, I'm using
> > the phylogeny in Smith & Brown 2018, which is rooted and contains
> > 353 185 seed plant species along with 85 679 nodes; branch lengths
> > are dated. I'm applying the R function phytools::fastDist to a list
> > of sampled species to obtain a matrix of patristic distances.
>
> To get this straight, do you want to compute pairwise distances from a
> tree containing over 353 thousand taxa? One problem that immediately
> arises is the memory that would be required to store such a matrix. If
> we assume that each element of the matrix occupies 8 bytes of memory,
> and that we are economical & record only the upper or lower triangle of
> the matrix, we would still need about 500 GB to store the matrix.
>
> Maybe it would be sufficient to write each distance to file, so you
> don't have to worry as much about memory (just enough disk space to
> store the file). In that case, you could use fastDist to compute each
> distance, but this would take a while as there are 62,369,645,520
> non-trivial distances between taxa (that is, excluding the distance of
> each taxon to itself and remembering that Dij=Dji). fastDist can indeed
> calculate distances between tips on such a large tree, but (on my
> computer) it takes about 0.5s per tip. At that rate it would take about
> 10 years to compute all these distances.
>
> Perhaps you actually only need to compute the distances between a subset
> of the taxa on your tree, like a few hundred up to a couple of thousand.
> If that is the case, you should be able to use fastDist; however, for
> cases in which all distances (not just one or a few) are required,
> cophenetic (which computes all distances between all tips in the tree)
> will be much faster than fastDist. fastDist is really only useful if one
> or a small number of distances are required, in which case it can be used
> to compute these without calculating all patristic distances from the tree.
>
> > As I see in Revell's blog, "the patristic distance between them is
> > simply the sum of the heights above the root for species i and j minus
> > two times the height above the root of the common ancestor of i & j".
> > Is this the same as what Fourment and colleagues define as a patristic
> > distance: "A patristic distance is the sum of the lengths of the
> > branches that link two nodes in a tree"?
>
> Yes, these distances are the same. The patristic distance is the sum of
> the edge lengths that connect a pair of taxa, but this value can be
> computed by taking the sum of the total distance from the root to each
> taxon, and then subtracting two times the distance from the root to
> their common ancestor.
>
> > Furthermore, I'm wondering if the results of phytools::fastDist are
> > interchangeable with those of adephylo::distTips(method="patristic")?
>
> I can't comment on that function, but the distances from fastDist are
> the same as in ape::cophenetic.
>
> > Finally, is the patristic distance the right choice for my purpose or
> > should I use another phylogenetic distance?
>
> I don't know.
>
> All the best, Liam
>
> Liam J. Revell
> Associate Professor, University of Massachusetts Boston
> Profesor Asistente, Universidad Católica de la Ssma Concepción
> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>
> On 2/4/2019 12:41 PM, Bruno Garcia Luize wrote:
> > Dear Phylo-community,
> >
> > I would like to acknowledge this channel, which is extremely helpful for
> > my education regarding phylogenetic research in a broad sense. I can
> > describe myself as an ecologist recently introduced to evolution.
> >
> > As an enthusiast, moved by my curiosity and desire to better understand
> > tropical tree ecology, I'm now trying to include phylogenetic information
> > to answer my research questions. Obviously, I'm stuck on a lot of
> > doubts.
> >
> > I'm asking whether divergence time is correlated with the probability of
> > a species pair's positive or negative co-occurrence. I expect relatively
> > more divergent species pairs to show a greater probability of positive
> > co-occurrence, while closely related species pairs show a greater
> > probability of negative co-occurrence.
> >
> > To estimate species divergence time between pairs of species, I'm using
> > the phylogeny in Smith & Brown 2018 (ALLOTB
> > https://github.com/FePhyFoFum/big_seed_plant_trees), which is rooted and
> > contains 353 185 seed plant species along with 85 679 nodes; branch
> > lengths are dated. I'm applying the R function phytools::fastDist to a
> > list of sampled species to obtain a matrix of patristic distances.
> >
> > I have a doubt regarding the values that the function fastDist is
> > returning. May I consider those values as the divergence time between
> > species?
> >
> > As I see in Revell's blog
> > (http://blog.phytools.org/2015/10/new-reasonably-fast-method-to-compute.html),
> > "the patristic distance between them is simply the sum of the heights
> > above the root for species i and j minus two times the height above the
> > root of the common ancestor of i & j". Is this the same as what Fourment
> > and colleagues (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1352388/)
> > define as a patristic distance: "A patristic distance is the sum of the
> > lengths of the branches that link two nodes in a tree"?
> >
> > Furthermore, I'm wondering if the results of phytools::fastDist are
> > interchangeable with those of adephylo::distTips(method="patristic")?
> > Finally, is the patristic distance the right choice for my purpose or
> > should I use another phylogenetic distance?
> >
> > I would like to thank you all in advance. With my best regards.
> >
> > Bruno Garcia Luize
> >
> > PhD candidate, Ecology and Biodiversity – São Paulo State University
> > (Unesp)
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> > http://www.mail-archive.com/r-sig-phylo@r-project.org/
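The equivalence Liam describes (sum of the two root-to-tip heights minus twice the height of the common ancestor) can be checked numerically on a toy tree. This is a rough sketch, not the internals of fastDist: it assumes the ape package, uses a random 20-tip tree, and the tip names "t3" and "t11" are arbitrary choices for illustration.

```r
library(ape)

set.seed(42)
tree <- rtree(20)                        # toy tree; tips are labelled t1..t20
heights <- node.depth.edgelength(tree)   # distance from the root to every node

i <- which(tree$tip.label == "t3")       # node numbers of the two chosen tips
j <- which(tree$tip.label == "t11")
anc <- getMRCA(tree, c(i, j))            # node number of their common ancestor

# Root-height formulation of the patristic distance
d_heights <- heights[i] + heights[j] - 2 * heights[anc]

# Sum of the edge lengths along the path, as returned by cophenetic()
d_cophenetic <- cophenetic(tree)["t3", "t11"]

agree <- isTRUE(all.equal(d_heights, d_cophenetic))
print(agree)
```

The two values should agree up to floating-point rounding, since the path between the tips is exactly the two root-to-tip paths with the shared stretch from the root to the common ancestor removed from each.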