Hi Tom.

Yes, for zero-length terminal branches it is inadvisable to add a "tiny amount" to the terminal edge so that the analysis "works". An easy way to think about this is in terms of Felsenstein's contrasts method (which is a special case of PGLS). Contrasts are standardized to have the same expected variance by dividing each one by the square root of the summed lengths of its subtending edges. If you set some (terminal) branches to be very small, then the corresponding contrast will be very large for any nonzero trait difference (going to Inf as the edge lengths go to zero!). This will give that contrast very high weight in your regression.
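
To make the scaling concrete, here is a minimal sketch of the standardized contrast for a single pair of sister tips. The function name and the trait values are made up for illustration, and this is plain arithmetic rather than what pgls() does internally.

```python
import math


def standardized_contrast(x1, x2, v1, v2):
    """Standardized contrast for two sister tips.

    x1, x2: trait values at the two tips (hypothetical numbers below).
    v1, v2: lengths of the edges leading to each tip.
    """
    # The raw difference is divided by the square root of the summed edge lengths.
    return (x1 - x2) / math.sqrt(v1 + v2)


# Same small trait difference, progressively shorter terminal edges.
x1, x2 = 1.0, 1.1
for v in (1.0, 1e-2, 1e-5, 1e-10):
    c = standardized_contrast(x1, x2, v, v)
    print(f"edge length {v:g}: contrast {c:.4g}")
```

The trait difference never changes here; only the padding value does, so the size of that one contrast is determined almost entirely by the arbitrary "tiny amount" that was added.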

Part of the problem here stems from the different effective meaning of a zero-length terminal edge in molecular phylogenetics vs. comparative biology. In the former, a zero-length terminal edge probably means that we don't have enough sequence data, and thus failed to sample any substitutions along that edge. Depending on the amount of data that you have, the edge could really be thousands to millions of years long. A zero-length terminal edge in comparative inference means that the tree ends at a speciation event: no time elapsed between the speciation event that created a lineage and the present. Under those circumstances, we should expect to find no phenotypic difference between species separated by zero time! (And it makes sense that comparative methods wouldn't like data that suggested otherwise.)

The best solution is to get more data to infer your tree. Failing that, you could assume that there is the equivalent of one substitution leading to each tip with a zero-length terminal edge. Alternatively, you could assume that differences between species separated by no patristic distance are 'sampling error' and use the method of Ives et al. (2007). I'm not sure which approach is best, or whether you have options besides collapsing or throwing out data.
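
As a rough sketch of the "one substitution" idea, assuming the edge lengths are in expected substitutions per site and that you know the alignment length (n_sites is a hypothetical parameter here, as is the function name), one substitution corresponds to an edge length of 1 / n_sites:

```python
def pad_zero_terminal_edges(edge_lengths, is_terminal, n_sites):
    """Give zero-length terminal edges one substitution's worth of length.

    ASSUMPTION: edge lengths are in expected substitutions per site, so a
    single substitution over n_sites aligned sites equals 1 / n_sites.
    Internal edges and nonzero edges are returned unchanged.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    one_substitution = 1.0 / n_sites
    return [
        one_substitution if (terminal and length == 0) else length
        for length, terminal in zip(edge_lengths, is_terminal)
    ]
```

For a time-calibrated (ultrametric) tree the padding would have to be converted into time units using a substitution rate, and the tree would no longer be exactly ultrametric afterwards; neither step is shown here.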

All the best, Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 7/30/2013 3:35 PM, Tom Kraft wrote:
Dear all,

I am running a standard pgls analysis using a phylogeny that contains
several terminal edges of length zero.  The tree is ultrametric and was
generated using molecular data. This results in a predictable error with
solve when using pgls(). Looking into this issue, it seems I am faced with
the following possibilities:

1) Drop all taxa with terminal branch lengths of 0.

2) Add a very small number (e.g. 0.00001) to each terminal branch length of
0.

On Liam Revell's blog, he writes, "for zero length terminal edges it is
probably reasonable to just add a very small length to that edge. For many
comparative analyses, this is inadvisable, but for ancestral state
estimation it is probably OK...".  This makes it seem like option 2 is not
very good, although losing taxa from the analysis is obviously not
desirable either.

Am I right that these are currently my options for dealing with this
problem? And if so, what is recommended? If anyone has any suggestions
about how to proceed with this analysis or links to relevant literature
that I am missing, I would appreciate it very much.

Thank you in advance,
Tom

Thomas Kraft
PhD Student
Department of Biology
Dartmouth College


_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

