[R-sig-phylo] Using bind.tip and lapply on a multiPhylo object

2021-06-02 Thread Russell Engelman
Dear R-sig-phylo,

I have been working with a mammalian phylogeny I recently downloaded from
VertLife (http://vertlife.org/phylosubsets/). Unfortunately, the phylogeny
is missing a large number of species, so I am trying to manually add these
taxa to the phylogeny. I have a series of 100 trees that I am using to do
things such as test for phylogenetic signal. I know how to use bind.tip to
add new taxa to a single tree, but I am having more trouble with a
multiPhylo object. I am primarily adding these taxa by placing them as
sister to their nearest included relative (since most of them are elevated
former subspecies), but the issue here is that in the 100 trees in the
multiPhylo object the node representing the taxon to bind these taxa to is
not the same across all trees due to shifting topologies.

This is an example of the code I have been using, in which "tree" is the
tree object. This works for a single 'phylo' tree but not for a 'multiPhylo' object.

```
newtree <- lapply(tree, bind.tip, tip.label = "Cercopithecus_albogularis",
                  position = 0.59, edge.length = 0.59,
                  where = mrca(tree)["Cercopithecus_mitis", "Cercopithecus_mitis"])
```

Now, this code will not work, but I know exactly why: 'tree' is a
multiPhylo object and so the 'where' argument cannot find the node for the
terminal taxon. However, the issue is how can I tell R to repeat this
'where' argument for each of the 100 trees, since the node in question is
not identical across these trees? Is there an easier way to do this than
using the 'mrca' call for each terminal taxon? I've noticed adding a 'mrca'
argument also increases computation time and if I am reinventing the wheel
it would be nice to know if I am overthinking things.

Sincerely,
Russell

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Using bind.tip and lapply on a multiPhylo object

2021-06-02 Thread Eliot Miller
Hi Russell,

A package I wrote a while back should be able to do that fairly easily:
https://github.com/eliotmiller/addTaxa
The only paper it's described in remains
https://bsapubs.onlinelibrary.wiley.com/doi/full/10.3732/ajb.1500195
It's a wrapper for bind.tip, with some additional functionality. You would
give it a taxonomic file in which you identify the clades you're interested
in (e.g., both of those Cercopithecus species could be given the same unique
clade name, and it would add the missing one to the other), then lapply
that whole addTaxa command over the list of trees in the multiPhylo object.
At some point I made laser a dependency, and it's possible I left it in that
state. If so, I believe you can still get laser from old CRAN mirrors. Let
me know if you want more help.

Best,
Eliot



Re: [R-sig-phylo] Using bind.tip and lapply on a multiPhylo object

2021-06-02 Thread Nathan Upham
Hi Russell:

Glad to hear you’re using the VertLife mammal trees — they are built on a 
taxonomy of 5,911 species of which only 4,098 are sampled for DNA, so there is 
already a ~30% chunk that is placed using taxonomic constraints and birth-death 
branch lengths as sampled during the estimation of 28 Bayesian patch clades.  

Adding species described since the 2015 cutoff of that VertLife taxonomy
makes sense (e.g., up to ~6,500 species on mammaldiversity.org). However,
keep in mind that they will not have birth-death estimated branch lengths;
more likely they will be added as a polytomy to a given clade and then
randomly resolved.
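
To make the polytomy step concrete, here is a minimal sketch. It is not the procedure used to build the VertLife trees: it assumes a toy four-taxon tree with hypothetical labels in which three species hang from one unresolved node, and it uses ape's multi2di() to resolve that node at random. The branch created by the resolution has zero length, which illustrates the caveat about such additions lacking birth-death estimated branch lengths.

```r
library(ape)

# Toy tree (hypothetical labels): sp1, sp2 and sp3 form a polytomy.
poly <- read.tree(text = "((sp1:1,sp2:1,sp3:1):1,sp4:2);")
is.binary(poly)       # FALSE: one node has three descendants

set.seed(1)           # the resolution is random, so fix the seed for repeatability
resolved <- multi2di(poly, random = TRUE)
is.binary(resolved)   # TRUE: the polytomy is now a series of dichotomies
resolved$edge.length  # the newly created internal branch has length zero
```

Which pair of species ends up as sisters depends on the random seed, so across a set of 100 trees the added taxa would typically be resolved differently in each one.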

Given the sample code you provided, the key thing you’ll want to do is run a 
*loop* rather than using lapply, so that you can specify a given tree each 
time, e.g.:

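library(phytools)  # provides bind.tip(); attaches ape, which provides mrca()
newtrees <- vector("list", length(trees))
for (j in seq_along(trees)) {
  # look up the target tip in tree j, since its number differs between trees
  newtrees[[j]] <- bind.tip(tree = trees[[j]],
    tip.label = "Cercopithecus_albogularis", position = 0.59, edge.length = 0.59,
    where = mrca(trees[[j]])["Cercopithecus_mitis", "Cercopithecus_mitis"])
}
class(newtrees) <- "multiPhylo"  # the loop fills a plain list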
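
For comparison, the same per-tree lookup can also be written with lapply by wrapping the call in a function, so that `where` is evaluated separately for each tree. The sketch below is an illustration rather than code from this thread: the helper name add_sister and the three toy trees are hypothetical. It looks up the tip number with which() on tip.label instead of mrca(), on the assumption that the target is always a single tip, which avoids building the full MRCA matrix for every tree.

```r
library(phytools)  # bind.tip(); attaches ape as a dependency

# Hypothetical helper: attach `new_tip` as sister to the tip `sister` in one tree.
add_sister <- function(tr, new_tip, sister, depth) {
  tip_id <- which(tr$tip.label == sister)  # tip number differs from tree to tree
  stopifnot(length(tip_id) == 1)           # the sister must occur exactly once
  bind.tip(tr, tip.label = new_tip, where = tip_id,
           position = depth, edge.length = depth)
}

# Toy stand-in for the 100 VertLife trees (topologies differ between trees).
trees <- c(
  read.tree(text = "((Cercopithecus_mitis:1,B:1):1,C:2);"),
  read.tree(text = "((B:1,C:1):1,Cercopithecus_mitis:2);"),
  read.tree(text = "((C:1,Cercopithecus_mitis:1):1,B:2);")
)

newtrees <- lapply(trees, add_sister,
                   new_tip = "Cercopithecus_albogularis",
                   sister = "Cercopithecus_mitis", depth = 0.59)
class(newtrees) <- "multiPhylo"  # lapply returns a plain list
```

Setting edge.length equal to position places the new tip at the same height as its sister, which keeps an ultrametric tree ultrametric. The value of position must not exceed the length of the sister's terminal branch in any of the trees.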

I also wrote some code to prune mammal trees and add extinct Caribbean species, 
which uses a similar approach of making polytomies and randomly resolving them 
— here is the repo:
https://github.com/n8upham/CaribbeanExtinctions-WTWTW/tree/master/mamPhy_pruningCode
And here is the code file:
https://github.com/n8upham/CaribbeanExtinctions-WTWTW/blob/master/mamPhy_pruningCode/pruningCode_MamPhy-to-CaribbeanTaxa.R

Hope that helps,
—nate




Nathan S. Upham, Ph.D. (he/him)
Assistant Research Professor & Associate Curator of Mammals
Arizona State University, School of Life Sciences

Research Associate, Yale University (Ecology and Evolutionary Biology)
Research Associate, Field Museum of Natural History (Negaunee Integrative
Research Center)
Chair, Biodiversity Committee, American Society of Mammalogists
Taxonomy Advisor, IUCN/SSC Small Mammal Specialist Group

personal web: n8u.org
e: nathan.up...@asu.edu | Skype: nate_upham | Twitter: @n8_upham
 




>> ```
>>