Hi Emmanuel,
Thank you for the fix! And, yes, I realize, its probably one of ape's
most widely used functions. Perhaps what we need is a function that
tests whether there is a mismatch in the node.labels, across trees
that might have different sets of taxa, which will help in the future
to alert us if this bug appears again. Of course, such a function
would need to avoid using drop.tip entirely. I've looked and I'm not
aware of any function across the entire Phylogenetics taskview that
fulfills this criteria.
So, I admit that's a tough order. I've made a particularly ugly first
stab at such a thing, mostly depending on prop.part. Its rather
clumsy, as I originally wrote it as code to fix node.labels, rather
than find the mismatches. Unfortunately, at present, it misidentifies
root-ward losses of clades, due to dropping of stem/out-group tips, as
a mismatch between node labels. Perhaps someone else will see a
cleaner, more accurate solution.
###########################################################################
library(ape)
testNodeLabels<-function(tree1,tree2,printMisMatch=TRUE,plot=FALSE){
nlab1<-tree1$node.label
if(!is.null(tree2$node.label)){
nlab2<-tree2$node.label
}else{
nlab2<-rep(NA,Nnode(tree2))
}
#if tree2 has any taxa not in tree1, stop
noMatch<-sapply(tree2$tip.label,function(x) !any(x==tree1$tip.label))
if(any(noMatch)){
stop("tree2 has taxa not found in tree1 and thus is uncomparable")
}
#
desc1<-lapply(prop.part(tree1),function(x) tree1$tip.label[x])
desc2<-lapply(prop.part(tree2),function(x) tree2$tip.label[x])
#need to remove taxa not shared by one without using drop.tip
taxa<-c(tree1$tip.label,tree2$tip.label)
shared<-taxa[sapply(taxa,function(x)
any(x==tree1$tip.label) & any(x==tree2$tip.label))]
#get ndesc for desc2
ndesc2<-sapply(desc2,length)
#reorder desc2 and nlab2
desc2<-desc2[order(ndesc2)]
nlab2<-nlab2[order(ndesc2)]
result<-TRUE
for(i in 1:length(desc1)){
target<-desc1[[i]]
targetName<-nlab1[i]
sharedDesc<-target[sapply(target,function(x)
any(x==shared))]
if(length(sharedDesc)>1){
#find matches in 2
matches<-sapply(desc2,function(x)
all(sapply(sharedDesc,function(z) any(z==x))))
}else{
matches<-FALSE
}
#get richest match - if there is no match, get NA
matchClade<-which(matches)[1]
if(!is.na(matchClade)){
matchName<-identical(nlab1[i],nlab2[matchClade])
if(!matchName){
result<-FALSE
if(printMisMatch){
warning(paste("\n A node with descendants:",
paste0(sharedDesc,collapse=', '),
"\n is labeled:",nlab1[i],
"in tree1 but labeled:",
nlab2[matchClade],"in tree2 \n"))
}
}
}
}
#
if(plot){
#plot it
layout(1:2)
plot(tree1,show.tip.label=TRUE,use.edge.length=FALSE)
nodelabels(tree1$node.label)
plot(tree2,show.tip.label=TRUE,use.edge.length=FALSE)
nodelabels(tree2$node.label)
layout(1)
}
return(result)
}
set.seed(1)
tree<-rtree(10)
tree$node.label<-rep(NA,Nnode(tree))
tree$node.label[1]<-"ROOT"
tree$node.label[5]<-"HELLO"
tree$node.label[8]<-"NOPE"
tree1<-drop.tip(tree,"t2")
tree2<-ladderize(tree)
tree3<-drop.tip(tree2,"t2")
testNodeLabels(tree,tree1)
testNodeLabels(tree,tree2)
testNodeLabels(tree,tree3)
testNodeLabels(tree,tree3,plot=TRUE)
################################################
Additionally, could you send the full revised source file? I have to
admit, I always get lost looking for ape's newest source files on the
ape website.
Cheers,
-Dave
On Sat, Jul 25, 2015 at 6:20 AM, Emmanuel Paradis
<[email protected]> wrote:
> Hi David,
>
> Here is a fix for drop.tip (line numbers refer to the source file
> drop.tip.R):
>
> 229,231c229,230
> < ## executed from right to left, so newNb is modified before phy$edge:
> < phy$edge[sndcol, 2] <- newNb[phy$edge[sndcol, 2]] <-
> < (n + 2):(n + phy$Nnode)
> ---
>> newNb[sort(phy$edge[sndcol, 2])] <- (n + 2):(n + phy$Nnode)
>> phy$edge[sndcol, 2] <- newNb[phy$edge[sndcol, 2]]
>
> Since this function is widely used, this requires more tests to validate
> this fix.
>
> Best,
>
> Emmanuel
>
>
> Le 18/07/2015 06:42, David Bapst a écrit :
>>
>> Hello all,
>>
>> Recently I noticed a complex function of mine that does some tree
>> transformations was randomly scrambling node.label elements.
>>
>> In the course of doing so, I found this old email (below) from Rebecca
>> Best in 2012, which outlined an issue that occurred when a ladderized
>> tree had tips dropped. It appears that bug has reappeared in more
>> recent version of ape.
>>
>> Here's a slightly modified version of her example code, which appears
>> to still replicate a node label shuffling in ape v3.3.
>>
>> #############################
>> require(ape)
>>
>> #read tree
>> mytree<-read.tree()
>> ((D,(E,G)1)1,((H,J)0.8,(K,(((L,M)0.5,(N,O)0.6)1,(P,(Q,R)1)1)0.7)1)1);
>>
>> #ladderize tree
>> mytree.lad<-ladderize(mytree)
>>
>> #node labels display on both trees correctly
>> layout(1:2)
>> plot(mytree,show.node.label=TRUE)
>> plot(mytree.lad,show.node.label=TRUE)
>>
>> #drop tips from both trees
>> drop.mytree<-drop.tip(mytree,c("L","D","G"))
>> drop.mytree.lad<-drop.tip(mytree.lad,c("L","D","G"))
>>
>> #plot both trees, node labels are incorrect for ladderized tree
>> dev.new()
>> layout(1:2)
>> plot(drop.mytree,show.node.label=TRUE)
>> plot(drop.mytree.lad,show.node.label=TRUE)
>> #############################
>>
>> Although I'm still in the process of dismantling my own issue, so I am
>> not 100% certain, I strongly believe this is the culprit in my case,
>> as my script partly drops.tips from an input tree (that happens to
>> always be ladderized).
>>
>> Cheers,
>> -Dave
>>
>>
>> ---------- Forwarded message ----------
>> From: Rebecca Best <[email protected]>
>> Date: Tue, Oct 2, 2012 at 12:26 AM
>> Subject: [R-sig-phylo] ladderize + drop.tip = shuffled node labels
>> To: [email protected]
>>
>>
>> Hi all
>>
>> I have been plotting some pruned trees recently, and have run into a
>> problem using drop.tip() after ladderize(). If you ladderize() and
>> then drop tips from the ladderized tree, then at least in my case the
>> node labels are no longer correct. This may be an unlikely sequence of
>> commands, but I thought I'd post this in case it is an easy fix, or it
>> helps anyone else avoid issues.
>> Thanks!
>>
>> Rebecca
>>
>> ##
>>
>> require(ape)
>>
>> #read tree
>>
>> mytree<-read.tree()
>> ((D,(E,G)1)1,((H,J)0.8,(K,(((L,M)0.5,(N,O)0.6)1,(P,(Q,R)1)1)0.7)1)1);
>>
>> #ladderize tree
>>
>> mytree.lad<-ladderize(mytree)
>>
>> #node labels display on both trees correctly
>>
>> plot(mytree,show.node.label=TRUE)
>> plot(mytree.lad,show.node.label=TRUE)
>>
>> #drop tips from both trees
>>
>> drop.mytree<-drop.tip(mytree,c("L","D","G"))
>> drop.mytree.lad<-drop.tip(mytree.lad,c("L","D","G"))
>>
>> #plot both trees, node labels are incorrect for ladderized tree
>>
>> plot(drop.mytree,show.node.label=TRUE)
>> plot(drop.mytree.lad,show.node.label=TRUE)
>>
>> _______________________________________________
>> R-sig-phylo mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>
>>
>
--
David W. Bapst, PhD
Adjunct Asst. Professor, Geology and Geol. Eng.
South Dakota School of Mines and Technology
501 E. St. Joseph
Rapid City, SD 57701
http://webpages.sdsmt.edu/~dbapst/
http://cran.r-project.org/web/packages/paleotree/index.html
_______________________________________________
R-sig-phylo mailing list - [email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/[email protected]/