Hi Emmanuel,

Thank you for the fix! And, yes, I realize, its probably one of ape's
most widely used functions. Perhaps what we need is a function that
tests whether there is a mismatch in the node.labels, across trees
that might have different sets of taxa, which will help in the future
to alert us if this bug appears again. Of course, such a function
would need to avoid using drop.tip entirely. I've looked and I'm not
aware of any function across the entire Phylogenetics taskview that
fulfills this criteria.

So, I admit that's a tough order. I've made a particularly ugly first
stab at such a thing, mostly depending on prop.part. Its rather
clumsy, as I originally wrote it as code to fix node.labels, rather
than find the mismatches. Unfortunately, at present, it misidentifies
root-ward losses of clades, due to dropping of stem/out-group tips, as
a mismatch between node labels. Perhaps someone else will see a
cleaner, more accurate solution.

###########################################################################

library(ape)

testNodeLabels<-function(tree1,tree2,printMisMatch=TRUE,plot=FALSE){
    nlab1<-tree1$node.label
    if(!is.null(tree2$node.label)){
        nlab2<-tree2$node.label
    }else{
        nlab2<-rep(NA,Nnode(tree2))
        }
    #if tree2 has any taxa not in tree1, stop
    noMatch<-sapply(tree2$tip.label,function(x) !any(x==tree1$tip.label))
    if(any(noMatch)){
        stop("tree2 has taxa not found in tree1 and thus is uncomparable")
        }
    #
    desc1<-lapply(prop.part(tree1),function(x) tree1$tip.label[x])
    desc2<-lapply(prop.part(tree2),function(x) tree2$tip.label[x])
    #need to remove taxa not shared by one without using drop.tip
    taxa<-c(tree1$tip.label,tree2$tip.label)
    shared<-taxa[sapply(taxa,function(x)
        any(x==tree1$tip.label) & any(x==tree2$tip.label))]
    #get ndesc for desc2
    ndesc2<-sapply(desc2,length)
    #reorder desc2 and nlab2
    desc2<-desc2[order(ndesc2)]
    nlab2<-nlab2[order(ndesc2)]
    result<-TRUE
    for(i in 1:length(desc1)){
        target<-desc1[[i]]
        targetName<-nlab1[i]
        sharedDesc<-target[sapply(target,function(x)
            any(x==shared))]
        if(length(sharedDesc)>1){
            #find matches in 2
            matches<-sapply(desc2,function(x)
                all(sapply(sharedDesc,function(z) any(z==x))))
        }else{
            matches<-FALSE
            }
        #get richest match - if there is no match, get NA
        matchClade<-which(matches)[1]
        if(!is.na(matchClade)){
            matchName<-identical(nlab1[i],nlab2[matchClade])
            if(!matchName){
                result<-FALSE
                if(printMisMatch){
                    warning(paste("\n A node with descendants:",
                        paste0(sharedDesc,collapse=', '),
                        "\n is labeled:",nlab1[i],
                        "in tree1 but labeled:",
                        nlab2[matchClade],"in tree2 \n"))
                    }
                }
            }
        }
    #
    if(plot){
        #plot it
        layout(1:2)
        plot(tree1,show.tip.label=TRUE,use.edge.length=FALSE)
        nodelabels(tree1$node.label)
        plot(tree2,show.tip.label=TRUE,use.edge.length=FALSE)
        nodelabels(tree2$node.label)
        layout(1)
        }
    return(result)
    }

set.seed(1)
tree<-rtree(10)
tree$node.label<-rep(NA,Nnode(tree))
tree$node.label[1]<-"ROOT"
tree$node.label[5]<-"HELLO"
tree$node.label[8]<-"NOPE"

tree1<-drop.tip(tree,"t2")
tree2<-ladderize(tree)
tree3<-drop.tip(tree2,"t2")

testNodeLabels(tree,tree1)
testNodeLabels(tree,tree2)
testNodeLabels(tree,tree3)

testNodeLabels(tree,tree3,plot=TRUE)

################################################

Additionally, could you send the full revised source file? I have to
admit, I always get lost looking for ape's newest source files on the
ape website.

Cheers,
-Dave

On Sat, Jul 25, 2015 at 6:20 AM, Emmanuel Paradis
<[email protected]> wrote:
> Hi David,
>
> Here is a fix for drop.tip (line numbers refer to the source file
> drop.tip.R):
>
> 229,231c229,230
> <     ## executed from right to left, so newNb is modified before phy$edge:
> <     phy$edge[sndcol, 2] <- newNb[phy$edge[sndcol, 2]] <-
> <         (n + 2):(n + phy$Nnode)
> ---
>>     newNb[sort(phy$edge[sndcol, 2])] <- (n + 2):(n + phy$Nnode)
>>     phy$edge[sndcol, 2] <- newNb[phy$edge[sndcol, 2]]
>
> Since this function is widely used, this requires more tests to validate
> this fix.
>
> Best,
>
> Emmanuel
>
>
> Le 18/07/2015 06:42, David Bapst a écrit :
>>
>> Hello all,
>>
>> Recently I noticed a complex function of mine that does some tree
>> transformations was randomly scrambling node.label elements.
>>
>> In the course of doing so, I found this old email (below) from Rebecca
>> Best in 2012, which outlined an issue that occurred when a ladderized
>> tree had tips dropped. It appears that bug has reappeared  in more
>> recent version of ape.
>>
>> Here's a slightly modified version of her example code, which appears
>> to still replicate a node label shuffling in ape v3.3.
>>
>> #############################
>> require(ape)
>>
>> #read tree
>> mytree<-read.tree()
>> ((D,(E,G)1)1,((H,J)0.8,(K,(((L,M)0.5,(N,O)0.6)1,(P,(Q,R)1)1)0.7)1)1);
>>
>> #ladderize tree
>> mytree.lad<-ladderize(mytree)
>>
>> #node labels display on both trees correctly
>> layout(1:2)
>> plot(mytree,show.node.label=TRUE)
>> plot(mytree.lad,show.node.label=TRUE)
>>
>> #drop tips from both trees
>> drop.mytree<-drop.tip(mytree,c("L","D","G"))
>> drop.mytree.lad<-drop.tip(mytree.lad,c("L","D","G"))
>>
>> #plot both trees, node labels are incorrect for ladderized tree
>> dev.new()
>> layout(1:2)
>> plot(drop.mytree,show.node.label=TRUE)
>> plot(drop.mytree.lad,show.node.label=TRUE)
>> #############################
>>
>> Although I'm still in the process of dismantling my own issue, so I am
>> not 100% certain, I strongly believe this is the culprit in my case,
>> as my script partly drops.tips from an input tree (that happens to
>> always be ladderized).
>>
>> Cheers,
>> -Dave
>>
>>
>> ---------- Forwarded message ----------
>> From: Rebecca Best <[email protected]>
>> Date: Tue, Oct 2, 2012 at 12:26 AM
>> Subject: [R-sig-phylo] ladderize + drop.tip = shuffled node labels
>> To: [email protected]
>>
>>
>> Hi all
>>
>> I have been plotting some pruned trees recently, and have run into a
>> problem using drop.tip() after ladderize(). If you ladderize() and
>> then drop tips from the ladderized tree, then at least in my case the
>> node labels are no longer correct. This may be an unlikely sequence of
>> commands, but I thought I'd post this in case it is an easy fix, or it
>> helps anyone else avoid issues.
>> Thanks!
>>
>> Rebecca
>>
>> ##
>>
>> require(ape)
>>
>> #read tree
>>
>> mytree<-read.tree()
>> ((D,(E,G)1)1,((H,J)0.8,(K,(((L,M)0.5,(N,O)0.6)1,(P,(Q,R)1)1)0.7)1)1);
>>
>> #ladderize tree
>>
>> mytree.lad<-ladderize(mytree)
>>
>> #node labels display on both trees correctly
>>
>> plot(mytree,show.node.label=TRUE)
>> plot(mytree.lad,show.node.label=TRUE)
>>
>> #drop tips from both trees
>>
>> drop.mytree<-drop.tip(mytree,c("L","D","G"))
>> drop.mytree.lad<-drop.tip(mytree.lad,c("L","D","G"))
>>
>> #plot both trees, node labels are incorrect for ladderized tree
>>
>> plot(drop.mytree,show.node.label=TRUE)
>> plot(drop.mytree.lad,show.node.label=TRUE)
>>
>> _______________________________________________
>> R-sig-phylo mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>
>>
>



-- 
David W. Bapst, PhD
Adjunct Asst. Professor, Geology and Geol. Eng.
South Dakota School of Mines and Technology
501 E. St. Joseph
Rapid City, SD 57701

http://webpages.sdsmt.edu/~dbapst/
http://cran.r-project.org/web/packages/paleotree/index.html

_______________________________________________
R-sig-phylo mailing list - [email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/[email protected]/

Reply via email to