Hi Kevin, here are some bits of code, which should let you compute those numbers:

library(phangorn)

set.seed(3)
nTips = 10
tree = rtree(nTips)
x = rep(c("blue", "green"), each = 5)
names(x) = paste("t", 1:nTips, sep = "")
plot(tree, tip.color = x[tree$tip.label])
nodelabels()  # shows you how the nodes of your tree are indexed
# tiplabels()

intNodes = (Ntip(tree) + 1L):(Ntip(tree) + Nnode(tree))
sisters = Children(tree, intNodes)
# the next line assumes that the tree is strictly bifurcating,
# otherwise it gets a bit more tricky
sisters = matrix(unlist(sisters), byrow = TRUE, ncol = 2,
                 dimnames = list(intNodes, NULL))
# now the indices of the sister clades are stored in the rows

# descendant tips for every node (tips and internal nodes), indexed by node number
desc = Descendants(tree, 1:max(tree$edge), type = "tips")
x = (x == "blue")       # make x logical
x = x[tree$tip.label]   # reorder your data to have the same order as the tree
l = sapply(desc, length)                           # size of each clade
blues = sapply(desc, function(d, x) sum(x[d]), x)  # number of "blue" tips in each clade
greens = l - blues                                 # number of "green" tips in each clade
allgreen = (greens == l)
allblue = (blues == l)

# you can subset l, blues, greens, allgreen, allblue with the node indices in sisters;
# drop = FALSE keeps a matrix even if only one pair of sisters matches
res = rbind(
  cbind(greens[sisters[, 1]], blues[sisters[, 2]])[allgreen[sisters[, 1]] & allblue[sisters[, 2]], , drop = FALSE],
  cbind(greens[sisters[, 2]], blues[sisters[, 1]])[allgreen[sisters[, 2]] & allblue[sisters[, 1]], , drop = FALSE])
colnames(res) = c("green", "blue")
(res = as.data.frame(res))

With this you should be able to compute your data.frame.

Cheers,
Klaus

On Thu, Oct 23, 2014 at 11:30 AM, Liam J. Revell <liam.rev...@umb.edu> wrote:

> Hi Kevin.
>
> It sounds like what you want to do is perform a pre-order tree traversal
> and for each node visited ask if all the taxa to one side (e.g.,
> descendants of the right daughter node) are in state "0" and all the taxa
> to the other side (e.g., descendants of the left daughter) are in state
> "1". If this evaluates to be true, then you record the number of tips in
> each category.
> To do this most efficiently, you should not visit any
> daughters of a node for which you have evaluated balance - but if you do,
> then you will find that it doesn't satisfy the criterion described above
> (specifically, descendants on either side of the node will be all in state
> "0" or "1").
>
> This should be straightforward to code, but I do not have time to
> demonstrate right now. I will try to do it this evening. Let us know if you
> first figure it out yourself.
>
> - Liam
>
> Liam J. Revell, Assistant Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 10/23/2014 11:13 AM, Arbuckle, Kevin wrote:
>
>> Hi François,
>>
>> Thanks again for your response. Wouldn't that lose the information of how
>> many species were in each clade? And how would I specify that 'clades' to
>> keep consist of those sharing either state? My original tree consists of
>> almost 3000 species so going through such clades manually would be
>> difficult at best, hence the need to automate it somehow. (I apologise, my
>> R coding skills are improving but still leave a lot to be desired in many
>> cases).
>>
>> I completely agree that BiSSE is far more appropriate for my aims, and
>> indeed this was the approach I used. However, reviewers have asked if I get
>> the same basic result using other methods, which is the only reason I am
>> attempting such analyses now.
>>
>> Thank you kindly once again for your time,
>>
>> Kev
>>
>> ________________________________
>> From: François Michonneau [francois.michonn...@gmail.com]
>> Sent: 23 October 2014 16:05
>> To: Arbuckle, Kevin
>> Cc: r-sig-phylo@r-project.org
>> Subject: Re: [R-sig-phylo] Extracting sister groups
>>
>> Hi Kevin,
>>
>> If I understand correctly what you're trying to do, you'll first need
>> to collapse some of your tips to create clades, a proportion of which will
>> have the trait. You'll then be able to use this new tree to generate the
>> data.frame needed by the functions you mentioned in your original post.
>>
>> Depending on what you're trying to do, you may not want to lose this
>> phylogenetic information. Maybe a different approach, such as using BiSSE
>> in the diversitree package might be more appropriate?
>>
>> Cheers,
>> -- François
>>
>> On Thu, Oct 23, 2014 at 10:27 AM, Arbuckle, Kevin
>> <k.arbuc...@liverpool.ac.uk> wrote:
>>
>> Hi François,
>>
>> Thank you kindly for your offer of help. The code below will simulate a
>> phylogeny ("tree") and a dataframe ("trait") with one binary trait for 100
>> species. The format is representative of the data I am using for my
>> analyses so should serve as a test case. Hopefully this helps, let me know
>> if there's any other information I can provide.
>>
>> library(ape)
>> library(phytools)
>> tree<-rtree(100)
>> tran<-matrix(c(-1,1,1,-1),2,2)
>> rownames(tran)<-c("0","1")
>> colnames(tran)<-c("0","1")
>> phy<-sim.history(tree,tran)
>> trait<-data.frame(sp=tree$tip.label,bt=getStates(phy,type="tips"))
>> rownames(trait)<-tree$tip.label
>>
>> Cheers,
>>
>> Kev
>>
>> ________________________________
>> From: François Michonneau [francois.michonn...@gmail.com]
>> Sent: 23 October 2014 14:54
>> To: Arbuckle, Kevin
>> Subject: Re: [R-sig-phylo] Extracting sister groups
>>
>> Hi Kevin,
>>
>> We should be able to help you but it would be much easier if you
>> provided us with a small data set that illustrates the format of your
>> current dataset. How is your trait currently stored? and how is it
>> associated with the tips in your tree?
>>
>> Cheers,
>> -- François
>>
>> On Thu, Oct 23, 2014 at 6:23 AM, Arbuckle, Kevin
>> <k.arbuc...@liverpool.ac.uk> wrote:
>>
>> Hi everyone,
>>
>> I am attempting to run sister group analyses as one way to look at the
>> effect of a binary trait on diversification. Two of the functions from ape
>> that I'm looking at are diversity.contrast.test and richness.yule.test, but
>> both have the same limitation. They require the data to be input as a
>> dataframe of two columns, one with the number of species in clades that
>> have the trait of interest, and the other with the number of species in the
>> respective sister clades that don't have the trait. The issue is that I am
>> working with a very large tree, and so extracting and entering such
>> information by hand is not really feasible.
>>
>> I am therefore looking for a function which extracts all sister clades
>> that differ in the presence vs absence of the trait, and ideally is capable
>> of generating a dataframe of the appropriate format for the above functions
>> automatically.
>> It seems that a function to do this should exist already,
>> but as I can't seem to find anything I would appreciate some help
>> (hopefully someone will know of such a function that already exists).
>>
>> Thanks,
>>
>> Kevin Arbuckle
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-phylo mailing list - R-sig-phylo@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>> Searchable archive at http://www.mail-archive.com/r-sig-ph...@r-project.org/

--
Klaus Schliep
Postdoctoral Fellow
Revell Lab, University of Massachusetts Boston

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
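
For reference, the pre-order traversal Liam describes in the thread could be sketched roughly as follows in base R. This is only an illustration under stated assumptions, not code posted by anyone on the list: the function name sister_contrasts and the five-tip toy tree are made up, the tree is assumed to be strictly bifurcating, the edge matrix is assumed to follow the layout of an ape "phylo" object (tips numbered 1..n, root n + 1), and the trait is assumed to be a 0/1 numeric vector in tip order.

```r
# Hypothetical helper (not part of ape, phangorn or phytools): find sister
# clades where one side is entirely state 0 and the other entirely state 1.
# `edge`  : two-column parent/child matrix, laid out as in an ape "phylo" object
# `state` : 0/1 numeric vector indexed by tip number
sister_contrasts <- function(edge, state) {
  n_tips <- length(state)
  root <- n_tips + 1L

  # tips descended from a node, found by recursion over the edge matrix
  tips_below <- function(node) {
    if (node <= n_tips) return(node)
    kids <- edge[edge[, 1] == node, 2]
    unlist(lapply(kids, tips_below))
  }

  res <- data.frame(node = integer(0), with_trait = integer(0),
                    without_trait = integer(0))

  visit <- function(node) {
    if (node <= n_tips) return(invisible(NULL))
    kids <- edge[edge[, 1] == node, 2]
    stopifnot(length(kids) == 2)  # assumes a strictly bifurcating tree
    s1 <- state[tips_below(kids[1])]
    s2 <- state[tips_below(kids[2])]
    pure <- length(unique(s1)) == 1 && length(unique(s2)) == 1
    if (pure && s1[1] != s2[1]) {
      # record the clade sizes and do not descend further:
      # no node below this one can satisfy the criterion
      n1 <- if (s1[1] == 1) length(s1) else length(s2)
      n0 <- if (s1[1] == 0) length(s1) else length(s2)
      res <<- rbind(res, data.frame(node = node, with_trait = n1,
                                    without_trait = n0))
    } else {
      for (k in kids) visit(k)
    }
  }

  visit(root)
  res
}

# Toy tree ((t1,t2),((t3,t4),t5)): tips 1..5, root 6, internal nodes 7..9
edge <- matrix(c(6, 7,  7, 1,  7, 2,  6, 8,
                 8, 9,  9, 3,  9, 4,  8, 5), ncol = 2, byrow = TRUE)
state <- c(1, 1, 0, 0, 1)
out <- sister_contrasts(edge, state)
print(out)
```

With a real ape tree one would presumably pass tree$edge and the trait reordered to match tree$tip.label; the with_trait and without_trait columns then have the two-column layout Kevin describes. The sketch recomputes descendant tips at every node, so for a tree of a few thousand species the vectorised phangorn approach at the top of this message is likely the better choice.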