Re: [R-sig-phylo] NEVERMIND -read.tree and read.nexus error
James: it's not the space that is the problem, but one label has parentheses: 'Rubus idaeus U06825.1(2)' Deleting or substituting these parentheses with(out) the spaces and/or quotes solves the problem for read.tree(). I remind that this behaviour is expected and discussed in ape's FAQ. Ben: I agree with you and would even go to suggest a revision of the Newick format definition to clarify its usage. If I remember correctly, the current definition was done in the early 1980's, and it seems natural to update it someday. One main issue (to me) is how to include/specify other data than the labels and branch lengths. Emmanuel Ben Bolker wrote on 21/02/2011 10:31: On the other hand, if anyone has any ideas about a paranoid and verbose Newick file-checker that gives verbose output about what precisely might be wrong/non-conforming in a Newick file (lint for Newick for the old-fashioned programmers) to help track this sort of thing down, that would be probably be very helpful to many on the list ... On 11-02-20 06:59 PM, James Meadow wrote: List administrator - Do not post this question. I picked through the nwk file again and found a single misplaced space! it is now fixed. Sorry about that, James On Sun, Feb 20, 2011 at 4:44 PM, James Meadow jfmea...@gmail.com wrote: Hello, I have seen this error reported in the help list, but the suggestions do not fix my problem. I am using a very recent version of ape, and I have also tried the patch given here: https://svn.mpl.ird.fr/ape/dev/ape/R/read.nexus.R I am trying to read.tree a large (300 taxa) nwk tree, and I continue to get this error: pltr-read.tree(plant_all_1.nwk) # There is apparently two root edges in your file: cannot read tree file. # Reading Newick file aborted at tree no.1 I have tried it with .nex, and .nwk and both give the same error. I can open this file in any other tree viewer (like FigTree and Mega...) with no problem, and the root is clearly separated. I have looked through the .nwk file but cannot find where there might be a problem. I have also tried removing the first and last ( ) and the tree will read, but then plotting the tree gives this error: # Error in plot.phylo(pltree) : # there are single (non-splitting) nodes in your tree; you may need to use collapse.singles() # Calls: plot - plot.phylo and collapse.singles() doesn't fix it. I have tried to simulate smaller trees to get the error, but I can't seem to reproduce the situation. Does ape have some sort of upper limit on tree size? The nwk file is attached. I apologise in advance if I am missing something simple. As a sidenote, I have been using ape and other r-phylo packages for only a short time, and I have all but abandoned most other phylo software in the process - very nice job developing these packages and to EP for a terrific book! My dissertation is indebted to you. Thank you, James -- James Meadow Land Resources and Environmental Sciences Montana State University (406) 370-7157 jfmea...@gmail.com ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo -- Emmanuel Paradis IRD, Jakarta, Indonesia http://ape.mpl.ird.fr/ ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] NEVERMIND -read.tree and read.nexus error
The package phylobase uses NCL to parse Newick files which provides some information when a non-valid file is being imported. It's not perfect, but it will at least give you an idea of the part that is not correctly formated. -- François On Sun, Feb 20, 2011 at 22:31, Ben Bolker bbol...@gmail.com wrote: On the other hand, if anyone has any ideas about a paranoid and verbose Newick file-checker that gives verbose output about what precisely might be wrong/non-conforming in a Newick file (lint for Newick for the old-fashioned programmers) to help track this sort of thing down, that would be probably be very helpful to many on the list ... On 11-02-20 06:59 PM, James Meadow wrote: List administrator - Do not post this question. I picked through the nwk file again and found a single misplaced space! it is now fixed. Sorry about that, James On Sun, Feb 20, 2011 at 4:44 PM, James Meadow jfmea...@gmail.com wrote: Hello, I have seen this error reported in the help list, but the suggestions do not fix my problem. I am using a very recent version of ape, and I have also tried the patch given here: https://svn.mpl.ird.fr/ape/dev/ape/R/read.nexus.R I am trying to read.tree a large (300 taxa) nwk tree, and I continue to get this error: pltr-read.tree(plant_all_1.nwk) # There is apparently two root edges in your file: cannot read tree file. # Reading Newick file aborted at tree no.1 I have tried it with .nex, and .nwk and both give the same error. I can open this file in any other tree viewer (like FigTree and Mega...) with no problem, and the root is clearly separated. I have looked through the .nwk file but cannot find where there might be a problem. I have also tried removing the first and last ( ) and the tree will read, but then plotting the tree gives this error: # Error in plot.phylo(pltree) : # there are single (non-splitting) nodes in your tree; you may need to use collapse.singles() # Calls: plot - plot.phylo and collapse.singles() doesn't fix it. I have tried to simulate smaller trees to get the error, but I can't seem to reproduce the situation. Does ape have some sort of upper limit on tree size? The nwk file is attached. I apologise in advance if I am missing something simple. As a sidenote, I have been using ape and other r-phylo packages for only a short time, and I have all but abandoned most other phylo software in the process - very nice job developing these packages and to EP for a terrific book! My dissertation is indebted to you. Thank you, James -- James Meadow Land Resources and Environmental Sciences Montana State University (406) 370-7157 jfmea...@gmail.com ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo [[alternative HTML version deleted]] ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
[R-sig-phylo] Phyloinformatics Summer of Code 2011: Call for Mentors
(Apologies if you receive multiple copies.) Over the next 3 weeks we will be pulling together NESCent's application to the 2011 Google Summer of Code as a mentoring organization. This is a call for all prospective mentors, primary and secondary, to step forward. Participating as an organization is competitive. Over the last years the acceptance rate for organizations has been around 30-35%. The most important component of organization applications is the Ideas page, and specifically the quality and suitability of the project ideas. These project ideas are contributed by you, our mentors. In the past we have had a strong, diverse and well-documented portfolio of ideas with different degrees of difficulty, from different participating open-source projects, using different programming languages. If you can fancy yourself serving as a mentor, or helping someone else mentoring a student as a secondary mentor, or would like to help out in other capacities, please contact us as soon as you can at phylosoc-ad...@nescent.org . If you have not been a mentor with us in previous years, we'll send you guidance on what doing so involves, and how you can contribute to our participation. We will also add everyone who is interested in serving to our (private) mentors mailing list (at least those who aren't already). If you are new to Summer of Code and wonder what it takes or what it is like to be a mentor for us, don't hesitate to ask questions or to contact previous mentors (see URLs below for projects that got selected). Being a mentor does require time (see http://bit.ly/soc2011-mentortime) , but our past mentors have pretty much unanimously found it a fun and rewarding experience. That's aside from the code a student could contribute to your project, and, possibly most important of all in the long run, the chance to gain a new developer. The initial skeleton of our 2011 Ideas page is now up here and ready for adding project ideas and mentors(*). http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2011 We will send further guidance on drafting project ideas, but for now you can see examples of the format and scope of project ideas on the Ideas pages from previous years (click on Ideas): http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2010 http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2009 http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2008 http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2007 Dates: == Submission of organization applications starts Feb 28 and closes on March 11. For project ideas to contribute to the strength of our application they must be in reasonable shape by the morning of March 11. *If* we are accepted, ideas can be refined (or added) between March 18-27. Students apply March 28-April 8, and selected students are announced April 25. The coding period runs from May 23 to August 22. See http://bit.ly/soc2011-timeline for a full timeline of the whole program. Cheers, and we look forward to hearing from you! Karen Cranston Hilmar Lapp (*) Editing content on the NESCent Informatics wiki (f.k.a. Hackathon wiki) requires you to login. We had to disable local account creation due to spam getting out of control. The wiki is still open, though - just login with your OpenID. If you don't have an OpenID, the Login with OpenID page has information on you can easily get one, and if you have a Google account, you're all set to go. ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] R: Re: R: Re: Simulation of Continuous Trait with Active Trend
Hi Pasquale, To follow up on my earlier message, I have written a preliminary version of a simple function to simultaneously estimate the evolutionary parameters and ancestral states for Brownian evolution with a trend using likelihood. I will paste it at the end of this email. I have confirmed that the function returns the same trend parameter as fitContinuous(model=trend) and the same ancestral character estimates as ace() - when mu (the trend parameter) is forced to zero [note that this can only be done by modifying the code]. The parameter sig2, the Brownian rate, is biased by a factor n/(2n-2) for n species relative to fitContinuous(model=trend)$Trait1$beta. Note that this model can generally only be fit for a tree with tips that are not contemporaneous - for instance, for a phylogeny containing fossil species. Best, Liam Function code (also http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/anc.trend/v0.1/anc.trend.R): # written by Liam J. Revell 2011 anc.trend-function(phy,x,maxit=2000){ # preliminaries # require dependencies require(ape) # set global tol-1e-8 # compute C D-dist.nodes(phy) ntips-length(phy$tip.label) Cii-D[ntips+1,] C-D; C[,]-0 counts-vector() for(i in 1:nrow(D)) for(j in 1:ncol(D)) C[i,j]-(Cii[i]+Cii[j]-D[i,j])/2 dimnames(C)[[1]][1:length(phy$tip)]-phy$tip.label dimnames(C)[[2]][1:length(phy$tip)]-phy$tip.label C-C[c(1:ntips,(ntips+2):nrow(C)),c(1:ntips,(ntips+2):ncol(C))] # sort x by phy$tip.label x-x[phy$tip.label] # function returns the negative log-likelihood likelihood-function(theta,x,C){ a-theta[1] u-theta[2] sig2-theta[3] y-theta[4:length(theta)] logLik-dmnorm(x=c(x,y),mean=(a+diag(C)*u),varcov=sig2*C,log=TRUE) return(-logLik) } # get reasonable starting values for the optimizer a-mean(x) sig2-var(x)/max(C) # perform ML optimization result-optim(par=c(a,0,sig2,rep(a,phy$Nnode-1)),likelihood,x=x,C=C,method=L-BFGS-B,lower=c(-Inf,-Inf,tol,rep(-Inf,phy$Nnode-1)),control=list(maxit=maxit)) # return the result ace-c(result$par[c(1,4:length(result$par))]); names(ace)-c(as.character(phy$edge[1,1]),rownames(C)[(length(phy$tip.label)+1):nrow(C)]) return(list(ace=ace,mu=result$par[2],sig2=result$par[3],logL=-result$value,convergence=result$convergence,message=result$message)) } -- Liam J. Revell University of Massachusetts Boston web: http://www.bio.umb.edu/facstaff/faculty_Revell.html (new) email: liam.rev...@umb.edu (new) blog: http://phytools.blogspot.com On 2/17/2011 12:45 PM, Liam J. Revell wrote: Hi Pasquale, I think this may be possible if all tips are not contemporaneous. As Gene points out, the values at the tips of the tree given a trend should be distributed as ~MVN(mu=diag(vcv(tree))*tr+ancestor, V=vcv(tree)*sig2). Thus, we should be able to compute the likelihood of any data pattern at the tips and internal nodes of the tree, given tr, ancestor, and sig2. If we start with only the values at the tips and the phylogeny, we can estimate the states at internal nodes and our model by maximizing the likelihood. This would be our solution. I have written down some code for this using dmnorm() (from the mnormt package) to compute the multivariate normal density; a home-made function to compute vcv(tree) but for all the nodes in the tree; and optim() to maximize the likelihood - but it doesn't seem to work very well yet. I will pass it along if I can figure it out. - Liam ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
[R-sig-phylo] R: Re: R: Re: R: Re: Simulation of Continuous Trait with Active Trend
Hi all, I tried the function written by Liam, it works pretty well! As an indication for users, do not use an ultrametric tree, you'd get an error like this: Error in chol.default(x, pivot = FALSE) : the leading minor of order 140 is not positive definite Error in pd.solve(varcov) : x appears to be not positive definite I applied anc.tree to a multichotomous tree, it still works very well, judging from fitted values at nodes. Pas Messaggio originale Da: liam.rev...@umb.edu Data: 22/02/2011 4.58 A: pasquale.r...@libero.itpasquale.r...@libero.it Cc: R Sig Phylo Listservr-sig-phylo@r-project.org Ogg: Re: [R-sig-phylo] R: Re: R: Re: Simulation of Continuous Trait with Active Trend Hi Pasquale, To follow up on my earlier message, I have written a preliminary version of a simple function to simultaneously estimate the evolutionary parameters and ancestral states for Brownian evolution with a trend using likelihood. I will paste it at the end of this email. I have confirmed that the function returns the same trend parameter as fitContinuous(model=trend) and the same ancestral character estimates as ace() - when mu (the trend parameter) is forced to zero [note that this can only be done by modifying the code]. The parameter sig2, the Brownian rate, is biased by a factor n/(2n-2) for n species relative to fitContinuous(model=trend)$Trait1$beta. Note that this model can generally only be fit for a tree with tips that are not contemporaneous - for instance, for a phylogeny containing fossil species. Best, Liam Function code (also http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/anc.trend/v0.1/anc.trend. R): # written by Liam J. Revell 2011 anc.trend-function(phy,x,maxit=2000){ # preliminaries # require dependencies require(ape) # set global tol-1e-8 # compute C D-dist.nodes(phy) ntips-length(phy$tip.label) Cii-D[ntips+1,] C-D; C[,]-0 counts-vector() for(i in 1:nrow(D)) for(j in 1:ncol(D)) C[i,j]-(Cii[i]+Cii[j]-D[i,j])/2 dimnames(C)[[1]][1:length(phy$tip)]-phy$tip.label dimnames(C)[[2]][1:length(phy$tip)]-phy$tip.label C-C[c(1:ntips,(ntips+2):nrow(C)),c(1:ntips,(ntips+2):ncol(C))] # sort x by phy$tip.label x-x[phy$tip.label] # function returns the negative log-likelihood likelihood-function(theta,x,C){ a-theta[1] u-theta[2] sig2-theta[3] y-theta[4:length(theta)] logLik-dmnorm(x=c(x,y),mean=(a+diag(C)*u),varcov=sig2*C,log=TRUE) return(-logLik) } # get reasonable starting values for the optimizer a-mean(x) sig2-var(x)/max(C) # perform ML optimization result-optim(par=c(a,0,sig2,rep(a,phy$Nnode-1)),likelihood,x=x,C=C,method= L-BFGS-B,lower=c(-Inf,-Inf,tol,rep(-Inf,phy$Nnode-1)),control=list (maxit=maxit)) # return the result ace-c(result$par[c(1,4:length(result$par))]); names(ace)-c(as.character(phy$edge[1,1]),rownames(C)[(length(phy$tip.label) +1):nrow(C)]) return(list(ace=ace,mu=result$par[2],sig2=result$par[3],logL=-result$value, convergence=result$convergence,message=result$message)) } -- Liam J. Revell University of Massachusetts Boston web: http://www.bio.umb.edu/facstaff/faculty_Revell.html (new) email: liam.rev...@umb.edu (new) blog: http://phytools.blogspot.com On 2/17/2011 12:45 PM, Liam J. Revell wrote: Hi Pasquale, I think this may be possible if all tips are not contemporaneous. As Gene points out, the values at the tips of the tree given a trend should be distributed as ~MVN(mu=diag(vcv(tree))*tr+ancestor, V=vcv(tree)*sig2). Thus, we should be able to compute the likelihood of any data pattern at the tips and internal nodes of the tree, given tr, ancestor, and sig2. If we start with only the values at the tips and the phylogeny, we can estimate the states at internal nodes and our model by maximizing the likelihood. This would be our solution. I have written down some code for this using dmnorm() (from the mnormt package) to compute the multivariate normal density; a home-made function to compute vcv(tree) but for all the nodes in the tree; and optim() to maximize the likelihood - but it doesn't seem to work very well yet. I will pass it along if I can figure it out. - Liam ___ R-sig-phylo mailing list R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo