Re: [R-sig-phylo] NEVERMIND -read.tree and read.nexus error

2011-02-21 Thread Emmanuel Paradis
James: it's not the space that is the problem, but one label has 
parentheses:


'Rubus idaeus U06825.1(2)'

Deleting or substituting these parentheses with(out) the spaces and/or 
quotes solves the problem for read.tree(). I remind that this behaviour 
is expected and discussed in ape's FAQ.


Ben: I agree with you and would even go to suggest a revision of the 
Newick format definition to clarify its usage. If I remember correctly, 
the current definition was done in the early 1980's, and it seems 
natural to update it someday. One main issue (to me) is how to 
include/specify other data than the labels and branch lengths.


Emmanuel

Ben Bolker wrote on 21/02/2011 10:31:

  On the other hand, if anyone has any ideas about a paranoid and
verbose Newick file-checker that gives verbose output about what
precisely might be wrong/non-conforming in a Newick file (lint for
Newick for the old-fashioned programmers) to help track this sort of
thing down, that would be probably be very helpful to many on the list ...


On 11-02-20 06:59 PM, James Meadow wrote:

List administrator - Do not post this question.  I picked through the nwk
file again and found a single misplaced space!  it is now fixed.

Sorry about that,
James

On Sun, Feb 20, 2011 at 4:44 PM, James Meadow jfmea...@gmail.com wrote:


Hello,

I have seen this error reported in the help list, but the suggestions do
not fix my problem. I am using a very recent version of ape, and I have also
tried the patch given here:
https://svn.mpl.ird.fr/ape/dev/ape/R/read.nexus.R

I am trying to read.tree a large (300 taxa) nwk tree, and I continue to
get this error:


pltr-read.tree(plant_all_1.nwk)

  # There is apparently two root edges in your file: cannot read tree file.
  # Reading Newick file aborted at tree no.1

I have tried it with .nex, and .nwk and both give the same error.  I can
open this file in any other tree viewer (like FigTree and Mega...) with no
problem, and the root is clearly separated.  I have looked through the .nwk
file but cannot find where there might be a problem.  I have also tried
removing the first and last ( ) and the tree will read, but then plotting
the tree gives this error:

  # Error in plot.phylo(pltree) :
  # there are single (non-splitting) nodes in your tree; you may need to
use collapse.singles()
  # Calls: plot - plot.phylo

and collapse.singles() doesn't fix it.  I have tried to simulate smaller
trees to get the error, but I can't seem to reproduce the situation.  Does
ape have some sort of upper limit on tree size?

The nwk file is attached.

I apologise in advance if I am missing something simple.

As a sidenote, I have been using ape and other r-phylo packages for only a
short time, and I have all but abandoned most other phylo software in the
process - very nice job developing these packages and to EP for a terrific
book!  My dissertation is indebted to you.

Thank you,
James




--
James Meadow
Land Resources and Environmental Sciences
Montana State University
(406) 370-7157
jfmea...@gmail.com






___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] NEVERMIND -read.tree and read.nexus error

2011-02-21 Thread François Michonneau
The package phylobase uses NCL to parse Newick files which provides some
information when a non-valid file is being imported. It's not perfect, but
it will at least give you an idea of the part that is not correctly
formated.

  -- François


On Sun, Feb 20, 2011 at 22:31, Ben Bolker bbol...@gmail.com wrote:

  On the other hand, if anyone has any ideas about a paranoid and
 verbose Newick file-checker that gives verbose output about what
 precisely might be wrong/non-conforming in a Newick file (lint for
 Newick for the old-fashioned programmers) to help track this sort of
 thing down, that would be probably be very helpful to many on the list ...


 On 11-02-20 06:59 PM, James Meadow wrote:
  List administrator - Do not post this question.  I picked through the nwk
  file again and found a single misplaced space!  it is now fixed.
 
  Sorry about that,
  James
 
  On Sun, Feb 20, 2011 at 4:44 PM, James Meadow jfmea...@gmail.com
 wrote:
 
  Hello,
 
  I have seen this error reported in the help list, but the suggestions do
  not fix my problem. I am using a very recent version of ape, and I have
 also
  tried the patch given here:
  https://svn.mpl.ird.fr/ape/dev/ape/R/read.nexus.R
 
  I am trying to read.tree a large (300 taxa) nwk tree, and I continue to
  get this error:
 
  pltr-read.tree(plant_all_1.nwk)
# There is apparently two root edges in your file: cannot read tree
 file.
# Reading Newick file aborted at tree no.1
 
  I have tried it with .nex, and .nwk and both give the same error.  I can
  open this file in any other tree viewer (like FigTree and Mega...) with
 no
  problem, and the root is clearly separated.  I have looked through the
 .nwk
  file but cannot find where there might be a problem.  I have also tried
  removing the first and last ( ) and the tree will read, but then
 plotting
  the tree gives this error:
 
# Error in plot.phylo(pltree) :
# there are single (non-splitting) nodes in your tree; you may need to
  use collapse.singles()
# Calls: plot - plot.phylo
 
  and collapse.singles() doesn't fix it.  I have tried to simulate smaller
  trees to get the error, but I can't seem to reproduce the situation.
  Does
  ape have some sort of upper limit on tree size?
 
  The nwk file is attached.
 
  I apologise in advance if I am missing something simple.
 
  As a sidenote, I have been using ape and other r-phylo packages for only
 a
  short time, and I have all but abandoned most other phylo software in
 the
  process - very nice job developing these packages and to EP for a
 terrific
  book!  My dissertation is indebted to you.
 
  Thank you,
  James
 
 
 
 
  --
  James Meadow
  Land Resources and Environmental Sciences
  Montana State University
  (406) 370-7157
  jfmea...@gmail.com
 
 
 
 

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[R-sig-phylo] Phyloinformatics Summer of Code 2011: Call for Mentors

2011-02-21 Thread Hilmar Lapp

(Apologies if you receive multiple copies.)

Over the next 3 weeks we will be pulling together NESCent's  
application to the 2011 Google Summer of Code as a mentoring  
organization. This is a call for all prospective mentors, primary and  
secondary, to step forward.


Participating as an organization is competitive. Over the last years  
the acceptance rate for organizations has been around 30-35%. The most  
important component of organization applications is the Ideas page,  
and specifically the quality and suitability of the project ideas.  
These project ideas are contributed by you, our mentors. In the past  
we have had a strong, diverse and well-documented portfolio of ideas  
with different degrees of difficulty, from different participating  
open-source projects, using different programming languages.


If you can fancy yourself serving as a mentor, or helping someone else  
mentoring a student as a secondary mentor, or would like to help out  
in other capacities, please contact us as soon as you can at phylosoc-ad...@nescent.org 
. If you have not been a mentor with us in previous years, we'll send  
you guidance on what doing so involves, and how you can contribute to  
our participation. We will also add everyone who is interested in  
serving to our (private) mentors mailing list (at least those who  
aren't already).


If you are new to Summer of Code and wonder what it takes or what it  
is like to be a mentor for us, don't hesitate to ask questions or to  
contact previous mentors (see URLs below for projects that got  
selected). Being a mentor does require time (see http://bit.ly/soc2011-mentortime) 
, but our past mentors have pretty much unanimously found it a fun and  
rewarding experience. That's aside from the code a student could  
contribute to your project, and, possibly most important of all in the  
long run, the chance to gain a new developer.


The initial skeleton of our 2011 Ideas page is now up here and ready  
for adding project ideas and mentors(*).


http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2011

We will send further guidance on drafting project ideas, but for now  
you can see examples of the format and scope of project ideas on the  
Ideas pages from previous years (click on Ideas):


http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2010
http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2009
http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2008
http://informatics.nescent.org/wiki/Phyloinformatics_Summer_of_Code_2007

Dates:
==

Submission of organization applications starts Feb 28 and closes on  
March 11. For project ideas to contribute to the strength of our  
application they must be in reasonable shape by the morning of March  
11. *If* we are accepted, ideas can be refined (or added) between  
March 18-27.


Students apply March 28-April 8, and selected students are announced  
April 25. The coding period runs from May 23 to August 22. See

http://bit.ly/soc2011-timeline for a full timeline of the whole program.

Cheers, and we look forward to hearing from you!

   Karen Cranston
   Hilmar Lapp

(*) Editing content on the NESCent Informatics wiki (f.k.a. Hackathon  
wiki) requires you to login. We had to disable local account creation  
due to spam getting out of control. The wiki is still open, though -  
just login with your OpenID. If you don't have an OpenID, the Login  
with OpenID page has information on you can easily get one, and if  
you have a Google account, you're all set to go.


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] R: Re: R: Re: Simulation of Continuous Trait with Active Trend

2011-02-21 Thread Liam J. Revell

Hi Pasquale,

To follow up on my earlier message, I have written a preliminary version 
of a simple function to simultaneously estimate the evolutionary 
parameters and ancestral states for Brownian evolution with a trend 
using likelihood.  I will paste it at the end of this email.  I have 
confirmed that the function returns the same trend parameter as 
fitContinuous(model=trend) and the same ancestral character estimates 
as ace() - when mu (the trend parameter) is forced to zero [note that 
this can only be done by modifying the code].  The parameter sig2, the 
Brownian rate, is biased by a factor n/(2n-2) for n species relative to 
fitContinuous(model=trend)$Trait1$beta.


Note that this model can generally only be fit for a tree with tips that 
are not contemporaneous - for instance, for a phylogeny containing 
fossil species.


Best, Liam

Function code (also 
http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/anc.trend/v0.1/anc.trend.R):


# written by Liam J. Revell 2011
anc.trend-function(phy,x,maxit=2000){

# preliminaries
# require dependencies
require(ape)
# set global
tol-1e-8
# compute C
D-dist.nodes(phy)
ntips-length(phy$tip.label)
Cii-D[ntips+1,]
C-D; C[,]-0
counts-vector()
for(i in 1:nrow(D)) for(j in 1:ncol(D))
C[i,j]-(Cii[i]+Cii[j]-D[i,j])/2
dimnames(C)[[1]][1:length(phy$tip)]-phy$tip.label
dimnames(C)[[2]][1:length(phy$tip)]-phy$tip.label
C-C[c(1:ntips,(ntips+2):nrow(C)),c(1:ntips,(ntips+2):ncol(C))]
# sort x by phy$tip.label
x-x[phy$tip.label]

# function returns the negative log-likelihood
likelihood-function(theta,x,C){
a-theta[1]
u-theta[2]
sig2-theta[3]
y-theta[4:length(theta)]

logLik-dmnorm(x=c(x,y),mean=(a+diag(C)*u),varcov=sig2*C,log=TRUE)
return(-logLik)
}

# get reasonable starting values for the optimizer
a-mean(x)
sig2-var(x)/max(C)

# perform ML optimization

result-optim(par=c(a,0,sig2,rep(a,phy$Nnode-1)),likelihood,x=x,C=C,method=L-BFGS-B,lower=c(-Inf,-Inf,tol,rep(-Inf,phy$Nnode-1)),control=list(maxit=maxit))

# return the result
	ace-c(result$par[c(1,4:length(result$par))]); 
names(ace)-c(as.character(phy$edge[1,1]),rownames(C)[(length(phy$tip.label)+1):nrow(C)])


return(list(ace=ace,mu=result$par[2],sig2=result$par[3],logL=-result$value,convergence=result$convergence,message=result$message))

}

--
Liam J. Revell
University of Massachusetts Boston
web: http://www.bio.umb.edu/facstaff/faculty_Revell.html
(new) email: liam.rev...@umb.edu
(new) blog: http://phytools.blogspot.com

On 2/17/2011 12:45 PM, Liam J. Revell wrote:

Hi Pasquale,

I think this may be possible if all tips are not contemporaneous.

As Gene points out, the values at the tips of the tree given a trend
should be distributed as ~MVN(mu=diag(vcv(tree))*tr+ancestor,
V=vcv(tree)*sig2). Thus, we should be able to compute the likelihood of
any data pattern at the tips and internal nodes of the tree, given tr,
ancestor, and sig2. If we start with only the values at the tips and the
phylogeny, we can estimate the states at internal nodes and our model by
maximizing the likelihood. This would be our solution.

I have written down some code for this using dmnorm() (from the mnormt
package) to compute the multivariate normal density; a home-made
function to compute vcv(tree) but for all the nodes in the tree; and
optim() to maximize the likelihood - but it doesn't seem to work very
well yet. I will pass it along if I can figure it out.

- Liam



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[R-sig-phylo] R: Re: R: Re: R: Re: Simulation of Continuous Trait with Active Trend

2011-02-21 Thread pasquale.r...@libero.it


Hi all,

I tried the function written by Liam, it works pretty well! As an indication 
for users, do not use an ultrametric tree, you'd get an error like this:

Error in chol.default(x, pivot = FALSE) : 
  the leading minor of order 140 is not positive definite
Error in pd.solve(varcov) : x appears to be not positive definite

I applied anc.tree to a multichotomous tree, it still works very well, judging 
from fitted values at nodes.
Pas








Messaggio originale
Da: liam.rev...@umb.edu
Data: 22/02/2011 4.58
A: pasquale.r...@libero.itpasquale.r...@libero.it
Cc: R Sig Phylo Listservr-sig-phylo@r-project.org
Ogg: Re: [R-sig-phylo] R: Re: R: Re: Simulation of Continuous Trait with 
Active Trend

Hi Pasquale,

To follow up on my earlier message, I have written a preliminary version 
of a simple function to simultaneously estimate the evolutionary 
parameters and ancestral states for Brownian evolution with a trend 
using likelihood.  I will paste it at the end of this email.  I have 
confirmed that the function returns the same trend parameter as 
fitContinuous(model=trend) and the same ancestral character estimates 
as ace() - when mu (the trend parameter) is forced to zero [note that 
this can only be done by modifying the code].  The parameter sig2, the 
Brownian rate, is biased by a factor n/(2n-2) for n species relative to 
fitContinuous(model=trend)$Trait1$beta.

Note that this model can generally only be fit for a tree with tips that 
are not contemporaneous - for instance, for a phylogeny containing 
fossil species.

Best, Liam

Function code (also 
http://anolis.oeb.harvard.edu/~liam/R-phylogenetics/anc.trend/v0.1/anc.trend.
R):

# written by Liam J. Revell 2011
anc.trend-function(phy,x,maxit=2000){

   # preliminaries
   # require dependencies
   require(ape)
   # set global
   tol-1e-8
   # compute C
   D-dist.nodes(phy)
   ntips-length(phy$tip.label)
   Cii-D[ntips+1,]
   C-D; C[,]-0
   counts-vector()
   for(i in 1:nrow(D)) for(j in 1:ncol(D))
   C[i,j]-(Cii[i]+Cii[j]-D[i,j])/2
   dimnames(C)[[1]][1:length(phy$tip)]-phy$tip.label
   dimnames(C)[[2]][1:length(phy$tip)]-phy$tip.label
   C-C[c(1:ntips,(ntips+2):nrow(C)),c(1:ntips,(ntips+2):ncol(C))]
   # sort x by phy$tip.label
   x-x[phy$tip.label]

   # function returns the negative log-likelihood
   likelihood-function(theta,x,C){
   a-theta[1]
   u-theta[2]
   sig2-theta[3]
   y-theta[4:length(theta)]
   
 logLik-dmnorm(x=c(x,y),mean=(a+diag(C)*u),varcov=sig2*C,log=TRUE)
   return(-logLik)
   }

   # get reasonable starting values for the optimizer
   a-mean(x)
   sig2-var(x)/max(C)

   # perform ML optimization
   
 result-optim(par=c(a,0,sig2,rep(a,phy$Nnode-1)),likelihood,x=x,C=C,method=
L-BFGS-B,lower=c(-Inf,-Inf,tol,rep(-Inf,phy$Nnode-1)),control=list
(maxit=maxit))

   # return the result
   ace-c(result$par[c(1,4:length(result$par))]); 
names(ace)-c(as.character(phy$edge[1,1]),rownames(C)[(length(phy$tip.label)
+1):nrow(C)])
   
 return(list(ace=ace,mu=result$par[2],sig2=result$par[3],logL=-result$value,
convergence=result$convergence,message=result$message))

}

-- 
Liam J. Revell
University of Massachusetts Boston
web: http://www.bio.umb.edu/facstaff/faculty_Revell.html
(new) email: liam.rev...@umb.edu
(new) blog: http://phytools.blogspot.com

On 2/17/2011 12:45 PM, Liam J. Revell wrote:
 Hi Pasquale,

 I think this may be possible if all tips are not contemporaneous.

 As Gene points out, the values at the tips of the tree given a trend
 should be distributed as ~MVN(mu=diag(vcv(tree))*tr+ancestor,
 V=vcv(tree)*sig2). Thus, we should be able to compute the likelihood of
 any data pattern at the tips and internal nodes of the tree, given tr,
 ancestor, and sig2. If we start with only the values at the tips and the
 phylogeny, we can estimate the states at internal nodes and our model by
 maximizing the likelihood. This would be our solution.

 I have written down some code for this using dmnorm() (from the mnormt
 package) to compute the multivariate normal density; a home-made
 function to compute vcv(tree) but for all the nodes in the tree; and
 optim() to maximize the likelihood - but it doesn't seem to work very
 well yet. I will pass it along if I can figure it out.

 - Liam



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo