from:"Brian O'Meara"

Re: [R-sig-phylo] collapsing sets of nodes based on label values

2020-05-22 Thread Brian O'Meara

You might need to reindex each time, something like (pseudocode):

bad.nodes <- function_to_get_node_numbers_with_support_less_than_90(phy)
while(length(bad.nodes>0)) {
   phy <- collapse.to.star(phy, bad.nodes[1])
   bad.nodes <- function_to_get_node_numbers_with_support_less_than_90(phy)
}

But this will collapse the bad node and all its descendants. I don't see a
function that just collapses at a node and leaves descendants intact, but
this may be what you're looking for (the stuff that ape::collapse.singles
does is similar, and could perhaps be modified).

One pedantic point -- we often talk about node support, collapsing nodes,
etc., but it's really about the branch: the node doesn't have 83% bootstrap
support but the bipartition created by the edge subtending it does. For
example, if the tree is (A,(B,(C,(D,E, if the support for the CDE clade
is 83%, it's actually the proportion of trees that have a split between AB
| CDE, so the branch separating the CDE clade from everything else. The
node itself there has one branch coming in (that comes from the rest of the
tree containing A and B) and two branches coming out (one going to C, one
going to DE). We could summarize trees by actual node support: how many
times we have a node that has an edge going to AB, an edge going to C, and
an edge going to DE, but we typically don't. [For example, the branch could
separate AB | CDE, and the node is AB | C | DE, but one could also have a
node that is AB | CD | E for the same bipartition on the subtending edge].
I only bring it up here because if you end up having to write code to do
collapsing, it can be important to remember that what is actually being
collapsed is the edge subtending the labeled node because it is the the
edge with low support.

Best,
Brian

_______
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His

On Fri, May 22, 2020 at 9:42 AM Karla Shikev  wrote:

> Dear all,
>
> I'm surprised I could not find a simple function to do this in any of the
> packages I know. I just want to take a tree with node labels (e.g.
> bootstrap values from a RAxML analysis) and collapse all nodes with support
> values below, say, 90%. I tried to do this recursively using
> collapse.to.star in phytools, but node numbers change after the first
> collapsed node and my indexing gets completely off.
>
> with best regards,
>
> Karla
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Alllowing and considering "missing data" as values for reconstruction of ancestral character states

2020-04-14 Thread Brian O'Meara

We corresponded with Carolina offline, but here I'm including a generalized
solution of the quick solution we just sent her in case it's useful for
others (avoiding the issue lampooned in https://xkcd.com/979/).

The data was a delimited file with a species column and columns for each
character, with column labels "species" and then the character numbers. So,
loaded into R after calling the relevant libraries:

library(ape)
library(corHMM)
library(geiger)

phy <- read.nexus("cladeb_tree.nex")
data <- read.delim("cladeb_sexch.txt", stringsAsFactors = FALSE)

corHMM can handle polymorphic data, but it needs to be in format 0&3, not
0+3, which was how it was coded initially. So change that:

for (i in 2:ncol(data)) {
  data[,i] <- gsub('\\+', '&', data[,i])
}

Then reconstruct ancestral states and show the output in a pdf. Loop over
each character, deleting taxa from the matrix that have an NA and deleting
taxa from the matrix and from the tree that aren't on both:

trait_indices <- c(2:ncol(data))
pdf(file="CarolinaPlots.pdf")
for (i in seq_along(trait_indices)) {
  col_num <- trait_indices[i]
  data_local <- data[,c(1,col_num)]
  data_local <- data_local[!is.na(data_local[,2]),]
  data_local <- data_local[data_local$species%in%phy$tip.label,]
  bad_tips <- phy$tip.label[!phy$tip.label%in%data_local$species]
  phy_local <- phy
  if(length(bad_tips)>0) {
phy_local <- ape::drop.tip(phy, bad_tips)
  }
  recon <- rayDISC(phy_local, data_local, model="ARD")
  plotRECON(recon$phy,recon$states,title=paste0(colnames(data)[col_num], ":
", paste(names(table(data_local[,2])), "=", table(data_local[,2]),
collapse=", ")))
}
dev.off()

Hopefully a few general tips there. Slightly better style would be to
preface most functions with the package name: use corHMM::rayDISC() rather
than rayDISC -- this is becoming more expected in packages.

Best,
Brian

___
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Mon, Apr 13, 2020 at 4:18 PM Carolina Santos Vieira <
carolsantosvie...@gmail.com> wrote:

> Hi Liam and everyone
>
> The numbers from this message "1000 trees with a mapped discrete character
> with states: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
> 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55" are the same numbers from
> the nodes on my tree. As I replied to Brian, I assumed that this was a list
> of which states were mapped during make.simmap simulation. This warning
> message only returns when applying the ARD model.
> This character in particular is binary, and has no missing data or
> non-applicable coding.
>
> As I mentioned previously, a big part of my dataset contains polymorphic
> and non-applicable coding, and characters with 3 states (0,1,2). For those,
> when the script runs, the likelihoods are very high, and plots don't agree
> with coding.
>
> Thank you so much for your replies.
> Best,
> Carolina Vieira
> Bióloga - Mestre em Ecologia e Conservação
> Doutoranda em Biologia Animal (PPGBAN - UFRGS)
>
>
>
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
>  Livre
> de vírus. www.avast.com
> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>.
>
> <#m_2399085183096779712_m_1993609250326834374_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>
> Em seg., 13 de abr. de 2020 às 16:57, Liam J. Revell 
> escreveu:
>
>> Hi Carolina.
>>
>> I agree with Brian that a Q matrix in which the elements of a row are
>> all zeros is not likely to be you problem. It just means that the ML
>> estimated transition rate from 1->0, in your case, is zero.
>>
>> I agree with Brian that the message "1000 trees with a mapped discrete
>> character with states: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
>> 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55" is
>> surprising. Does your discrete character really have 27 states?
>>
>> All the best, Liam
>>
>> Liam J. Revell
>> Associate Professor, University of Massachusetts Boston
>> Profesor Asistente, Universidad Católica de la Ssma Concepción
>> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>>
>> Academic Director UMass Boston Chile Abroad:
>> https://www.umb.edu/academics/caps/international/biology_chile
>>
>> On 4/13/2020 3:23 PM, Brian O'Meara wrote:
>

Re: [R-sig-phylo] Alllowing and considering "missing data" as values for reconstruction of ancestral character states

2020-04-13 Thread Brian O'Meara

With raydisc, you need one column with the species name and a column with
the data; so something like data[,c(1,2)] might be needed. If you want, you
could email the files (tree, data, script to run) to Jeremy (
jmbea...@uark.edu) and me to check out.

For the simmap results, a rate of zero can be the estimate if it seems that
there are no transitions from state 0 to 1. But the number of discrete
states at the end ("1000 trees with a mapped discrete character with
states: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53, 54, 55") seems unexpected to me.

Best,
Brian


_______
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Mon, Apr 13, 2020 at 2:32 PM Carolina Santos Vieira <
carolsantosvie...@gmail.com> wrote:

> Hi Liam and all users
>
> My morphological data varies from binary and multisate characters,
> including polymorphism and non-applicable coding for many of those
> characters. I've made a lot of progress after Liam's advices, but lately I
> have been getting an error from make.simmap.
>
> By trying to run a simulation with make.simmap under ARD model on
> characters with states that the probability is zero on occuring in a given
> terminal,  after calling summary(), this error is displayed:
>
> Error in colMeans(x$count) : 'x' must be an array of at least two
> dimensions
>
> When changing the input of the same data in a two column array, the error
> still persists. The same data was analyzed with make.simmap under "ER" and
> "SYM" models, and they run just fine. I am also comparing the results using
> fitDiscrete (geiger) and ace (ape) functions and they display results for
> "ARD" model without a glitch. According to output from fitDiscrete
> function, ARD model is indicated as the "best" model comparing AIC
> likelihoods.
> I've noticed that in this particular character, the "Q" matrix extracted
> from make.simmap simmulation under "ARD" is the following:
>
> Q =
>   01
> 0 -1.916782 1.916782
> 1  0.00 0.00
>
> Are those "zero" probabilities the cause of the error? And by that I mean,
> should I take from this evidence that "ARD" is not an appropriate model to
> this character?
>
> Also, results from characters with "non-applicable" coding (NA in matrix)
> for some terminal are coming out very different from what I would expect.
> Is this coding allowed for ancestral reconstruction?
> I am sorry for the many questions, and I would be very glad if any
> suggestions come to mind.
>
> Best wishes,
> Carolina Vieira.
>
> -
> This is the error rundown:
>
> > anc.tree_ard<-make.simmap(tree,ch434,model="ARD")
> make.simmap is sampling character histories conditioned on the transition
> matrix
>
> Q =
>   01
> 0 -1.916782 1.916782
> 1  0.00 0.00
> (estimated using likelihood);
> and (mean) root node prior probabilities
> pi =
>   0   1
> 0.5 0.5
> Done.
> > plot(anc.tree_ard,node.numbers=T,cores,fsize=0.5,ftype="i")
> > add.simmap.legend(colors=cores,prompt=FALSE,x=0.5*par()$usr[1],
> +   y=-30*par()$usr[3],fsize=0.8)
> > title(main="ACSR \"ARD\" model (make.simmap)")
> > anc.tree_ard<-make.simmap(tree,ch434,model="ARD",nsim=1000)
> make.simmap is sampling character histories conditioned on the transition
> matrix
>
> Q =
>   01
> 0 -1.916782 1.916782
> 1  0.00 0.00
> (estimated using likelihood);
> and (mean) root node prior probabilities
> pi =
>   0   1
> 0.5 0.5
> Done.
> > anc.tree_ard
> 1000 phylogenetic trees with mapped discrete characters
> > simmap_ard<-summary(anc.tree_ard)
> > simmap_ard
> 1000 trees with a mapped discrete character with states:
>  29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
> 47, 48, 49, 50, 51, 52, 53, 54, 55
>
> Error in colMeans(x$count) :  'x' must be an array of at least two
> dimensions
>
>
>
> Carolina Vieira
> Bióloga - Mestre em Ecologia e Conservação
> Doutoranda em Biologia Animal (PPGBAN - UFRGS)
>
>
> <
> https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail
> >
> Livre
> de vírus. www.avast.com
> <
> https://www.avast.com/sig-email?utm_medium=email&utm_source

Re: [R-sig-phylo] ancestral state transitions clustered in time?

2020-04-09 Thread Brian O'Meara

You can use something like corHMM (https://doi.org/10.1093/sysbio/syt034)
and perhaps its rayDISC function to find parts of the tree where the
transition rates between states are different (for your question, where
rates might be higher). Basically, it allows a hidden "trait" (which isn't
necessarily a single trait, just some thing or things that change on the
tree) to affect the transition rates. A good example is
https://academic.oup.com/view-large/figure/115233499/syt034f3.jpeg -- you
can see where transition rates are higher. The nice thing (though note my
conflict of interest, as a coauthor) is that it allows the rates to
actually change over the tree; different approaches that use a single rate
matrix over the whole tree and then reconstruct changes (using things such
as stochastic character mapping) are essentially using a model to infer
things in order to find deviations from a model. One downside of the corHMM
model is that the hidden changes evolve under a continuous time Markov
model -- fine if you think whatever factors affect character rate
(biogeographic changes, presence of certain other species, new
morphological traits, etc.) evolve in this way (which is true for nearly
all of them) but not if a single factor appears suddenly (i.e., radical
change in environment for all species once an asteroid his the planet).
CorHMM might have a timeslice function to allow this, but I'm not sure.

If you have a lot of characters, a different approach could be to estimate
branch lengths under something like a Lewis MKV model, which basically
gives you amount of change averaged across all characters, and then compare
the ratio of those branch lengths to the chronogram branch lengths for
those same edges. Those branches with higher than the average ratio have
more character change per unit of time than other branches. There's
probably a paper or papers that do this, but I don't recall any at the
moment, my apologies.

If the question is about two rate regimes (i.e., before or after a massive
event) or gradual change of rates, you could do tree transformations: use
Geiger's tree transformation function with discrete characters to find the
transformations that best fit the data -- maybe more change earlier than
later, a discrete shift in rate 30 MY ago, etc.

Some standard caveats on these and other potential solutions: ancestral
state reconstruction is very hard to do well: N tips have N-1 ancestral
states to estimate, plus rate parameters. That's a lot, which is why
methods that fit overall rates could be better than reconstructing many
changes. But it's possible to over-interpret these, too: you can in theory
from corHMM get an estimate of the support for every hidden state and rates
for every node, but I wouldn't read a lot into every single node. Also,
there's still the challenge of Maddison & FitzJohn (2015):
https://doi.org/10.1093/sysbio/syu070. It still scares me, personally, and
the issues raised affect our interpretation of many methods that look at
rates, such as the ones here.

Best,
Brian

_______
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His

On Thu, Apr 9, 2020 at 2:55 PM Elizabeth Miller  wrote:

> I am interested in testing whether ancestral state changes (e.g. habitat
> transitions) are clustered in time and/or in specific lineages rather than
> randomly distributed on phylogeny. Does a method already exist for testing
> this? I am aware of BayesTraits for testing correlation among states, but
> not a method for clustered distribution of state changes with time. What
> immediately comes to mind is simulating an expected temporal distribution
> of state changes if changes were random and comparing to observed temporal
> distribution, but I just want to check if something more sophisticated
> exists.
>
> Thank you!
>
> --
> Elizabeth Miller, Ph.D.
> NSF Postdoctoral Research Fellow
> School of Aquatic and Fishery Sciences
> University of Washington
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Phylogenetic varying slopes and intercepts model

2019-09-11 Thread Brian O'Meara

Also worth noting is that people often use taxonomic levels if they don't
have a tree. Trees are more available than might be expected (though still
far less available than they should be), so you can get a tree and use
phylogenetic comparative methods.

Sources of trees:

https://tree.opentreeoflife.org/
https://www.treebase.org/
https://phylo.cs.nmsu.edu/
http://datelife.org/ (though the datelife R package
<https://github.com/phylotastic/datelife> may be more useful, depending on
scale)
https://datadryad.org/

Plus people make individual trees available: large trees for birds, sharks
and rays, plants, etc. that may not be deposited into a formal repository
but are on a website, plus trees stored as supplemental info in papers. So
if you're using taxonomy as a predictor because there doesn't seem to be a
good tree, there might actually be one with a bit of digging.

Best,
Brian

___________
Brian O'Meara, http://brianomeara.info, especially Calendar
<http://brianomeara.info/calendar.html>, CV
<http://brianomeara.info/cv.html>, and Feedback
<http://brianomeara.info/feedback.html>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His

On Wed, Sep 11, 2019 at 1:33 AM Joe Felsenstein 
wrote:

> Oscar Inostraza --
>
> Is the variable  x also evolving on the tree?  If so you need to use
> standard phylogenetically-informed comparative methods to estimate the
> variances and covariances of changes in both characters.
>
> You may not be able to assume that  y   responds instantly to  x.
>
> J.F.
> 
> Joe Felsenstein, j...@gs.washington.edu
> Department of Genome Sciences, University of Washington, Box 355065,
> Seattle, WA. 98195-5065 USA
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] model averaging using brownie.lite

2019-09-05 Thread Brian O'Meara

ame number of sites, I bet the first
tree has more meaningful data: the pectinate tree with those branch lengths
will likely have all but one of the taxa having nearly identical states. So
I think tree shape and branch lengths should matter for this. I've done
some preliminary analyses on this, building on Beaulieu et al. (2018) and
Jhwueng et al. (2014,  https://doi.org/10.1515/sagmb-2013-0048, also note
COI), but nothing definitive yet.

It's also worth looking at Ho and Ané (2014,
https://doi.org/10./2041-210X.12285) who talk about AIC in the context
of OU shifts, but who get into sample size with shifts in a modified BIC
that uses taxa in different regimes as sample size (but again, univariate,
so maybe it's actually matrix size).

I also probably am missing important work by others -- my apologies if so.
If you know of any, please let me know (and probably Karla, too!).

So, in summary yeah, what Liam said: number of taxa, but it might be
more complex.

Best,
Brian

___________
Brian O'Meara, http://brianomeara.info, especially Calendar
<http://brianomeara.info/calendar.html>, CV
<http://brianomeara.info/cv.html>, and Feedback
<http://brianomeara.info/feedback.html>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Thu, Sep 5, 2019 at 10:00 AM Liam Revell  wrote:

> Dear Karla.
>
> In my opinion, it is probably correct to use the number of tips on the
> tree as the sample size for AICc when estimating the Brownian rate: as
> the number of independent pieces of information is n-1, just like with
> an ordinary variance. For other parameters in phylogenetic comparative
> analyses, the effective sample size may be different, however.
>
> All the best, Liam
>
> Liam J. Revell
> Associate Professor, University of Massachusetts Boston
> Profesor Asistente, Universidad Católica de la Ssma Concepción
> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>
> Academic Director UMass Boston Chile Abroad (starting 2019):
> https://www.umb.edu/academics/caps/international/biology_chile
>
> On 9/5/2019 9:49 AM, Karla Shikev wrote:
> > [EXTERNAL SENDER]
> >
> > Thanks so much, Liam! Just one quick follow-up question: what do you
> > suggest should be the sample size for transforming AIC into AICc? the
> > number of tips on the tree?
> >
> > Karla
> >
> > On Thu, Sep 5, 2019 at 10:27 AM Liam Revell  wrote:
> >
> >> Dear Karla.
> >>
> >> You could try & create your own logLik method for the object class
> >> "brownie.lite" as follows:
> >>
> >> ## method
> >> logLik.brownie.lite<-function(object,...){
> >>  lik<-setNames(
> >>  c(object$logL1,object$logL.multiple),
> >>  c("single-rate","multi-rate"))
> >>  attr(lik,"df")<-c(object$k1,object$k2)
> >>  lik
> >> }
> >> ## fit model
> >> fit<-brownie.lite(tree,x)
> >> ## use it
> >> logLik(fit)
> >> AIC(fit)
> >>
> >> All the best, Liam
> >>
> >> Liam J. Revell
> >> Associate Professor, University of Massachusetts Boston
> >> Profesor Asistente, Universidad Católica de la Ssma Concepción
> >> web: http://faculty.umb.edu/liam.revell/,
> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.phytools.org&data=02%7C01%7Cliam.revell%40umb.edu%7C04607945a9f74968c14e08d73207f67f%7Cb97188711ee94425953c1ace1373eb38%7C0%7C0%7C637032882107478464&sdata=ofsem4h4SNk6g6QFUwD%2BJKO3TsTArNfH9%2BAyYDEjCvY%3D&reserved=0
> >>
> >> Academic Director UMass Boston Chile Abroad (starting 2019):
> >> https://www.umb.edu/academics/caps/international/biology_chile
> >>
> >> On 9/5/2019 9:13 AM, Karla Shikev wrote:
> >>> [EXTERNAL SENDER]
> >>>
> >>> Dear all,
> >>>
> >>> I've been trying to use brownie.lite to implement the tutorial
> available
> >>> here (
> >>
> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Ftreethinkers.org%2Ftutorials%2Fmorphological-evolution-in-r%2F&data=02%7C01%7Cliam.revell%40umb.edu%7C04607945a9f74968c14e08d73207f67f%7Cb97188711ee94425953c1ace1373eb38%7C0%7C0%7C637032882107478464&sdata=64k6WMtazzmyn0SLRrx2wEA%2F2wkk3%2B%2F3dBS0HtjlUT8%3D&reserved=0
> )
> >> to
> >>> calculate model-averaged rates of evolution and for model selection (1
> >>> versus 2

Re: [R-sig-phylo] Number of parameters in BM and OU models in OUwie

2019-09-03 Thread Brian O'Meara

Hi, Rafael.

It applies for the simple OU model. Relevant code in OUwie:

  if (model == "OU1"){
np=2
index<-matrix(TRUE,2,k)
index.mat[1,1:k]<-1
index.mat[2,1:k]<-2
if(root.station==TRUE){
  param.count<-np+1
}
if(root.station==FALSE){
  param.count<-np+2
}
bool=root.station
  }

If root stationarity is assumed, there are three parameters: alpha and
sigma.sq (the np=2 cost) and theta (the +1 for param.count); if not
assumed, there is +2 for the param.count for theta at root and theta
elsewhere.  For BM1, np is 1 (sigma.sq) and param.count is 2 (root state,
though the MLE for this does not require optimization as there's an
equation for it), so they should be comparable.

Best,
Brian

_______
Brian O'Meara, http://brianomeara.info, especially Calendar
<http://brianomeara.info/calendar.html>, CV
<http://brianomeara.info/cv.html>, and Feedback
<http://brianomeara.info/feedback.html>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Tue, Aug 27, 2019 at 12:24 PM Rafael S Marcondes 
wrote:

> Hi all,
>
> I'm trying to understand the number of parameters in simple BM and OU
> models in OUwie. $param.count in the fitted results gives 2 for BM and 3
> for OU. I know that the 2 BM params are sigma and the trait value at the
> root, but I'm a little confused about the OU params. I expected there to be
> 4 of them: sigma, alpha, theta and the trait value at the root.
>
> Digging a little deeper, the OUwie documentation explains that under the
> default settings (which I used), the value at the root is dropped from the
> model and drawn from the stationary distribution of the OU process. But the
> documentation only says that explicitly for the more complex OUMV and OUMVA
> models. So I just wanted to make sure that that is also applied to the
> simple OU model.
>
> I'm also wondering about the implications of this for comparing the fits of
> OU and BM, if for one (BM) the root value is estimated and for the other
> (OU) it is not...
>
> Thanks in advance,
>
>
> *--*
> *Rafael Sobral Marcondes*
> PhD Candidate (Systematics, Ecology and Evolution/Ornithology)
>
> Museum of Natural Science <http://www.lsu.edu/mns/>
> Louisiana State University
> 119 Foster Hall
> Baton Rouge, LA 70803, USA
>
> <http://goog_27804804>
> www.rafaelmarcondes.com
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] model averaging for discrete character evolution

2019-08-07 Thread Brian O'Meara

I think this is ok in theory: say you have 10% of the weight on the ER
model, 90% on the ARD model, so your states at the nodes are based 10% on
the probabilities from ER, 90% from ARD. [Worth noting, of course, that
with N tips with one character each, estimating N-1 ancestral states, plus
the rates, plus the model weights, is asking a bit much of the data -- it
can be worth asking if there's another way to answer the biological
question]. The risk of model averaging is if some of the models are truly
terrible: you might get extreme parameter values because there's not enough
info to estimate them well. It won't be apparent with discrete ancestral
states (since there's a finite, reasonable state space), but you could have
an asymmetrical rate near infinity, for example, if you fit an ARD model
with not much data, and infinity times a weight for that model still means
a big number.

Model averaging for parameter estimates between the rates is ok, btw:

symmetrical model:

rate_0_to_1 = s
rate_1_to_0 = s

asymmetrical model:

rate_0_to_1 = a
rate_1_to_0 = b

weighted params:

rate_0_to_1 = s * weight_sym + a * weight_asym
rate_1_to_0 = s * weight_sym + b * weight_asym

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville

On Wed, Aug 7, 2019 at 11:33 AM Jacob Berv 
wrote:

> Dear R-sig phylo
>
> I’ve been running a few discrete character Mk type models using phytool's
> SIMMAP — and I had the idea that it might be useful to try model averaging
> across posterior probabilities for node states.
>
> Might this make sense to do, to avoid problems associated with model
> ranking via AIC? Ie, average the node state probabilities based on AIC
> weights? Is there some fundamental problem with this?
>
> I could imagine generating some code to generate all possible transition
> models given a set of N states, and then rather than ranking with AIC,
> model averaging for parameter estimates (though now that I think about it,
> not sure how one might reasonably average a symmetrical rate and an
> asymmetrical rate).
>
> Best,
> Jake
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] patristic distance doubts and meanings

2019-02-06 Thread Brian O'Meara

Hi, Bruno. It's also possible that there are other ways of getting at the
biological question you're going after, since others also examine how
phylogenetic relatedness affects overlap (my apologies for the redundancy
if you know about this already, just a teachable moment, as they say). An
early, key review of this is Webb et al., (2002):
https://doi.org/10.1146/annurev.ecolsys.33.010802.150448 and a more recent
review is Cavender-Bares et al. (2009):
https://doi.org/10./j.1461-0248.2009.01314.x. The literature blossoms
from there, but those should give a feel for the approaches.

Here's some packages used for things like overlap in species in areas
[stolen from the task view
<https://cloud.r-project.org/web/views/Phylogenetics.html>]:

*Community/Microbial Ecology *: picante
<https://cloud.r-project.org/web/packages/picante/index.html>, vegan
<https://cloud.r-project.org/web/packages/vegan/index.html>, SYNCSA
<https://cloud.r-project.org/web/packages/SYNCSA/index.html>, phylotools
<https://cloud.r-project.org/web/packages/phylotools/index.html>, PCPS
<https://cloud.r-project.org/web/packages/PCPS/index.html>, caper
<https://cloud.r-project.org/web/packages/caper/index.html>, DAMOCLES
<https://cloud.r-project.org/web/packages/DAMOCLES/index.html>, and cati
<https://cloud.r-project.org/web/packages/cati/index.html> integrate
several tools for using phylogenetics with community ecology. HMPTrees
<https://cloud.r-project.org/web/packages/HMPTrees/index.html> and GUniFrac
<https://cloud.r-project.org/web/packages/GUniFrac/index.html> provide
tools for comparing microbial communities. betapart
<https://cloud.r-project.org/web/packages/betapart/index.html> allows
computing pair-wise dissimilarities (distance matrices) and multiple-site
dissimilarities, separating the turnover and nestedness-resultant
components of taxonomic (incidence and abundance based), functional and
phylogenetic beta diversity. adiv
<https://cloud.r-project.org/web/packages/adiv/index.html> can calculate
various indices of biodiversity including species, functional and
phylogenetic diversity, as well as alpha, beta, and gamma diversities.
entropart <https://cloud.r-project.org/web/packages/entropart/index.html> can
measure and partition diversity based on Tsallis entropy as well as
calculate alpha, beta, and gamma diversities. ecospat
<https://cloud.r-project.org/web/packages/ecospat/index.html> can also
examine phylogenetic diversity. metacoder
<https://cloud.r-project.org/web/packages/metacoder/index.html> is an R
package for handling large taxonomic data sets, like those generated from
modern high-throughput sequencing, like metabarcoding.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville

On Wed, Feb 6, 2019 at 3:07 PM Eliot Miller  wrote:

> The amount of time two taxa have been evolving independently from one
> another is, barring some complex introgression, independent of other
> species in the tree. Pruning the tree from however many species to 700
> won't affect the distances between those 700. After pruning, use
> cophenetic() to nearly instantly calculate the relevant distances between
> taxa. All the values you need are in the matrix. Access the evolutionary
> distance between any two taxa by pulling the relevant element of the
> resulting cophenetic matrix, e.g. copheneticMatrix["sp1", "sp2"]. If you
> are interested in the time since two taxa shared a common ancestor,
> assuming the tree is ultrametric and in units of time, that value is the
> distance between those two taxa divided by 2.
>
> Cheers,
> Eliot
>
> On Mon, Feb 4, 2019 at 5:30 PM Bruno Garcia Luize 
> wrote:
>
> > Thank you Liam,
> >
> > For sure it is a lot of computational time. I'm only interested in a
> subset
> > of species, lets say close to 700 species in a regional species pool. I
> did
> > the computations using fastDist and it take almost 7 hs to complete in a
> > windows based 16 Gb RAM. The function was realy useful for my propose.
> > Another doubt that came is if I prune the bigger tree with my species
> list
> > and apply the ape::cophenetic the distances will remain the same or the
> > prunning will change the tree edges lengths and consequently the
> patristic
> > distances. But, by the way I see that fastDist did the work I want.
> >
> > Best regards,
>

Re: [R-sig-phylo] Mesquite->ape node numbers and/or parsimony ancestral reconstruction

2019-01-31 Thread Brian O'Meara

It's possible that Mesquite (and within R) that discrete traits may be
interpreted by the software as continuous, depending on how they're stored
-- so it could do a reconstruction of a character with tip states of 0, 1,
and 2 and get ancestral values of 0.1, 1.75, etc. using squared change
parsimony or the equivalent. This isn't something one should do (for one
thing, if the trait is unordered, the ancestor of a clade with taxa in
state 0 and 2 having a state of 1 doesn't make sense) but it is possible
someone could run squared change parsimony on discrete traits if the
software mistakenly treats them as continuous.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Thu, Jan 31, 2019 at 2:53 PM Liam Revell  wrote:

> Roland.
>
> You have not done 'squared change parsimony' in that case, but just
> parsimony. The problem (among various) is that for a discrete character
> there is not always a single most parsimonious set of ancestral states
> and there are often many equally parsimonious reconstructions. That's
> why some scientists have developed criteria (such as 'ACCTRAN' and
> 'DELTRAN') for consistently selecting one or two among the usually many
> equally parsimonious reconstructions. One should not be deceived by
> this, however. Just because the ACCTRAN and DELTRAN criteria exist
> doesn't mean that these MP solutions for ancestral states are any more
> probable than any other equally parsimonious solution (and under some
> circumstances all parsimonious solutions can be misleading).
>
> I believe parsimony ancestral state reconstruction is implemented in the
> phangorn function ancestral.pars.
>
> All the best, Liam
>
> Liam J. Revell
> Associate Professor, University of Massachusetts Boston
> Profesor Asistente, Universidad Católica de la Ssma Concepción
> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>
> On 1/31/2019 2:18 PM, Roland Sookias wrote:
> > Dear Liam,
> >
> > Thanks very much indeed - a really nice blog post. However, the issue is
> > that I am using discrete states, which are not ordered. (and to make
> things
> > more exciting there are also polymorphisms and missing data - although
> they
> > can be worked around I guess)
> >
> > Is there any way to get ML to yield the same as SCP would (in Mesquite at
> > least) for this kind of data?
> >
> > Best wishes
> >
> > Roland
> >
> > On Thu, Jan 31, 2019 at 8:03 PM Liam Revell  wrote:
> >
> >> Dear Roland.
> >>
> >> I didn't really solve your problem of reading a Mesquite result into R;
> >> however I just posted about the relationship between unweighted &
> >> weighted square-change-parsimony & ML ancestral character estimation on
> >> my blog:
> >> http://blog.phytools.org/2019/01/unweighted-weighted-square-change.html
> .
> >> (In summary, you should be able to do what you want within R.)
> >>
> >> All the best, Liam
> >>
> >> Liam J. Revell
> >> Associate Professor, University of Massachusetts Boston
> >> Profesor Asistente, Universidad Católica de la Ssma Concepción
> >> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
> >>
> >> On 1/31/2019 12:48 PM, Roland Sookias wrote:
> >>> Dear all,
> >>>
> >>> I want to get squared change parsimony ancestral state reconstructions
> >> for
> >>> a matrix into R.
> >>>
> >>> I have tried the asr_max_parsimony() function from castor, but the
> >> results
> >>> are not the same as in Mesquite for example. To take an example, a
> >>> two-taxon clade with a unique state for these taxa optimized as having
> a
> >>> different state (shared by near relatives) at the ancestral node. This
> >>> doesn't happen in Mesquite and is *not* what I want.
> >>>
> >>> If there were some way to get the results from Mesquite into R that
> would
> >>> also be OK, but I cannot figure out the node numbering system in
> >> Mesquite,
> >>> to match them up with nodes as numbered by ape.
> >>>
> >>> Any help would be *very* greatly appreci

Re: [R-sig-phylo] OU for non-ultrametric trees

2018-12-06 Thread Brian O'Meara

You can run in OUwie if you set the root.age (we should probably make that
automatic, but haven't yet). See
https://www.rdocumentation.org/packages/OUwie/versions/1.50/topics/OUwie. I
believe the past issue with non-ultrametric trees comes from functions that
handles the creation of the VCV matrix inappropriately by assuming an
ultrametric tree. Using non-ultrametric trees may be possible in other OU
packages (ouch, slouch, phylolm, MVmorph, etc.) but I don't know for sure.

It's also worth checking for why your tree is non-ultrametric. If it has a
mixture of extinct and modern taxa, that's cool. If it's because you think
that the branch lengths reflect the amount of change that'd drive the
continuous character change (i.e., they mean number of generations for taxa
with different generation times, so they don't have the same root to tip
length) that's ok, too. If it's just that those are raw branch lengths
(parsimony, likelihood) without making the tree a chronogram, that's bad --
the main thing OU/BM models are doing is effectively stretching and
smooshing branch lengths (for OU, also adjusting the expected means) so if
the starting branch lengths don't mean something that's relevant, the
parameters you get from stretching also don't mean anything. [I imagine
you're doing it well, just a teachable moment for new students on the list].

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville

On Thu, Dec 6, 2018 at 7:35 AM Danielle Miller 
wrote:

> Hi all,
>
> I have a non-rooted non-ultrametric tree and a corresponding set of a
> single trait values for each one of the tips.
> I’m interested whether it follows a BM model or an OU.
>
> Reading previous comments in the archive, I understood that running an OU
> process is inadequate in a case of non-ultrametric trees, however I did not
> fully understand why.
> As I use ouch package, what are the consequences of running an unrooted
> tree? And a non-ultrametric one?
>
> What would be the best way assessing this issue? Diversitree should be
> more reliable?
>
> I will appreciate any explanation and help,
> Danielle Miller
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-25 Thread Brian O'Meara

Hi, Agus. The variance-covariance matrix comes from the tree and the
evolutionary model, not the data. Each entry between taxa A and B in the
VCV is how much covariance I should expect between data for taxa A and B
simulated up that tree using that model. I don't want to be *that guy*, but
O'Meara et al. (2006)
https://onlinelibrary.wiley.com/doi/10./j.0014-3820.2006.tb01171.x has
a fairly accessible explanation of this (largely b/c I was just learning
about VCVs when working on that paper). Hansen and Martins (1996)
https://onlinelibrary.wiley.com/doi/10./j.1558-5646.1996.tb03914.x have
a much more detailed description of how you get these covariance matrices
from microevolutionary processes.

Typically, ape::vcv() is how you get a variance covariance for a phylogeny,
assuming Brownian motion and no measurement error. It just basically takes
the history two taxa share to create the covariance (or variance, if the
two taxa are the same taxon). A different approach, which seems to be what
you're doing, would be to simulate up a tree many times, and then for each
pair of taxa (including the pair of a taxon with itself, the diagonal of
the VCV), calculate the covariance. These approaches should get the same
results, though the shared history on the tree approach is faster.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville

On Sat, Aug 25, 2018 at 1:16 PM Agus Camacho  wrote:

> Dear list users,
>
> I am trying to make an easy R demonstration to teach the
> variance-covariance matrix to students. However, After consulting the
> internet and books, I found myself facing three difficulties to understand
> the math and code behind this important matrix. As this list is answered by
> several authors of books of phylocomp methods, thought this might make an
> useful general discussion.
>
> Here we go,
>
> 1) I dont know how to generate a phyloVCV matrix in R (Liams kindly
> described some options here
> <
> http://blog.phytools.org/2013/12/three-different-ways-to-calculate-among.html
> >
> but I cannot tell for sure what is X made of. It would seem a dataframe of
> some variables measured across species. But then, I get errors when I
> write:
>
>  tree <- pbtree(n = 10, scale = 1)
>  tree$tip.label <- sprintf("sp%s",seq(1:n))
>  x <- fastBM(tree)
> y <- fastBM(tree)
>   X=data.frame(x,y)
>  rownames(X)=tree$tip.label
>  ## Revell (2009)
>  A<-matrix(1,nrow(X),1)%*%apply(X,2,fastAnc,tree=tree)[1,]
>  V1<-t(X-A)%*%solve(vcv(tree))%*%(X-A)/(nrow(X)-1)
>## Butler et al. (2000)
>Z<-solve(t(chol(vcv(tree%*%(X-A)
>  V2<-t(Z)%*%Z/(nrow(X)-1)
>
>## pics
>Y<-apply(X,2,pic,phy=tree)
>  V3<-t(Y)%*%Y/nrow(Y)
>
> 2) The phyloVCV matrix has n x n coordinates defined by the n species, and
> it represents covariances among observations made across the n species,
> right?. Still, I do no know whether these covariances are calculated over
> a) X vs Y values for each pair of species coordinates in the matrix, across
> the n species, or b) directly over the vector of n residuals of Y, after
> correlating Y vs X, across all pairs of species coordinates. I think it may
> be a) because, by definition, variance cannot be calculated for a single
> value. I am not sure though, since it seems the whole point of PGLS is to
> control phylosignal within the residuals of a regression procedure, prior
> to actually making it.
>
> 3) If I create two perfeclty correlated variables with independent
> observations and calculate a covariance or correlation matrix for them, I
> do not get a diagonal matrix, with zeros at the off diagonals (ex. here
> <
> https://www.dropbox.com/s/y8g3tkzk509pz58/vcvexamplewithrandomvariables.xlsx?dl=0
> >),
> why expect then a diagonal matrix for the case of independence among the
> observations?
>
> Thanks in advance and sorry if I missed anything obvious here!
> Agus
> Dr. Agustín Camacho Guerrero. Universidade de São Paulo.
> http://www.agustincamacho.com
> Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de
> Fisiologia,
> Instituto de Biociências, USP.Rua do Matão, trav. 14, nº 321, Cidade
> Universitária,
> São Paulo - SP, CEP: 05508-090, Brasil.
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-s

Re: [R-sig-phylo] keeping states discrete when plotting HiSSE results

2018-07-09 Thread Brian O'Meara

Hi, Bonnie. It's not treating binary characters as continuous, but rather
showing the uncertainty in what the state is: an intermediate color means
it's not really sure. There's substantial uncertainty with ancestral
states, and our goal was to communicate that rather than the point estimate
of whatever state is best at a time lest people over interpret it [which
reminds me to finish testing the justifiably controversial OUwie ancestral
state recon that was asked about before on R-sig-phylo]. However, the way
plot.hisse.states works is basically plotting a tree of rates and then a
narrower tree of states on top of that, so I think you could just plot a
third tree of states on top of all that. You can get the internal
reconstructions and terminal states as binary characters by doing

states.tips <- hisse:::ConvertManyToBinaryState(hisse.results, "tip.mat")
states.internal <- hisse:::ConvertManyToBinaryState(hisse.results,
"node.mat")

and then use phytools to plot a tree with discrete traits using these on
top of the hisse plot. Here's the relevant code we use for plotting: you'll
want to change contMapGivenAnc to a function to give discrete change
mappings.

state.tree <- contMapGivenAnc(tree=tree.to.plot, x=states.tips, plot=FALSE,
anc.states=states.internal, lims=state.lims, ...)
state.colors <- colorRampPalette(state.colors,
space="Lab")(length(rate.tree$cols))
state.tree$cols[]<- state.colors
plot(state.tree, outline=FALSE, lwd=edge.width.state, legend=FALSE,
type=type, fsize=fsize, ...)

Hope this helps,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)

On Tue, Jul 3, 2018 at 9:44 AM Bonnie Blaimer 
wrote:

> Hi all,
>
> I am using the HiSSE package and function plot.hisse.states on a list of
> several HiSSE objects to generate a heatmap phylogeny of model-averaged
> results.
>
> I'm wondering if there is a way to plot the character state reconstructions
> as being discrete changes, and if so, could someone provide guidance on
> how? The default seems to treat the binary characters as continuous and
> shows a somewhat gradual transition between the two different colors
> specified for the two states.
>
> Many thanks!
> Bonnie
>
> --
> *Bonnie B Blaimer*
> Assistant Professor
> Department of Entomology & Plant Pathology
> 3204D Gardner Hall, Campus Box 7613
> North Carolina State University
> Raleigh, NC 27695-7613
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] diversification rates with two traits present simultaneously

2018-06-19 Thread Brian O'Meara

Rich FitzJohn has programmed something of what Will suggested in
diversitree: look at details in ?make.musse.multitrait. I'm not sure if
there was ever a paper published on this. We do a similar thing to Will's
idea in http://rspb.royalsocietypublishing.org/content/283/1830/20152304
(with the added complication of varying over combinations of traits, since
converting six binary traits to a 2^6 state musse model would have been a
bit ambitious given the data size [or, frankly, given any possible sampling
from the tree of life]). However, neither approach deals with hidden
sources of rate heterogeneity. Jeremy Beaulieu and Daniel Caetano are
always adding stuff to hisse that deals with hidden rates (see
preprint on a hidden
biogeography model at
https://www.biorxiv.org/content/early/2018/04/17/302729), but I don't think
anything has been peer-reviewed and published yet for this use case (I
think the core hisse model is also in RevBayes, and that may have added
options making a musse version possible). If your tree is big enough,
separating it into clades and seeing if the pattern (i.e., when converting
00 -> 1, 01 -> 2, 10 -> 3, 11 -> 4 for musse, and you find that state 3
(aka binary states 10) has highest diversification rate) is consistent
across the clades could be a good way of seeing if there are some kinds of
heterogeneity that matter.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)

On Tue, Jun 19, 2018 at 2:03 AM Gilles Benjamin Leduc  wrote:

> Hello,
>
> Personally, I would advise you not to use a tree, for your 2 traits times
> 2 (+/-) makes 4 combinations, meaning a tree that is dichotomic cannot
> reveal anything!  Can you quantify your "diversification" without "rate"?
> How do you obtain your  "diversification rates"? If it is something you can
> count, you may try an old fashion Khi-square test… you would then know if
> there is a correlation…
>
> Have a nice day,
>
> Benjamin
>
>
> Le Lundi 18 Juin 2018 23:46 GMT, Elizabeth Christina Miller <
> ecmil...@email.arizona.edu> a écrit:
>
> > Hello,
> >
> > I am wondering what comparative method(s) is appropriate for testing if
> > diversification rates are highest when two traits are present together,
> > rather than one alone? Specifically, if I have two binary traits, let's
> say
> > freshwater/marine and temperate/tropical, what is the best way to test if
> > diversification rates are highest in tropical+freshwater groups, as
> opposed
> > to tropical+marine or temperate+freshwater? I think MuSSE can do this,
> but
> > my tree is large so I want to avoid issues associated with rate
> > heterogeneity. Are there any alternatives?
> >
> >
> >
> > --
> > Elizabeth Miller
> > PhD Candidate: Wiens Lab
> > Ecology and Evolutionary Biology
> > University of Arizona
> >
> >   [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] estimate ancestral state with OUwie models

2018-06-11 Thread Brian O'Meara

Prompted by Bruno's email, and similar requests by some students at the
Arnold & Felsenstein Evolutionary Quantitative Genetics course, we've
started adding ancestral state reconstruction to OUwie. It still needs
debugging and testing, but should be ready fairly soon. I do believe bayou
does ancestral state estimation, too, but I don't think all the models you
want will be in the set. Could be adequate to answering the biological
question, though.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)



On Mon, Jun 11, 2018 at 6:01 PM David Bapst  wrote:

> Just to follow off what Lucas said, but please note you cannot rescale
> branches of a phylogeny using an OU model when the tree is
> non-ultrametric (such as when it contains extinct, fossil taxa as
> tips). Slater (2014, MEE) discusses this more in a brief correction to
> Slater (2013).
>
> I don't know if anyone in this conversation has a non-ultrametric
> tree, but I wanted to make that clear for anyone who stumbles on this
> thread n the future using a google search.
> -Dave
>
>
>
> On Sun, Jun 10, 2018 at 12:25 PM, Lucas Jardim 
> wrote:
> > Hi Bruno,
> >
> > You can transform the branches of your phylogeny using the estimated
> > parameters of OU models. Then, if those models describe the observed data
> > adequatly, the transformed tree should model the observed data as a
> > Brownian motion model. So you can use an ancestral state reconstruction
> > based on Brownian motion model. However, I do not know if that is the
> best
> > approach as optimum values would not be included into the reconstruction
> > process.
> >
> > Best,
> > --
> > Lucas Jardim
> > Doutor em Ecologia e Evolução
> > Bolsista do INCT-EECBio (Ecologia, Evolução e Conservação da
> > Biodiversidade)
> > Instituto de Ciências Biológicas
> > Laboratório de Ecologia Teórica e Síntese
> > Universidade Federal de Goiás
> > http://dinizfilho.wix.com/dinizfilholab
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>
>
>
> --
> David W. Bapst, PhD
> Asst Research Professor, Geology & Geophysics, Texas A & M University
> https://github.com/dwbapst/paleotree
> Google Calendar: https://goo.gl/EpiM4J
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Problem with negative ages in OUwie.boot?

2018-05-02 Thread Brian O'Meara

On Wed, May 2, 2018 at 2:53 PM, David Bapst  wrote:

> Given that your tree appears to be non-ultrametric enough to cause
> branching.times to throw some nonsensical node ages, if it is supposed
> to be ultrametric. I recommend checking it carefully to figure out why
> the tips seem to not quite be at the same distance from the root.
>
>
 Sometimes this happens with tree import from a file -- it could be a
newick tree with branch lengths precise to the hundredths but a lot of the
R  ultrametric tests by default use higher precision (1e-08, iirc).

Best,
Brian

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] grafting chronograms onto backbone phylogeny

2018-04-26 Thread Brian O'Meara

Look at congruify.phylo in geiger (and the associated paper: Eastman JM, LJ
Harmon, and DC Tank. 2013. Congruification: support for time scaling large
phylogenetic trees. Methods in Ecology and Evolution).

You could also look into SDM in ape (converting trees [not data] to
distance matrices, first): relevant paper Criscuolo, A., Berry, V.,
Douzery, E. J. P. , and Gascuel, O. (2006) SDM: A fast distance-based
approach for (super)tree building in phylogenomics. Systematic Biology, 55,
740–755.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)

On Thu, Apr 26, 2018 at 5:07 PM, Chris Law  wrote:

> Hi all,
>
> I am trying to graft 3 chronograms onto a family level backbone. Does
> anybody have any suggestions on what is the best way to do this? Or is
> going through the tree files in textwrangler the only way to do this.
>
> Thanks!
>
>
> *Chris Law **|*
> * *PhD Student
>  *|*
> * *University of California, Santa Cruz
> 
>
>
> Coastal Biology Building
> 
> 130 McAllister Way
> Santa Cruz, CA 95060
> cj...@ucsc.edu *|* research.pbsci.ucsc.edu/eeb/cjlaw/
> Small Mammal Research in the Forest Internship
> <http://research.pbsci.ucsc.edu/envs/smurf>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] quantitative state dependent diversification of another quantitative trait

2018-04-04 Thread Brian O'Meara

I think you want SLOUCH: Hansen et al. 2008:
https://onlinelibrary.wiley.com/doi/pdf/10./j.1558-5646.2008.00412.x.
One trait evolves under Brownian motion, another trait follows it.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Apr 4, 2018 at 2:43 PM, Zach Culumber  wrote:

> Hi everyone,
>
> We are interested in examining whether shifts in one continuous trait have
> preceded shifts in another continuous trait through evolutionary time for a
> phylogeny of ~80 species.  To do this we were thinking of calculating the
> change in the value of each trait for each tip to its most recent ancestor
> (i.e., node) and then for each node to its most recent ancestral node.  We
> have all the tip values and conducted ASR to get the node values.  However,
> we are unaware of any functions or packages that would automate calculation
> of all these distances for teach tip and node to its most recent ancestor.
> Does anyone have any suggestions of functions or packages that might
> possibly do this?  Alternatively, are there better existing options for
> exploring this question? I.e., something similar to QuaSSE but for analysis
> between two quantitative traits rather than traits and species
> diversification?  Something like Bayes Traits or PGLS would allow tests of
> correlated evolution, but not the order of evolutionary shifts.
>
> Thank you for any insights!
>
> Zach
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Interpretation of standard errors of parameter estimates in OUwie models

2018-04-04 Thread Brian O'Meara

Apparently the plot did not come through correctly. I'm attaching a PDF of
it.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Apr 4, 2018 at 4:24 PM, Brian O'Meara 
wrote:

> Hi, Rafael et al.
>
> ?OUwie has some information on the standard errors:
>
> The Hessian matrix is used as a means to estimate the approximate standard
>> errors of the model parameters and to assess whether they are the maximum
>> likelihood estimates. The variance-covariance matrix of the estimated
>> values of alpha and sigma^2 are computed as the inverse of the Hessian
>> matrix and the standard errors are the square roots of the diagonals of
>> this matrix. The Hessian is a matrix of second-order derivatives and is
>> approximated in the R package numDeriv. So, if changes in the value of a
>> parameter results in sharp changes in the slope around the maximum of the
>> log-likelihood function, the second-order derivative will be large, the
>> standard error will be small, and the parameter estimate is considered
>> stable. On the other hand, if the second-order derivative is nearly zero,
>> then the change in the slope around the maximum is also nearly zero,
>> indicating that the parameter value can be moved in any direction without
>> greatly affecting the log-likelihood. In such situations, the standard
>> error of the parameter will be large.
>>
>> For models that allow alpha and sigma^2 to vary (i.e., OUMV, OUMA, and
>> OUMVA), the complexity of the model can often times be greater than the
>> information that is contained within the data. As a result one or many
>> parameters are poorly estimated, which can cause the function to return a
>> log-likelihood that is suboptimal. This has great potential for poor model
>> choice and incorrect biological interpretations. An eigendecomposition of
>> the Hessian can provide an indication of whether the search returned the
>> maximum likelihood estimates. If all the eigenvalues of the Hessian are
>> positive, then the Hessian is positive definite, and all parameter
>> estimates are considered reliable. However, if there are both positive and
>> negative eigenvalues, then the objective function is at a saddlepoint and
>> one or several parameters cannot be estimated adequately. One solution is
>> to just fit a simpler model. Another is to actually identify the offending
>> parameters. This can be done through the examination of the eigenvectors.
>> The row order corresponds to the entries in index.matrix, the columns
>> correspond to the order of values in eigval, and the larger the value of
>> the row entry the greater the association between the corresponding
>> parameter and the eigenvalue. Thus, the largest values in the columns
>> associated with negative eigenvalues are the parameters that are causing
>> the objective function to be at a saddlepoint.
>>
>
>  You can get better (by which I mean more accurate) estimates by
> parametric bootstrapping (simulation).
>
> Doing a quick summary of your OUM results (code at end):
>
> Trait: contr
> nonfor: 0.12 (-0.12, 0.36)
> inter: 0.12 (-0.12, 0.36)
> fores: 0.13 (-0.13, 0.39)
> Phylogenetic half life: 0.06
> Tree height: 28.34
>
> Trait: cshd
> nonfor: 0.24 (-0.23, 0.72)
> inter: 0.33 (-0.32, 0.98)
> fores: 0.2 (-0.19, 0.6)
> Phylogenetic half life: 6.35
> Tree height: 28.34
>
> Trait: dors
> nonfor: 0.04 (-0.04, 0.12)
> inter: 0.15 (-0.14, 0.44)
> fores: 0 (0, 0.01)
> Phylogenetic half life: 3.68
> Tree height: 28.34
>
> Trait: vent
> nonfor: 0.05 (-0.05, 0.14)
> inter: 0.32 (-0.3, 0.93)
> fores: -0.03 (0.03, -0.08)
> Phylogenetic half life: 5.43
> Tree height: 28.34
>
> It does seem that you don't have a lot of ability to tell optima apart.
> The alpha value is pretty low, though not tremendously (the half life is a
> good way to look at that). But the raw data also suggest it's not three
> very clear clumps (beeswarm plot, regimes on the x axis, trait 1 value on
> the y). It's not phylogenetically corrected, but still a good gut check:
>
>
>
> I agree that for your data, it may b

Re: [R-sig-phylo] Interpretation of standard errors of parameter estimates in OUwie models

2018-04-04 Thread Brian O'Meara

oint.estimate+1.96*se),2)
cat(paste0("\n",regimes[regime.index], ": ", result.vector[1], "
(", result.vector[2], ", ", result.vector[3], ")"))
if (trait.index==1) {
observations[[regime.index]] <-
bla$data[which(bla$data[,1]==regime.index),2]
}
}
cat(paste0("\nPhylogenetic half life: ", round(log(2)/data["alpha",
traits[trait.index]], 2)))
    cat(paste0("\nTree height: ",
round(max(ape::branching.times(bla$phy)),2)))
}

points.to.plot <- bla$data[,1:2]
colnames(points.to.plot) <- c("regime", "trait1")
beeswarm(trait1~regime, data=points.to.plot, col="black", pch=20)




___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Apr 4, 2018 at 4:07 PM, Jacob Berv 
wrote:

> Even if those models may fit the data better, they may not necessarily
> help Rafael determine whether or not the parameter estimates of interest
> are different across regimes (though perhaps BMS might be informative).
>
> Rafael, maybe you could try fixing the ancestral regimes to match most
> likely states for each node from your SIMMAP summary? I wonder if that
> might decrease your ‘uncertainty’ in parameter estimates.
>
> I don’t have a great answer for your main question though, which is a good
> one.
>
> Jake
>
> > On Apr 4, 2018, at 8:59 PM, William Gearty  wrote:
> >
> > Hi Rafael,
> >
> > Have you tried running the BM1, BMS, or OU1 models? If so, how do the
> AICc
> > values for those compare to those of the OUM model? If not, you should
> make
> > sure you run those.
> > If you find that the these models fit your data better, this could
> indicate
> > that there isn't a large different across regimes and an OUM model isn't
> > necessarily suitable.
> >
> > Best,
> > Will
> >
> > On Wed, Apr 4, 2018 at 12:30 PM, Rafael S Marcondes <
> raf.marcon...@gmail.com
> >> wrote:
> >
> >> Dear all,
> >>
> >> I'm writing (again!) to ask for help interpreting standard errors of
> >> parameter estimates in OUwie models.
> >>
> >> I'm using OUwie to examine how the evolution of bird plumage color
> varies
> >> across habitat types (my selective regimes) in a tree of 229 tips. I was
> >> hoping to be able to make inferences based on OUMV and OUMVA models,
> but I
> >> was getting nonsensical theta estimates from those. So I've basically
> given
> >> up on them for now.
> >>
> >> But even looking at theta estimates from OUM models, I'm getting really
> >> large standard errors, often overlapping the estimates from other
> selective
> >> regimes. So I was wondering what that means exactly. How are these erros
> >> calculated? How much do high errors it limit the biological inferences I
> >> can make? I'm more interested in the relative thetas across regimes
> than on
> >> the exact values (testing the prediction that birds in darker habitats
> tend
> >> to adapt to darker plumages than birds in more illuminated habitats).
> >>
> >> I have attached a table averaging parameter estimates and errors from
> >> models fitted across a posterior distribution of 100 simmaps for four
> >> traits; and one exemplar fitted model from one trait in one of those
> >> simmaps.
> >>
> >> Thanks a lot for any help,
> >>
> >>
> >> *--*
> >> *Rafael Sobral Marcondes*
> >> PhD Candidate (Systematics, Ecology and Evolution/Ornithology)
> >>
> >> Museum of Natural Science <http://sites01.lsu.edu/wp/mns/>
> >> Louisiana State University
> >> 119 Foster Hall
> >> Baton Rouge, LA 70803, USA
> >>
> >> Twitter: @brown_birds <https://twitter.com/brown_birds>
> >>
> >>
> >> ___
> >> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >> Searchable archive at http://www.ma

Re: [R-sig-phylo] correlation of binary and multistate

2018-03-21 Thread Brian O'Meara

You can do this with corHMM by making an appropriate model with rayDISC.
For example, if character A has states 0,1 and character B has states
0,1,2, you then recode this into a single multistate character:

00 -> 0
01 -> 1
02 -> 2
10 -> 3
11 -> 4
12 -> 5

and then create a rate matrix that matches your hypotheses (i.e., force the
00->10 rate [the 0->3 rate in the new coding] to be the same as the 01->11
and 02->12 rates (so that the state of the second character doesn't affect
the rate of the first character) or and compare (or model average) with a
model where they are free to vary).

It wouldn't be hard to make a looping thing to do this automatically
(Jeremy Beaulieu has for two and three binary trait correlation in the same
package with corDISC but not for multistate yet) but as far as I know no
one has (pull requests welcome!).

There may also be this functionality in phytools: I remember some Pagel
correlation work was added, but I don't know about its limitations with
multistate.

Best,
Brian

PS: Of course, with all this stuff, I now worry about the Maddison and
FitzJohn issue: https://academic.oup.com/sysbio/article/64/1/127/2847997.

___________
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)

On Wed, Mar 21, 2018 at 3:19 PM, John Denton  wrote:

> Hi all,
>
> Does anybody know of a package that does character correlations between
> binary and multistate characters (and even better with the possibility of
> missing data)?
>
> Thanks!
>
> ~John
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] How to test stratification of sampling across tree?

2017-03-15 Thread Brian O'Meara

One thing you could do is take all nonoverlapping pairs of taxa
(Felsenstein's other technique in the contrasts paper): that is, for a tree
(A,(B,(C,(D,E, you can look at D-E and B-C, *or* D-E and A-B, *or* D-E
and A-C, *or* A-B and C-D, etc. (so, still leaving out one taxon each time,
but only using each edge once) and then compare the states for pair of
taxa. If state 0 has freq p, and state 1 has freq q, you should see p^2 0-0
pairs, 2pq 0-1 pairs, and q^2 1-1 pairs with truly random sampling; if it's
maximally stratified, you should see only two of these pairs (i.e., if 1 is
less common, you should see only 0-1 and 0-0 pairs). You could rig up a
test statistic (prob comparing two multinomial models) from this.

I have some code that purportedly gets all independent such pairs in R at
https://r-forge.r-project.org/scm/viewvc.php/pkg/R/independentTaxa.R?view=markup&revision=366&root=omearalab.
However, I haven't rigorously tested it (this was in the dark ages before
we all used testthat, too), so feel free to take, hack, republish, etc.,
but test first [anyone else should feel free to take this, too -- it could
naturally go into phytools, ape, or phangorn, for example, assuming it
actually works].

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Tue, Mar 14, 2017 at 1:37 PM, Ross Mounce  wrote:

> So I tried a 12-taxon fully pectinate tree with Blomberg's K as calculated
> by picante::Kcalc()
>
> library(picante)
> library(ape)
> aa<-"(A,(B,(C,(D,(E,(F,(G,(H,(I,(J,(K,L)));"
> t1<-read.tree(text=aa)
> t4 <- compute.brlen(t1,method="Grafen",1)
> tipvals <- c(0,1,0,1,0,1,0,1,0,1,0,1)
> Kcalc(tipvals,t4)
>
> K = 0.3487135
>
>
> There are possible 924 permutations of 6 'painted' tips from 0011
> to 1100 (each of those two extreme distributions gives the maximum
> value of K for this particular tree & number [6] of painted tips: 2.37703)
>
> There are 336 discrete integer value results of K for this tree, and
> painting 6 tips:
>
> Min. 1st Qu.  MedianMean 3rd Qu.Max.
>  0.3438  0.3676  0.4489  0.5332  0.5945  2.3770
>
>
> A histogram of all 924 possible values of K (for 6 painted tips) shows that
> Blomberg's K in terms of value distribution is extremely positively skewed
> (it has a skewness of 2.671139), which is great if you're looking to test
> phylogenetic signal without false positives, but not so great if you're
> trying to assess "evenness" at the other tail of the distribution. In the
> case of this exact tree, because I've enumerated all possible permutations
> of 6 painted tips, I can calculate the 5% significance threshold as values
> of K that are 0.3480158 or less.  It seems some normalisation procedure
> might be needed before safely using Blomberg's K to assess the significance
> of evenness, if one is not going to exhaustively examine/enumerate all
> possible values (which I can't do for a 1000+ tip tree).
>
> Does that make sense?
>
> Certainly interesting...
>
>
> Ross
>
>
>
>
>
>
>
>
> On 14 March 2017 at 14:53, Ross Mounce  wrote:
>
> > Thanks Dave,
> >
> > I'll try Blomberg's K with small simulated fully-bifurcating trees of
> > simple shape (e.g. fully pectinate), where I can easily paint the tips
> > myself in what I believe to be a "maximally stratified manner" e.g.
> > 010101010 to see if Blomberg's K does actually reach minimum (i.e.
> 0.0
> > ?) for such a distribution. If it does, great! This is the measure I
> need.
> >
> > I still wonder though, for a complex tree structure in terms of
> > balance/shape somewhere intermediate between fully balanced and fully
> > pectinate; how does one arrive empirically at _the_ most optimal
> > stratified/even sampling ('painting') of tips if say only 25% of tips
> > are/can be 'painted'. I guess a lot depends on how one defines what 'even
> > sampling' on a phylogeny actually is, does it include branch lengths et
> > cetera...
> >
> > I'll give it a try

Re: [R-sig-phylo] chronoMPL/APE

2017-01-17 Thread Brian O'Meara

Just to add to Juan's comment, for most analyses where you need branch
lengths, you need them to be fairly accurate (but not all, so his is useful
advice for those cases). In cases where you do need good branch lengths
(diversification, ancestral state reconstruction, phylogenetic diversity,
rates of evolution, many others), doing chronopl() with constraints or
using congruify.phylo() in geiger using a reference chronogram could be
useful. This may also be a case where one would leave our happy R ecosystem
(or use a system() call to quietly leave) to use BEAST, RevBayes, r8s,
treePL, or similar.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Tue, Jan 17, 2017 at 9:13 AM, Juan Antonio Balbuena 
wrote:

> The "strange" shape of your tree seems to result from negative branch
> lengths.
>
> If your goal is to just obtain an ultrametric tree under no specific
> assumptions, use
>
> compute.brtime (yourtree, method="coalescent", force.positive=TRUE).
>
> Where 'yourtree' must be an object of class phylo.  force.positive=TRUE
> will prevent negative branch lengths to occur in your tree.
>
> I hope this helps.
>
> Juan A. Balbuena
>
>
>
> --
>
> Dr. Juan A. Balbuena
> Cavanilles Institute of Biodiversity and Evolutionary Biology
> University of Valencia
> http://www.uv.es/~balbuena
> P.O. Box 22085
> http://www.uv.es/cophylpaco
> <http://www.uv.es/cavanilles/zoomarin/index.htm>
> 46071 Valencia, Spain
> e-mail: j.a.balbu...@uv.estel. +34 963 543 658
> <+34%20963%2054%2036%2058>fax +34 963 543 733
> <+34%20963%2054%2037%2033>
> 
> *NOTE!* For shipments by EXPRESS COURIER use the following street address:
> C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.
> 
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] A question about phylogenetic signal significance with 1000 trees

2016-12-08 Thread Brian O'Meara

Well, since 1000 are a sample, 100 sampled from those is a somewhat more
manageable sample.

As to using, say, an 80% cutoff is a valid procedure, one question is what
you're doing this for. I'd be tempted to just show or report the
distribution of K, lambda, and p values and talk about what they mean in
terms of your biological question: "We don't know the tree exactly, but
it's clear that for most feasible trees, related things are similar" or
something like that. If it's a question of scaling trees for later
analyses, you can just (using a script, or built in functions) choose the
particular scaling to be appropriate for each tree, and so the distribution
across them doesn't matter in that case.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Thu, Dec 8, 2016 at 11:26 AM, Carturan, Bruno 
wrote:

> Hello R-sig-phylo community,
>
> I am trying to measure the phylogenetic signal of functional traits with
> the Blomberg’s K and Pagel’s λ methods. My problem is that I don’t have one
> phylogenetic tree but 1000 (the trees were created by first combining
> different molecular, morphologic and taxonomic trees, then by conducting a
> MCMC analysis and finally by selecting the 1000 trees – work made by Huang
> and Roy 2015).
>
> Is there a way to proceed to test the significance of the K and λ values
> for each trait with these 1000 trees?
>
> An idea I have is to run the test and obtain a p value for each of the
> 1000 trees, which would give me a distribution of p values, and then
> consider that if 80% (for instance) of the p values are significant, the K
> or λ value is significant. This is computationally very demanding (I have
> 735 species) and I don’t know if this could be considered as a valid
> procedure.
>
> I am looking forward for you answer.
>
> Bruno Carturan
> PhD Candidate
> Complex Environmental Systems Lab
> University of British Columbia
> Okanagan Campus
>
>
> 
> From: Liam J. Revell [liam.rev...@umb.edu]
> Sent: Wednesday, December 07, 2016 6:54 AM
> To: Carturan, Bruno
> Subject: Re: A question about phylogenetic signal significance with 1000
> trees
>
> Bruno.
>
> This is an interesting question. Since this doesn't pertain specifically
> to phytools, perhaps you should pose it to the R-sig-phylo email
> list-serve. You may get a good answer there.
>
> All the best,
>
> Liam J. Revell, Associate Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 12/6/2016 6:16 PM, Carturan, Bruno wrote:
> > Hello Dr. Revell,
> >
> >
> >
> > I am a PhD student from the Complex Environmental Systems Lab, from the
> > University of British Columbia and I would like to ask you question.
> >
> >
> >
> > I am trying to measure the phylogenetic signal of functional traits with
> > the Blomberg’s K and Pagel’s λ methods. My problem is that I don’t have
> > one phylogenetic tree but 1000 (the trees were created by first
> > combining different molecular, morphologic and taxonomic trees, then by
> > conducting a MCMC analysis and finally by selecting the 1000 trees –
> > work made by Huang and Roy 2015).
> >
> >
> >
> > Is there a way to proceed to test the significance of the K and λ values
> > for each trait with these 1000 trees?
> >
> >
> >
> > An idea I have is to run the test and obtain a p value for each of the
> > 1000 trees, which would give me a distribution of p values, and then
> > consider that if 80% (for instance) of the p values are significant, the
> > K or λ value is significant. This is computationally very demanding (I
> > have 735 species) and I don’t know if this could be considered as a
> > valid procedure.
> >
> >
> >
> > I am looking forward for you answer.
> >
> >
> >
> > Best regards,
> >
> >
> > Bruno Carturan
> > PhD Candidate
> > Complex Environmental Systems Lab
> >

Re: [R-sig-phylo] Making ultrametric trees

2016-10-30 Thread Brian O'Meara

Emmanuel's way is the most general and quickest (well, other than just
making up branch lengths entirely, using ape::compute.brlen() -- that's
even faster!). However, the original question was "best and most general"
for making ultrametric trees for phylogenetic comparative analysis. For an
analysis where branch lengths matter, as in many comparative methods
(ancestral state estimation, independent contrasts, diversification
analyses, etc), the *best* method probably involves data. Getting a tree
with branch lengths proportional to time uses calibrations, often from
fossils, and this is non trivial (requires substantial expertise in the
groups, can take some time to run, etc.). A few solutions here:

1) Start with chronograms: TreeBase initially prohibited trees with branch
lengths, so it probably isn't the best place to go. However, OpenTree does
have trees with branch lengths, with metadata on whether the branch lengths
represent time. You can access that using the rotl package and only pull
trees that are chronograms from its tree store. The datelife package (
https://github.com/phylotastic/datelife: note COI that I'm lead author on
that) has a dataset of all OpenTree chronograms stored already, which makes
it even easier to get them, and you can use a function there to pull in an
updated list. It's often better to start with expert-calibrated chronograms
rather than do your own, especially if all you need is a set of trees. Note
that OpenTree has a bit over a hundred chronograms (anyone reading: please
add yours
<https://github.com/OpenTreeOfLife/opentree/wiki/Submitting-phylogenies-to-Open-Tree-of-Life>
-- you'll help advance science [and get more citations]); TimeTree.org has
thousands, but I believe they're not available for download and reuse.
However, you could reach out to the TimeTree authors for a potential
collaboration and perhaps access them that way.

2) Date yourself: take the trees from treebase, find calibrations from
other papers, and then use other programs (Beast, r8s, pathd8, treepl [COI:
coauthor], revbayes, etc.) to convert them into chronograms. You could wrap
these program in an R interface to do this.

3) Stretch: use geiger::congruify() to stretch your treebase trees to other
calibrated trees. This works for a pair of trees at a time; the datelife
package does this for your tree and all the trees in OpenTree with
overlapping taxa (but it's still alpha-level software, so use at your own
risk (and let me know if there are issues)).

If you need ultrametric trees for some quick and dirty examples for using
comparative methods, the above is overkill; if you care about biological
results that are sensitive to branch lengths (and the degree to which
methods are sensitive to branch length errors does vary by method) you
might need to invest in the harder analyses.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Sat, Oct 29, 2016 at 10:09 PM, Shinichi Nakagawa 
wrote:

> Dear Emmanuel and Alejandro
>
> Many thanks for your replies.
>
> Emmanuel, I understand what you said if I use vcv(nonultrametric_tree,
> cor=FALSE) in a phylogenetic comparative method (PCM). But if I use
> vcv(nonultrametric_tree, cor=TRUE), isn't this the same as assuming a
> ultrametric tree? I may be missing something here?
>
> Note that vcv(nonultrametric_tree, cor=TRUE) is different from
> vcv(chronoMPL(nonultrametric_tree), cor = TRUE), which is relates to my
> previous question (which one is preferred?)
>
> Many thanks
>
> Shinichi
>
> On Sun, Oct 30, 2016 at 9:53 AM, Alejandro Gonzlaez Voyer <
> alejandro.gonza...@iecologia.unam.mx> wrote:
>
> > Hi,
> >
> > I would venture in adding that with an additive tree the differences in
> > evol rates among species are determined by the genes used to estimate the
> > tree, given differences in substitution rates among genes. I do think its
> > important to bear in mind.
> >
> > Cheers
> >
> > Alejandro
> >
> > Alejandro Gonzalez Voyer
> > Instituto de Ecologia
> > UNAM
> >
> >
> >
> >
> > > On Oct 29, 2016, at 2:45 PM, Emmanuel Paradis  >
> > wrote:
> > >
> > > Sure they are

Re: [R-sig-phylo] Use of a Phylocom tree for PGLS

2016-09-18 Thread Brian O'Meara

There also are several empirical plant chronograms available for reuse:
check TreeBase, OpenTree, and Dryad. Or you could use something like phlawd
or supersmart (just out in SystBio) to make a phylogeny for your taxa.
Phylomatic is convenient and will have all your taxa placed in some way,
but it'd make your conclusions more robust to just download a tree, make
sure the taxa match yours (look at the taxize package), prune out the taxa
that don't match (treedata() in Geiger), and rerun. If the conclusions are
qualitatively the same, you satisfy the reviewer; if not, you may have
saved yourself from an error. Either way the paper is improved. I haven't
reviewed your paper, but if the central point is using a phylogeny to look
at signal (rather than have this be just one small part of a paper focused
on something else), then doing even more to get a better tree might be
worth it -- the tree is THE thing used in every analysis, may as well get
it right (or at least, less wrong). If the paper is mostly on something
else, though, this might be a good enough solution.

Two notes for disclosure: I've been on papers that make empirical trees, so
I suppose there's a minor COI there (I might get one more citation). I'm
also part of a project, phylotastic, that is working to make getting trees
easier. We're doing a soft release soon, but there's nothing I'd suggest
you try using yet (we know people are going to form a first impression and
then decide to come back based on that first impression -- we're not quite
ready for our debut yet, though it is all open). My part is to make it
easier to get dates on trees: there's an R package for this (datelife, on
github) and a website, but I'm still debugging -- don't use yet for
something that matters, but it should make all this easier in the future.
But it does give me an incentive to say, "get a better tree", so take that
into consideration.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Sat, Sep 17, 2016 at 11:38 PM, Jacob Berv  wrote:

> Personally, I don’t think I’d have a problem with this approach
> (especially given the paper Liam cited) given that you are using a
> phylogeny (a model) to test a hypothesis, which is, after all, all we can
> ever do. You can always do more, so any threshold of phylogenetic tree
> “quality” is going to be somewhat arbitrary anyway. I don’t mean to sound
> defeatist — obviously you should do the best job that you can, and it
> sounds like you have done that given that the goal of your research is not
> to reconstruct a new phylogeny of the clade you’re studying. Perhaps you
> can try to explain this in your rebuttal.
>
> Or alternatively, use Liam’s awesome code (below) to generate all possible
> phylogenetic trees give your polytomies, and run your PGLS on all of them
> to see if its sensitive to topology. You could probably loop this up
> somehow without much effort.
> http://blog.phytools.org/2016/08/resolving-one-or-more-
> multifurcations.html?m=1 <http://blog.phytools.org/
> 2016/08/resolving-one-or-more-multifurcations.html?m=1>
>
> Jake
>
>
>
> > On Sep 17, 2016, at 6:46 PM, Oscar Valverde 
> wrote:
> >
> > Dear colleages
> >
> > I am working on an phylogenetic signal and PGLS analysis using a database
> > with values for ~600 plant species. To construct my phylogeny I used the
> > backbone Phylomatic supertree (http://phylodiversity.net/phylomatic/)
> and
> > added branch lengths with the based on a fossil calibration for
> angiosperms
> > using the bladj function in phylocom, and resolved polytomies with the
> > multi2di function in phylotools.
> >
> > Now that I am trying to publish the paper, some reviewers indicated that
> > such tree is not suitable for statistical analysis because the level of
> > resolution of the tree is too low (to the family level maybe?) and the
> > uncertainty is too high to get any reliable result with respect to PGLS
> or
> > phylogenetic signal of the traits. Instead, they suggest I should build
> my
> > own tree based on sequences. Of course this is a major project to
> undertake
> > and in my opinion

Re: [R-sig-phylo] Standard errors of theta estimates in OU models

2016-09-15 Thread Brian O'Meara

That's a really cool question. I don't know. I'm pretty sure that
practically, the answer is no (esp when you consider all the issues -- any
feasible OU is too simple a model to match reality, tree errors, data
errors, etc. -- that also come into an analysis). However, I could imagine
that in a perfect world you could look at expected standard errors created
using parametric bootstrapping from the MLE of the parameters and see if
the actual standard errors are different -- maybe it could reflect
plasticity or some other factor. But I really don't know. You might look at
some of the methods that can deal with intraspecific and interspecific
processes to get at this sort of biological question (the Felsenstein 2008
Contrasts revisited paper might be interesting as a start, though it's not
OU and comes at this from a different direction), but it's fun to think
about ways to go from what is treated as annoying noise to a meaningful
signal.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Thu, Sep 15, 2016 at 5:37 PM, Xiaojing Wei  wrote:

> Dear R-sig-phylo users,
>
> I wonder if the standard errors of the theta estimates in OU models have
> any biologically meaningful interpretations. For instance, could it give
> some indication of plasticity in traits, or the optimal level of plasticity
> in traits? Or does it simply reflect estimation uncertainty of the
> phylogeny or the OU model?
>
> Thanks a lot!
>
> --
> Xiaojing Wei
> PhD student
> Dept. of Ecology, Evolution, and Behavior
> University of Minnesota
> Rm. 211 Ecology Bld., 1987 Upper Buford Cir.
> St. Paul, 55108
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Query about Phylogenetics in R

2016-08-31 Thread Brian O'Meara

A similar solution is to create a pdf:

pdf(file="AwesomeTree.pdf", width=10, height=10)
plot(phy)
dev.off()

As you increase the width and height, the labels become proportionally
smaller. You can probably mess around with cex options, too, but this is
the approach I normally use.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Wed, Aug 31, 2016 at 5:25 AM, Vojtěch Zeisek  wrote:

> Hello,
> if onlythe readibility of the labels is the issue, what about using svg()
> function? It produces a SVG file which You can open in almost any vector
> editor (like Inkscape) andmove the labels around. Chech ?svg for possible
> settings and play with them little bit. They vary among operating systems.
> So You can do something like
> svg(...) # Any parameters inside, at least output file name
> # All the tree ploting commands...
> dev.off() # To save the SVG file to the disc
> Check also parameters of function tip.labels() - You can specify font size
> or replace the text with some symbol.
> (Sorry for being so brief, I don't have exact examples handy)
> Yours,
> V.
>
>
> 31. srpna 2016 11:11:33 CEST, "Bhuller, Ravneet" <
> ravneet.bhulle...@imperial.ac.uk> napsal:
> >Dear All,
> >
> >I am trying to construct a phylogenetic tree (neighbour joining) using
> >either APE or PHANGORN in R. But, since I have got 220 strains of
> >bacteria in my data, the resulted tree is not very clear. The branch
> >labels are so much overlapping that they cannot be read at all. Is
> >there any way, I can get a neat tree with clearly read labels? Any
> >guidance will be very much appreciated.
> >
> >Regards,
> >
> >Rav
> --
> Vojtěch Zeisek
> http://trapa.cz/cs
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Measurement error for tips with only 1 measured specimen (OUwie)

2016-08-17 Thread Brian O'Meara

Another possibility is to do a regression to predict variance for species
with a single observation. Or even do a phylogenetic regression so species
nearer the ones with missing data matter more.

But all this stuff is minor tweaks: it's great you're incorporating
measurement error at all, and I hope your results are robust to any of
these suggestions for tweaks.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Tue, Aug 16, 2016 at 11:53 PM, Liam J. Revell 
wrote:

> Hi Santiago.
>
> This is identical to my suggestion, except that the pooled variance is a
> weighted mean in which weights (for better or worse) are proportional to
> the sample size of each species. If the variances are indeed homogeneous,
> this should be preferred because it gives greater weight to species whose
> variance we should know well. If not, then it risks giving high weight to a
> species with a well-estimated, but peculiarly high or low variance.
> Computing the straight mean, as you suggest, comes with exactly the
> opposite set of shortcomings since species with relatively small sample
> sizes may have very poorly estimated variances.
>
> All the best, Liam
>
> Liam J. Revell, Associate Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 8/16/2016 10:46 PM, Santiago Claramunt wrote:
>
>> Hi Rafael,
>>
>> Your method would underestimate the error associated with values derived
>> from single specimens because those values would have the highest errors,
>> not average errors.
>> What I have done in such cases is to estimate an average standard
>> deviation across species and use that average standard deviation as the
>> standard error of the species with single specimens.
>> I don't know of a formal description of this solution but it is mentioned
>> in one of my papers: http://rspb.royalsocietypublis
>> hing.org/content/279/1733/1567
>>
>> Best,
>>
>> Santiago
>>
>> Research Associate
>> Department of Ornithology
>> American Museum of Natural History
>> https://sites.google.com/site/sclaramuntuy/
>>
>>
>> On Aug 16, 2016, at 4:34 PM, Rafael S. Marcondes 
>>> wrote:
>>>
>>> Hi all,
>>>
>>> I’m using OUwie to fit multi-optima OU models and I have a question about
>>> incorporating measurement error into my analyses.
>>>
>>> I’m running my models with known measurement error (mserr=‘known’) and
>>> using the standard error (std.error()) as an estimate of it, as
>>> recommended
>>> by Ives et al (2007). However, for some (a minority) of my tips, I was
>>> only
>>> able to measure 1 specimen, so I have no standard error for them. So I’m
>>> not sure about how to deal with those. At first I thought about just
>>> setting their measurement error as 0, but then I figured that would
>>> introduce false confidence. So what I’m doing now is I’m setting
>>> measurement error for those tips as the mean of the errors of all the
>>> tips
>>> for which I did measure more than one specimen. I got that idea also from
>>> Ives et al when they mention averaging the error across species (jn the
>>> third-to-last paragraph), but that was in a different context. I can’t
>>> find
>>> any references that report dealing with the same problem, even though I
>>> assume it must not be an uncommon one. So I’m wondering if mine is really
>>> the best way to do it and, or if anyone has alternative suggestions?
>>>
>>> i hope I’ve made my problem clear, and thanks in advance for any
>>> suggestions.
>>>
>>>
>>> *--*
>>> *Rafael Sobral Marcondes*
>>>
>>> PhD Candidate (Systematics, Ecology and Evolution/Ornithology)
>>> Museum of Natural Science <http://sites01.lsu.edu/wp/mns/>
>>> Louisiana State University
>>> 119 Foster Hall
>>> Baton Rouge, LA 70803, USA
>>>

Re: [R-sig-phylo] Chronos function in ape

2016-07-06 Thread Brian O'Meara

Geiger has congruify functions that can use pathd8 to make a tree
ultrametric (perhaps using an external chronogram to date it).  There's
some code in there to use r8s, as well, but it takes some digging.

Best,
Brian

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info



On Wed, Jul 6, 2016 at 1:02 PM, Marco Fracassetti <
marco.fracasse...@unibas.ch> wrote:

> Hi to all of you,
>
> I want to calibrated a relationship tree done with Treemix (Pickrell &
> Pritchard 2012) done with SNPs frequency. The branch length of the tree
> reflect the amount of genetic drift.
> The chronos function from the package ape want tree whose branch lengths
> are in number of substitution per sites.
> Is It correct to t use the chronos function ?
> There is another R function that could transform a relationship tree in
> ultrametric tree?
>
> Thanks in advance for the help,
>
> Marco
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] [function like ratebystate, but for discrete characters?]

2016-05-04 Thread Brian O'Meara

Under the hood, the methods basically treat (0,0), (0,1), (1,1), and (1,0)
as states 1, 2, 3, 4. The difference is that the rate matrix is (often)
constrained so that you can't change both traits instantaneously: the 00
<-> 11 rates are forced to be zero, as are the 01 <-> 10 rates. You can
change this assumption. I typically think it makes sense to keep this for
most traits: even if biologically you think 11 is unstable, if you think it
is an intermediate, then I'd keep it. There could be traits where 11 just
can't exist: then I might do a three state model of 00, 01, and 10. But I
agree with Liam; for your questions, these sorts of approaches are the
natural ones to use.

However, if you haven't already, read Maddison & FitzJohn 2015:
http://sysbio.oxfordjournals.org/content/64/1/127.long . It has some
important things to worry about regarding all these tests. My personal rule
of thumb at this point is that if you see traits changing many times on the
tree, it's ok to use methods like Pagel (1994), but still be worried; if
you have few changes, back away slowly and think about other potential
questions. This might change as we get new approaches, but I doubt we'll
ever get to the point where we can make strong conclusions from a few
changes.

Best,
Brian

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

On Tue, May 3, 2016 at 8:49 PM, Michael Foisy <
michael.fo...@mail.utoronto.ca> wrote:

>
> Hi Liam,
>
> Thank you for your quick reply!  Been playing around with
> fitPagel{phytools} and corDISC{corHMM}.  That makes sense;  state-dependent
> model, so just look at the transition rates between characters!
>
> This might be a silly question, but I'll ask it anyways.  For the traits
> that I'm looking at, it doesn't make biological sense to keep Trait A, if
> you gain Trait B.  Given this, one might expect the distribution of traits
> at the tips to be A or B, but rarely both A and B.  And so, what I'm really
> interested in is something that might look (in reality) more like (1, 0)
> --> (0, 1), and less like (1, 0) --> (1, 1) --> (0, 1), because the middle
> state is not evolutionarily stable.  So, I guess what I'm wondering, is:
> if Trait A facilitates the evolution of Trait B (1,0) -> (1,1), and Trait A
> is always lost quickly after the evolution of Trait B (1,1) -> (0,1), then
> would this be an issue for fitPagel and corDISC models?  I know they
> estimate all these transition rates, so maybe I'm just overthinking this.
> Or, might it be better to treat the two characters as one, multistate
> character? (they, along with trait C, serve the same function; they're just
> phenotypically distinct).  J!
>  ust want to make sure I'm using the right models for the traits I'm
> looking at!
>
>
> I hope that was clear... Thank you very  much!
> , Michael Foisy
>
> MSc. Student
> University of Toronto
> michael.fo...@mail.utoronto.ca
> 
> From: Liam J. Revell 
> Sent: Tuesday, May 3, 2016 8:11:39 PM
> To: Michael Foisy; r-sig-phylo@r-project.org
> Subject: Re: [R-sig-phylo] [function like ratebystate, but for discrete
> characters?]
>
> Hi Michael.
>
> This is essentially the method of Pagel (1994). Though often interpreted
> as a test for correlated evolution of two binary characters, the model
> that is actually being fit is one in which the rate of transition 0->1
> (or vice versa) in character A is affected by the state of binary
> character B; or, conversely, if the rate of transition 0->1 (or vice
> versa) in character B is affected by the state of character A.
>
> This is implemented in fitPagel of phytools and I believe that this &
> other more sophisticated flavor are available as part of the package
> corHMM.
>
> All the best, Liam
>
> Liam J. Revell, Associate Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 5/3/2016 8:00 PM, Michael Foisy wrote:
> > Hello,
> >
> > Quick question.  I'm interested in whether the presence/absence of
> character 'A' affects the rate of evolution in character 'B'.  Is there a
> function that will do this for two or more discrete characters?
> >
> > I've seen ratebystate {phytools} for continuous data; but my data are
> all binary!
> >
> >
> > Thank you!
> > , Michael Foisy
> >
> > MSc. Student
> > University of Toronto
> > michael.fo...@mail.utoronto.ca
> >

Re: [R-sig-phylo] estimating the evolutionary rate of a continous trait

2016-04-15 Thread Brian O'Meara

Hi, Belinda.

One thing to watch is over-intepretation: BM is consistent with drift, OU
is consistent with selection, but various kinds of selection can also lead
to BM. [1]. A lot of the people involved in these methods (including me)
are guilty of sloppiness in language in this area, leading to confusion.

Also, another issue could be branch lengths: are they proportional to time
(or, arguably, something like number of generations)? All these methods are
basically stretching the tree in various ways (and, for OU, adjusting
expected means), so if branch lengths are arbitrary, so are the results. I
ask due to the lambda fit. Other things that can cause this are
unincorporated measurement error (b/c it looks like a lot of evolution
right at the tips, meaning little evolution along the branches). I'd
suggest incorporating this (you can give known estimates of measurement
error; some programs allow this to be estimated as well (I forget whether
we have the estimation bit exposed in OUwie, but known error should be
there at least)).

Surface can estimate different regimes on the tree, but IIRC it does not
estimate different rates, just OU means. I believe auteur (in geiger) and
BAMM can model different BM rates on different branches, which sounds like
your question. We have some code to try different OU models (including
perhaps ones with different rates) on different branches buried somewhere
in OUwie (we call it OUwie-dredge so people know to be cautious) but it
hasn't been tested or published yet.

One other caution: if you have a BM model, and are just modeling different
rates, it can be done well. Trying to estimate different rates with OU
models is harder to do well: there's interaction between alpha and sigma
that can make them difficult to distinguish (not formally nonidentifiable
[unlike OU means sometimes], just difficult). For me, I'd probably lean
towards one of the BM rate heterogeneity methods for your question.

Hope this helps,
Brian

[1] Hansen, T. F. and E. P. Martins. 1996. Translating between
microevolutionary process and macroevolutionary patterns: the correlation
structure of interspecific data. Evolution 50:1404-1417.

_______
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

On Fri, Apr 15, 2016 at 10:20 AM, Belinda Kahnt  wrote:

> Dear all,
> I am a newbie to this mailing list and phylogenetic analyses in R and hope
> you can comment on a question I have. I would like to estimate the rate of
> evolution of a continous trait and check if the trait evolves faster along
> some branches of the topology. I already checked with Pagels lambda if the
> trait evolves according to a Brownian motion model,(i.e. pure drift), which
> it did not (lambda ~ 0). Modelling the trait evolution under the
> Ornstein-Uhlenbeck (OU) model provided a significant better fit (p<- 0.01)
> with a very high alpha parameter of 8.7 (indicating a role of selection in
> trait evolution). In order to infer now the rate of trait evolution I am
> searching for a R package that is able to do this inference based on the OU
> model. However, so far I either just found methods based on the Brownian
> motion model (e.g. brownie.lite function in phytools, Motmot) or methods
> that also rely on the reconstruction of ancestral selective regimes
> beforehand (like mvMORPH or OUwie). I want to keep the model as simple as
> possible and therefore don't know if it is necessary to also model
> ancestral selective regimes if I just want to know if the evolutionary rate
> of the trait varies along branches and which branches show an accelerated
> rate?! Which programm would you recommend? I would be very thankful for
> your help and recommendations.
> Best wishes,
> Belinda
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Accounting for phylogeny in binary predictor, binary response data

2016-02-10 Thread Brian O'Meara

There's also corDISC in corHMM (disclosure: I'm a coauthor on the package).
It does Pagel 1994 with two or three binary characters, allows for user set
root states (I believe the canonical Pagel (1994) assumes equal freq at the
root, I don't know if phytools uses this same assumption [a lot of programs
assume equilibrium freq]), and allows additional control over the rate
matrix. However, it requires you to specify what sort of rate matrix you
want (has equal rates, all rate different, etc., but you would have to
specify the dependent model matrix): the phytools code seems to loop over
the named Pagel matrices, which are a subset of the possible matrices. So,
I'd summarize it as phytools being more turnkey, corHMM having more control
but less easy to use.

Oh, and in either case, read this and be afraid before starting analyses:
Maddison & FitzJohn, 2014: http://sysbio.oxfordjournals.org/content/64/1/127
.

Best,
Brian

_______
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Feb 10, 2016 at 12:41 PM, Heath Blackmon  wrote:

> Grace,
>
> The package phytools includes a function fitPagel if that is the method
> that you want to use.  He has a post on his blog that discusses his
> implementation so it won't be a black box for you:
>
>
> http://blog.phytools.org/2014/12/r-function-for-pagels-1994-correlation.html
>
> cheers
>
> Heath
>
> On Wed, Feb 10, 2016 at 11:14 AM, Joe Felsenstein 
> wrote:
>
> > I do not know offhand whether there is an R implementation, but how about
> > Mark Pagel's 1994 method for testing whether two 0/1 characters changing
> > along a ohylogeny are changing independently?
> >
> > J.F.
> > -
> > j...@gs.washington.edu
> > Joe Felsenstein, Department of Genome Sciences and Department of Biology
> > Box 355065, University of Washington, Seattle, WA  98195-5065
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> > http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >
>
>
>
> --
> *Heath Blackmon, Ph.D.*
>
> *Postdoctoral Researcher*
> *University of Minnesota*
> *coleoguy.github.io <http://coleoguy.github.io>*
> *@coleoguy* <https://twitter.com/coleoguy>
> *Phone 682-444-0538*
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Fitting Brownian evolution with bounds

2015-11-30 Thread Brian O'Meara

There's alpha code to do this in TreEvo, but we're still readying this for
publication (and it's now supported by an NSF grant, so expect much more
development), and so I wouldn't advise using it yet. I don't know of
anything else at the moment.

Best,
Brian

___________
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Nov 30, 2015 at 4:18 AM, saurabh  wrote:

> Hello,
>
> does anyone know if its possible and how to fit Brownian evolution with
> either reflecting or absorbing boundaries?
>
> And many thanks for creating this excellent resource!
> Saurabh
>
> --
> Saurabh Mahajan
> Research Scholar/Graduate Student
> Deepa Agashe's Lab
> National Center for Biological Sciences
> Bengaluru 560065
> INDIA
>
> Ph: +91 80 2366 6504
> Mobile: +918050001886
> Email: saurabh...@gmail.com
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] simulate Ne changes on a given phylogeny

2015-10-17 Thread Brian O'Meara

Dick Hudson's ms software can simulate gene trees along a species tree or
network with migration, changing population size, etc. The package phyclust
can call ms. You could then just simulate nucleotides on these gene trees.

Best,
Brian

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Sat, Oct 17, 2015 at 10:12 PM, Jacob Berv  wrote:

> Dear R-sig-phylo,
>
> I have a somewhat general simulation question and I was hoping someone on
> here might have some insight.
>
> I’m trying to figure out if it’s possible to simulate nucleotide sequence
> data (an arbitrary number of neutral loci under a multi species coalescent
> model), on an ultrametric input topology (where tips represent species),
> with user defined changes in effective population size at the start and end
> of a particular internal branch. In my searching I’ve come across some
> software by Deren Eaton (https://github.com/dereneaton/simLoci <
> https://github.com/dereneaton/simLoci>) that looks like it might do what
> I want - but I’m not sure. It looks like I can specify migration events
> between taxa, but perhaps not population size changes on internal branches.
> There are many other applications for simulating sequence data but I am not
> familiar with any of them. Any thoughts would be appreciated!
>
> Thanks,
>
> Jacob Berv
>
> Ph.D. Student
> Lovette Lab
> Cornell Laboratory of Ornithology
>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] using chronos() from the ape package for large bacterial phylogeny

2015-09-23 Thread Brian O'Meara

Geiger has an interface to pathd8; it has the beginnings of interfaces to
other programs, too, but I don't think they're finished yet.

Best,
Brian

_______
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Sep 21, 2015 at 11:30 AM, Franz Krah  wrote:

> Hi Roger,
> Fast Solution: use pathd8 or r8s or treePL. Not in R... pathd8 should be
> the easiest one since you just need a ultrametric tree... 1 sec in parthd8
>
> Cheers, Franz
>
>
>
>
> > On 21.09.2015, at 16:21, Fabian Roger  wrote:
> >
> > hej everyone
> >
> > I have a large phylogeny based on 16S RNA OTU data (> 7000 OTUs).  In
> order to calculate some metrics of phylogenetic diversity I need to make
> the tree ultrametric. I thought about using the chronos function from the
> ape package in R.
> >
> > However I run into two problems: the main problem is that for the full
> tree the function fails with the error "Error in nlminb(start.para, f, g,
> control = opt.ctrl, lower = LOW, upper = UP) : NA/NaN gradient evaluation”
> >
> > I have seen several question regarding this error on the mailing list,
> but I haven’t seen a solution. Could anybody help with that? Not that if I
> run small subsets of the full tree, it works without problems!
> >
> > Also, I am concerned about runtime issues. A quick simulation suggests
> that runtime roughly scales linearly with the number of OTUs and that 10
> 000 OTUs should run in below 1 h (although I only simulated trees up to 140
> OTUs…) if that’s true it wouldn’t be all that bad, but does anybody know or
> has experience on if thats actually true?
> >
> > thanks a lot for your help!
> >
> > best
> >
> > Fabian
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Dollo character using ace/make.simmap

2015-09-14 Thread Brian O'Meara

Just a quick note: Dollo typically doesn't mean irreversible (character can
be lost but never gained): it means a character can be gained ONCE and then
lost (perhaps multiple times) and never gained again. If you just want an
irreversible character, the suggestions already made would be great; if you
want a truly Dollo character, it's more complex. Assuming you want 0 to be
something like eyeless and 1 to be eyes (certainly gained, but rarely, and
easy to lose), and assuming you think the clade started in state 0, you
could fix the root state at 0 and have a very small but nonzero rate from
0->1 and allow a higher 1->0 rate. If the clade starts in state 1, then
treating it as irreversible (rate 0->1 fixed at 0) would be fine.

Best,
Brian

_______
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Sep 14, 2015 at 8:00 PM, Graeme Lloyd 
wrote:

> Hi Everyone,
>
> I was wondering if anyone knows if it is possible to set up a custom rate
> model to represent a binary discrete Dollo character with ace or
> make.simmap?
>
> Thanks,
>
>
> Graeme
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Ancestral Reconstruction with Polymorphic States

2015-07-06 Thread Brian O'Meara

Well, if you want to have polymorphic ancestors, one way would be to treat
it as an ordered character: 0 <-> 0/1 <-> 1, so that any movements from 0
to 1 have to go through a polymorphic ancestor. This would entail fixing
the rate matrix to have the 0<->1 rates forced to be zero (which I *think*
is allowed by phytools). Then you could use make.simmap() or similar. If
you just want to have ancestors be 0 or 1 only, but treat 0&1 as a mixture
of two states at the tips, you could use corHMM; note, though, that it'll
give you estimated states at nodes but not a full stochastic mapping.
Another possibility for treating them as polymorphic, and allowing
polymorphic ancestors, is to use a biogeographic model (like in
biogeoBEARS). It'll require some futzing with the matrices, though, to make
all the changes happen along branches rather than at nodes -- unless you
think that's how speciation happens (i.e., a polymorphic species with 0&1
speciates by separating into a 0 monomorphic species and a 1 monomorphic
species).

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Jul 6, 2015 at 1:19 PM, Jackson, Laura Marie <
laura.jack...@coyotes.usd.edu> wrote:

> Hi all,
>
> Is it possible to generate an ancestral reconstruction (i.e., make.simmap
> in {phytools} using data with polymorphic states (0&1)? It seems that
> {phytools} thinks that 0&1 is a third state, which would be incorrect in
> the final mapping. When I try this on an example file and tree, it does not
> accurately color the branches to represent the polymorphic states. Any help
> with this would be greatly appreciated!
>
>
> > simmap <- make.simmap(tree, matrix, nsim=100)
> make.simmap is sampling character histories conditioned on the transition
> matrix
> Q =
> 0   0/1 1
> 0   -3.715682  2.376705  1.338976
> 0&1  2.376705 -2.376705  0.00
> 11.338976  0.00 -1.338976
> (estimated using likelihood);
> and (mean) root node prior probabilities
> pi =
> 0   0/1 1
> 0.333 0.333 0.333
> Done.
>
>
> ---
> Laura M. Jackson
> PhD Student
> Department of Biology
> 185 Churchill Haines
> University of South Dakota
> 414 E. Clark St.
> Vermillion SD 57069-1746
> Email: laura.jack...@usd.edu<mailto:laura.jack...@usd.edu>
>
>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] corDISC binary coding problems

2015-07-03 Thread Brian O'Meara

Hi, Franz. Sorry for the delay in answering. Much as I [selfishly] like
people to use corDISC, I'm not sure if it's appropriate to your question.
It sounds like you have some (unmentioned) binary state, let's call it foo,
that you're trying to correlate with a proportion, and so you convert the
proportion into two binary states -- is that correct? And you can't observe
00? Having an unobserved state can be problematic for rate estimates (the
MLE for 00->01 or 00->10 will be INF, the MLE for 10->00 and 01->00 will be
0). Why not use something like Ives and Garland (2010) to allow correlation
between a binary dependent variable and one or more independent variables
(implemented in packages such as phylolm)? If you do want do do a Pagel 94
like test, you could discretize the proportion as a multistate character:
i.e., 0, 1, 2, 3, 4 = 0:100 gymno:angio, 25:75 g:a, 50:50 g:a, 75:25 g:a,
100:0 g:a. You would then have 10 overall states: foo=0 with proportion 0,
foo 0 w prop 1, foo 0 w prop 2, ... foo 1 w prop 5. Convert this into a
rate matrix and constrain it (i.e., if you want to use the traditional
Pagel 94 assumption of not instantaneous changes in two char at once, set
f0p0 -> f1p5 to have a rate of zero, etc.; also, do a limited set of
rates). If it were me, though, I'd do Ives and Garland or similar (GEE,
PGLS, etc.: others might weigh in on the most appropriate way to correlate
binary with proportion).

As to whether the different thresholds matter, it would depend (partially)
on how sensitive the coding is. If all the proportions are 0-10% or
90-100%, a range of cutoffs wouldn't matter. If they're more evenly
distributed, you're changing the data a lot, so it should cause different
results. It's great you're checking sensitivity to thresholds. If it were
me, I'd probably focus on the threshold I thought was most biologically
meaningful, but also say (in the pub, not bury in the supplement) that
results were very sensitive to the threshold used, and describe the range
of results that come from changing the threshold. It's possible that the
values you get back vary, but the main result is the same: maybe foo 1 is
always positively correlated with gymnosperm substrates, though the
magnitude of this varies with the threshold, for example. You're right that
you can't model average over different data, and the threshold is
effectively changing your data.

corDISC should be generally fine for uneven state frequencies, other than
the caveat that rate estimates can get very extreme if some states are very
low frequency under some models (if 00 is missing, but all rates are equal,
the rates will probably be ok, but in a model with free states, the rates
will be extreme). It will be less fine for biased sampling: that is, if
state 1 and state 0 are occur in frequency A:B in nature, but you have
happened to sample very different C:D frequency, you'd get biased
estimates. corHMM doesn't have a way to deal with this. You could simulate
under your true frequency, subsample down to the frequency in your data,
and run the analyses to get an idea of how your results might be affected.
If this were a diversification question both diversitree and hisse can deal
with uneven sampling that you specify; diversitree does have trait only
models, but I'm not sure if they incorporate sampling frequency.

Hope this helps.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Jun 26, 2015 at 3:45 AM, f.k...@mailbox.org 
wrote:

> Hello,
>
> I have used the Pagels 94’ extended version in corDISC with 3 binary
> traits and 1300 species.
> To be able to use it, I had to code two binary character out of a
> continuous.
> My continuous variable is substrate proportion data. So 100% is gymnosperm
> exclusivity, 0% is gymnosperm exclusivity.
> I made two binary character out of it, using different thresholds 10%,
> 20%, 30%. 40%,  50%.
> The resulting two character were XY, where X= angiosperm occurrence, Y=
> gymnosperm occurrence. Therefore a 11 would be occurence on both and e.g. 10
> is occurrence primarily on a given substrate, where “primarily” is defined
> by the threshold. The third binary doesn’t matter in this case.
>

> So my first question is: Is it valid to have a state that is coded by two
> binaries, like 11 that means “occurrence on both” because the meaning of
> the coding depends on both binaries and is therefore not “independent”? I
> have a bad feeling about coding that has a “meaning together”...

> The other question is about the results. I ran corDISC with all 5
> different thresholds an

Re: [R-sig-phylo] Missing States in ML Reconstruction of Discrete Ancestral States (phytools, ape)

2015-06-17 Thread Brian O'Meara

corHMM should be able to do this as long as the states are defined in the
transition matrix (rate.mat.maker() to start, then modify). BioGeoBEARS
could do this (since it has to for biogeography: not all combos of
ancestral areas are observed at the tips), and one of its models might be
equivalent to what you want. I imagine diversitree can, too.

Of course, this all might work best with a model with somewhat constrained
rates. If rates are all free, and the model is unordered, the MLE for rate
of moving into the unseen state will be zero, and rate for moving out of it
will be Inf.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Jun 17, 2015 at 3:06 PM, David Bapst  wrote:

> Liam,
>
> Ah, I see, rerootingMethod(), unlike ace, will accept a matrix
> representation of discrete trait values.
>
> However, it doesn't appear that one can define a model matrix still?
> Outside of molecular data, the missing state issue is often for
> ordered data, hence the need to define a model matrix.
>
> 
> library(ape)
> library(phytools)
> data(bird.orders)
>
> #make a matrix for a 4 step trait
> model<-matrix(0,4,4)
> for(i in 1:4){for(j in 1:4){
> if(abs(i-j)<2){
> model[i,j]<-0.1
> }
> }}
> rownames(model)<-colnames(model)<-0:3
>
> #simulate data where you have only observed endmembers
> trait <- factor(c(rep(0, 5), rep(3, 18)))
> names(trait)<-bird.orders$tip.label
> trait<-to.matrix(trait,seq=0:3)
>
> rerootingMethod(tree=bird.orders, x=trait)
> #works
> rerootingMethod(tree=bird.orders, x=trait, model=model)
> #fails
>
> #
>
> Cheers,
> -Dave
>
> On Wed, Jun 17, 2015 at 12:06 PM, Liam J. Revell 
> wrote:
> > Hi David.
> >
> > Actually, if I understand correctly, this does work in phytools. I
> suspect
> > it can be set up in phangorn as well - Klaus can probably explain how.
> >
> > So, if I understand correctly, you want to get marginal ancestral states
> for
> > a character that can assume, say, states "A", "C", "G", & "T", but for
> which
> > you have only observed states "A", "C", & "G" (for instance) in your tip
> > taxa?
> >
> > The way to do this in rerootingMethod is as follows:
> >
> > library(phytools)
> > ## simulate some data to work with:
> > tree<-pbtree(n=26,tip.label=letters,scale=1)
> > Q<-matrix(c(-2,1,1,1,-2,1,1,1,-2),3,3)
> > rownames(Q)<-colnames(Q)<-c("A","C","G")
> > x<-sim.history(tree,Q)$states
> > x ## our data
> > ## now convert to a matrix representation:
> > X<-to.matrix(x,seq=c("A","C","G","T"))
> > ## reconstruct ancestral states
> > fitER<-rerootingMethod(tree,X,model="ER")
> > fitER
> >
> > Note that as we have not observed any evidence that changes to "T" occur
> for
> > this character, if we use a symmetric model instead, we will find that
> the
> > marginal likelihoods at all internal nodes for state "T" are zero. This
> is
> > because the maximum likelihood q[i,"T"] & q["T",i] for all i will be zero
> > (sensibly). For instance:
> >
> > fitSYM<-rerootingMethod(tree,X,model="SYM")
> > fitSYM
> >
> > Let me know if this is what you had in mind.
> >
> > rerootingMethod just uses modified code from ace internally to do the
> > calculations - so if ace does not permit this presently, it could easily
> be
> > modified to allow it. phangorn too must allow this because for many
> > nucleotide sites we want to reconstruct ancestral states - but we will
> have
> > observed only a subset of the four nucleotides at that site.
> >
> > All the best, Liam
> >
> > Liam J. Revell, Assistant Professor of Biology
> > University of Massachusetts Boston
> > web: http://faculty.umb.edu/liam.revell/
> > email: liam.rev...@umb.edu
> > blog: http://blog.phytools.org
> >
> >
> > On 6/17/2015 1:46 PM, David Bapst wrote:
> >>
> >> Hello all,
> >>
> >> (I'm a troublemaker today.)
> >>
> >> Sometimes, in ordered discrete data, there are states we know might
> >> exist as intermediary between obser

Re: [R-sig-phylo] Bug in reorder.phylo() (related to cleaning phylo objects)

2015-06-15 Thread Brian O'Meara

A kludgy solution to achieve Dave's original use case of fixing
not-quite-right phylo objects is to do:

phy <- read.tree(text=write.tree(phy))

It converts the tree to Newick and reads it back in.

It's not as good a solution as using a validator (great addition, Emanuel!)
to find and fix the general problem, but it can help in a pinch.

Best,
Brian


_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Jun 15, 2015 at 11:29 AM, Emmanuel Paradis 
wrote:

> Hi David,
>
> collapse.singles() seems to work correctly if the "phylo" object is
> correctly conformed. Some features of this class may seem annoying (and
> Klaus and I alredy discussed about this), but these help for other things.
> As a reminder the class "phylo" is defined here:
>
> http://ape-package.ird.fr/misc/FormatTreeR_24Oct2012.pdf
>
> Recently, I put a function on github to help code writters:
>
> https://github.com/emmanuelparadis/checkValidPhylo
>
> This could help you to detect problems that would be tough to find
> otherwise.
>
> Cheers,
>
> Emmanuel
>
>
> Le 12/06/2015 22:41, David Bapst a écrit :
>
>> Hi Klaus (and others),
>>
>> Ah, I see! The real bug then appears to be in collapse.singles, as it
>> does not reorder the ID numbers in $edge. Here's a quick function to
>> return the root ID:
>>
>> getRootID<-function(tree){
>>  uniqueNode<-unique(tree$edge[,1])
>>  whichRoot<-sapply(uniqueNode,function(x)
>>  (sum(x==tree$edge[,2])==0))
>>  rootID<-uniqueNode[whichRoot]
>>  return(rootID)
>>  }
>>
>> And if we run it...
>>
>>  getRootID(tree) #original tree
>>>
>> [1] 151
>>
>>> getRootID(tree1)   #after collapse.singles
>>>
>> [1] 151
>>
>>> getRootID(tree2)   #after reorder.phylo
>>>
>> [1] 131
>>
>> And we can see that although the number of nodes certainly changed
>> when collapse.singles was run, it didn't change the root node's ID.
>>
>> So I guess this is really a collapse.singles bug report. Thanks for
>> the insight, Klaus!
>>
>> -Dave
>>
>> On Fri, Jun 12, 2015 at 2:27 PM, Klaus Schliep 
>> wrote:
>>
>>> Hi David,
>>>
>>> I found the bug. Somehow ape assumes on the one hand that the root is
>>> number
>>> of tips + 1.
>>> On the other hand the root of an phylo object is the node which is in
>>> tree$edge[,1], but not in tree$edge[,2],
>>> this is the case even in an unrooted tree. This annoys me for a long
>>> time.
>>>
>>> The root of tree1 with the definition above is actually node 151 and not
>>> 131. Changing these solves your problem:
>>> Here is a small function to do this for you:
>>>
>>> switch.nodes<- function(tree, a, b){
>>>  ntips = length(tree$tip.label)
>>>  tree$edge[tree$edge==a] = 0L
>>>  tree$edge[tree$edge==b] = as.integer(a)
>>>  tree$edge[tree$edge==0L] = as.integer(b)
>>>  if(!is.null(tree$node.label)){
>>>  tmp<-tree$node.label[a-ntips]
>>>  tree$node.label[a-ntips] = tree$node.label[b-ntips]
>>>  tree$node.label[b-ntips] = tmp
>>>  }
>>>  tree
>>> }
>>>
>>> library(phangorn)
>>> attr(tree1, "order") = NULL
>>> getRoot(tree1)
>>>
>>> tree2 = switch.nodes(tree1, 131, 151)
>>> plot(tree2) # now works for me
>>>
>>> Cheers and have a nice weekend,
>>> Klaus
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jun 12, 2015 at 2:53 PM, David Bapst  wrote:
>>>
>>>>
>>>> Hello all,
>>>>
>>>> As those of you who directly manipulate the guts of phylo objects in
>>>> your code (or construct new phylo objects whole cloth from
>>>> un-phylo-like data structures) have probably experienced, it is
>>>> sometimes easy to create $edge matrices that aren't accepted by ape
>>>> functions (I often use plot.phylo as my litmus for this).
>>>>
>>>> When this occurs, various things can be done; I usually do the
>>>> following sequence:
>>>>
>>>> collapse.singles(

Re: [R-sig-phylo] clade-specific release and radiate model?

2015-05-29 Thread Brian O'Meara

Hi, Daniel. It's a bit arguable whether as alpha -> 0, OU -> BM: I think it
should, but IIRC in OUCH this doesn't happen, and that's a deliberate
choice. That said, I think that an OU with alpha near zero would be ok for
your question, though you might want to think about how to penalize
parameters (that is, for that clade there'd be an OU mean parameter that is
unidentifiable (alpha of zero, so no pull, so no evidence for it): should
you count this as a parameter when doing model comparison? I'd argue no:
you're doing OU with alpha of zero as a kludgy hack to treat it as BM).

Another approach would be to resuscitate the "censored" model of O'Meara et
al. 2006. Slice your tree on the edge leading to the released clade (I
guess this truly releases the clade to roam free of its relatives) so you
have the paraphyletic non-released set and the released clade as separate
trees. Then you can try fitting the same or different models to the two
trees. The downside of this is that you must use an additional ancestral
state (at the MRCA of the released clade); the upside is that any weird
changes happening on the edge leading to the released clade aren't in the
model and so don't affect the fit (you could imagine that whatever led the
clade to be ecologically released happened somewhere on the stem edge, but
you don't know where, and it could be associated with a major single shift
in your continuous trait, too). You could try an OU on the non-released
tree (let's call this A) and an OU on the released clade (B), OU on A and
BM on B, etc. The only practical difficulty with this is constraining the
cases where A and B are supposed to have the same model: by default,
optimization will happen separately in different trees, but you can create
a wrapper function that calls OUwie.fixed() separately on A and B but with
the same parameters and adds the likelihood and then optimize the
parameters in this wrapper function.

Hope this helps,
Brian

On Fri, May 29, 2015 at 2:02 AM, Daniel Fulop  wrote:

> Hi All,
>
> Do you know if there are any methods out there for fitting ecological
> release and release and radiate models that are clade-specific?  That is,
> in which the change in mode (OU to BM) happens at the root of a clade
> instead of at specific time for the whole phylogeny (as in Slater 2013).
>
> As far as I know the closest out there are models in OUwie, say with a
> clade-specific second OU process with a very low alpha.  However, I don't
> think that biologically OU is a good model for the trait and clade in
> question (though it is for the rest of the tree).  I know that at the limit
> as alpha -> 0 OU converts to BM, but I would ideally still like compare
> standard models' fits (including OUwie models) to the fits of
> clade-specific release models.
>
> Any leads or suggestions would be much appreciated, especially about how
> to implement these clade-specific models with current tools or about how to
> "roll my own".
>
> Thanks!
> Dan.
>
> --
> Daniel Fulop, Ph.D.
> Postdoctoral Scholar
> Dept. Plant Biology, UC Davis
> Maloof Lab, Rm. 2220
> Life Sciences Addition, One Shields Ave.
> Davis, CA 95616
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] ancestor vs. change plots

2015-04-24 Thread Brian O'Meara

In Paleo, you can (well, arguably) see the ancestor and descendent pair, so
you can assess the amount of change and the ancestral state. Using just
info at the tips, you are inferring from N taxa N-1 ancestral states and
2N-2 rates of change. That seems a lot to ask from data, especially since
the rates and states affect each other and so aren't independent. You're
also using a model that assumes an even rate of change over the tree, so an
ancestor will tend to be between its descendants (exactly between, if you
have a two taxon tree with coeval taxa). You could try more exotic models
that allow multiple Brownian rates or even multiple Ornstein-Uhlenbeck
rates, but there's a limit there, as well. There are Bayesian approaches
that can paint a smear (estimated from post-burnin samples) of rates over
multiple branches, but I'd be careful interpreting them as robust estimates
of rate that are independent of the states.

If you have separate traits, you could ask does trait X lead to higher
rates (or OU mean, or OU alpha) of evolution in trait Y. Basically this
involves painting a tree with discrete trait X (generalist vs specialist
predator, say) and estimating different rates for the other trait (i.e.,
mouth volume) on the branches painted in different states (and seeing if
these rates are biologically and statistically significantly different).

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Apr 24, 2015 at 3:14 PM, Milton Tan 
wrote:

> Hello all,
>
> I have a question that is perhaps esoteric, since it's on a method I don't
> see used often. I am looking at the dynamics of body size evolution, and
> have come upon ancestor-vs-change plots described in Alroy 2000
> ("Understanding the dynamics of evolutionary trends", Paleobiology). This
> is interesting because it will allow me to see if rate of body size change
> depends on body size. I haven't seen this method widely used, so for anyone
> unaware how this works: for each branch, you plot the ancestral state vs.
> the amount/rate of change along the branch. In theory, by looking at a
> scatterplot of ancestral size vs. change, I can answer questions such as
> "Are smaller taxa more likely to evolve to a larger size?" I don't see too
> many people using this method, so I thought I'd ask why. Is there a
> particular reason for it not being used? Are there more powerful methods to
> answer this same question that have supplanted it?
>
> I also have a more specific question, assuming that ancestor-vs-change
> plots are valid. For my dataset, I have reconstructed ancestral states for
> my clade from tip data and pic, and then plotted ancestral states versus
> the rate of change along every branch (attached). While there are a couple
> outliers, you can see the distribution roughly hangs around 0 along the
> entire graph, but I'd like to fit a line so I can present if the slope
> significantly differs from 0 (ie. body size evolutionary rate truly does
> not depend on body size of ancestors). Alroy (2000) describes that the
> shape of the plot can be informative on different dynamics of evolutionary
> change, implying fitting of linear and polynomial lines, but doesn't really
> discuss how to test this statistically. Alroy seems to use this method and
> fits a polynomial line in Alroy 1998 ("Cope's Rule and the Dynamics of Body
> Mass Evolution in North American Fossil Mammals", Science), but he doesn't
> really describe his statistics !
>  in depth.
>
> So my question is how can I fit a line to this data? Problematically,
> since each branch has a sister branch that shares the same ancestral node,
> each ancestral state is represented twice. I think this makes the
> observations non-independent. If they are, this makes me think that a
> linear regression is inappropriate, but I'm not sure. I could average the
> amount of change along each branch for every ancestral node, but I'm not
> sure if that's the best way. Does anyone have any insight on an appropriate
> way to determine a best fit line statistically?
>
> Thanks in advance,
>
> Milton Tan
> Auburn University
> Department of Biological Sciences
> PhD Candidate
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Unsual values of alpha in OU models

2015-03-27 Thread Brian O'Meara

You can select between models using AICc, but then you can look at the
reliability, more robustly than using eigenvalues, by using OUwie.boot() to
do parametric bootstrapping. Another very useful thing done by
Hansen, Bartoszek, and colleagues is to do a contour plot around the
maximum likelihood estimate to see how much a parameter like alpha can vary
without changing the likelihood much. As Aaron indicated, this can be a
horrifyingly large region. It's especially tough with OUMA and OUMVA where
there are multiple alphas.

One thing to consider is what kind of model is most likely to give you
results you trust and that are relevant for your question. It could be that
you care about OUM-type models, but don't really care if it's OUMA, OUM, or
OUMVA. You should certainly report that, say, OUMVA is the best fitting
under AICc, but perhaps if your biological question is about the optima
only you focus on OUM due to OUMVA seeming to have issues estimating any
parameters well (though of course also indicate if the results under OUMVA
generally agree or disagree with the conclusions from OUM -- you need to
show that it's not that OUMVA fit better but gave you an answer you didn't
like, so you cherry picked a different model to get the result you want).

Hope this helps,
Brian

___________
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Mar 27, 2015 at 12:57 PM, Diego Salazar Tortosa 
wrote:

> Thank you for your answer Aaron.
>
> Clear to me that this situation is common in the OU models, but I don't
> know what criteria I should use when select between different OU models,
> AICc or reliability of parameters?
>
> Thanks in advance
>
> Regards
>
> Diego Salazar
>
> 2015-03-27 11:08 GMT+00:00 Aaron King :
>
> > In many situations, the OU model parameter alpha is not well identified.
> > This has been pointed out before, but it's quite common for the parameter
> > values (and not just alpha, though that's typically the worst) to be
> poorly
> > identified, even if the model selection is unambiguous.  OU model
> > parameters can be well identified under some circumstances (detailed in
> the
> > paper) but when they are not, estimates can be *very* unreliable.
> >
> > On Fri, Mar 27, 2015 at 4:07 AM, Diego Salazar Tortosa 
> > wrote:
> >
> >> Hi,
> >>
> >> I am trying to analyze whether the evolution of a continuous trait is
> >> state-dependent of discrete trait with three levels, through OU models.
> My
> >> phylogeny has 113 species and 112 internal nodes.
> >>
> >> Clearly the most parsimonious models according AICc are OUMA and OUMVA
> but
> >> have alpha values between 2.5 and 4, are these values usual or too high?
> >>
> >> Also some eigenvuales obtained with diagn=T  are negative for both
> models,
> >> althought standard errors are very small. Are these models valid?
> >>
> >> Thanks in advance
> >>
> >> Diego Salazar
> >>
> >> --
> >> Diego Francisco Salazar Tortosa
> >> Ph student
> >> Departamento de Ecología
> >> Facultad de Ciencias
> >> Universidad de Granada
> >> Av. Fuente Nueva s/n
> >> 18071 Granada
> >> Telefono: +34 958241000 ext 20007
> >> Movil: +34 634851132
> >> email: dsala...@ugr.es 
> >>   dftort...@gmail.com
> >>
> >> ---
> >> "Este mensaje se dirige exclusivamente a su destinatario y puede
> contener
> >> información privilegiada o confidencial. Si no es Ud. el destinatario
> >> indicado, queda notificado de que la utilización, divulgación o copia
> sin
> >> autorización está prohibida en virtud de la legislación vigente. Si ha
> >> recibido este mensaje por error, se ruega lo comunique inmediatamente
> por
> >> esta misma vía y proceda a su destrucción.
> >>
> >> This message is intended exclusively for its addressee and may contain
> >> information that is CONFIDENTIAL and protected by professional
> privilege.
> >> If you are not the intended recipient you are hereby notified that any
> >> dissemination, copy or disclosure of this communication is strictly
> >> prohibited by law. If this message has been received in error, please
> >> immediately notify us via e-mail and delete it".
> >> ---

Re: [R-sig-phylo] assign multiple discrete states to single tip

2015-03-26 Thread Brian O'Meara

There's also the rayDISC function in the corHMM package. It can take in
polymorphic data (so you can have some taxa red, some white, some red &
white). Note that along the tree, it assumes that there's one state. If you
wanted ancestral nodes to be validly polymorphic (not just equal chances of
red and white, but a character red+white), you'd need to do as Liam
suggests and create a polymorphic trait and an ordered state matrix. This
is possible in ace() in ape, rayDISC() in corHMM, and probably others.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Mar 25, 2015 at 10:49 AM, Liam J. Revell 
wrote:

> Hi Matt.
>
> What is the purpose of this? To visualize the states on the tree? Or to
> conduct some kind of analysis (modeling evolution, reconstructing ancestral
> states, etc.) on the tree?
>
> If the former, then you can use tiplabels in the ape package, and supply a
> matrix to the argument pie in which each row is a tip taxon and each column
> is a state. For non-polymorphic species you can just put a 1 in the column
> for the state of that taxon, and zero otherwise; and for polymorphic taxa
> you can put 1/(number of states in that species) - or the state
> frequencies, if you have them - for each state found in that species, and
> zero otherwise.
>
> If you're interested in, for example, reconstructing ancestral states,
> then it may in fact make the most sense to treat polymorphism as another
> state. In that case, you must decide if you will, for instance, require
> that transitions from A -> B first pass through state AB, etc. This kind of
> constraint is fairly straightforward to implement in ape's ace function.
>
> Let us know & I will see if I can give you any more concrete suggestions.
>
> All the best, Liam
>
> Liam J. Revell, Assistant Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
>
> On 3/25/2015 7:38 AM, Koski, Matthew H wrote:
>
>> Hi,
>>
>>
>> I'd like to assign multiple discrete character states to tips that are
>> polymorphic. E.g., color in a given species is polymorphic (pink or white)
>> so it is assigned two states for color. Is there a way to do this without
>> creating an extra 'polymorphic' category?
>>
>>
>> There is a similar question on stackoverflow with no responses yet:
>>
>> http://stackoverflow.com/questions/27413070/how-to-
>> assign-multiple-states-for-taxa-in-discrete-character-data-set?
>>
>> ?
>>
>> ?Thanks in advance,
>>
>> Matt
>>
>>
>> Matthew Koski
>> University of Pittsburgh
>> Biological Sciences
>> Ashman Lab
>> koskimatt.wix.com/matthewhkoski<http://koskimatt.wix.com/matthewhkoski>
>>
>> [[alternative HTML version deleted]]
>>
>> ___
>> R-sig-phylo mailing list - R-sig-phylo@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>> Searchable archive at http://www.mail-archive.com/r-
>> sig-ph...@r-project.org/
>>
>>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Phylogenetic analysis with sequencing error

2015-01-23 Thread Brian O'Meara

Something like that should work. Unlike the case of ambiguity, where if you
see, say, a Y at a site you'd give the probability of a C or a T at that
site each 1, not 0.5 (Felsenstein's book has a good discussion of this) I
think you're right that you'd want the probabilities to add up to one
across bases: with sequencing error, the probability of it being A in the
sequence given that it truly is A in the species is 99%. I poked around
phangorn and didn't see an easy way to do this, but it should be possible
(might require going into the C code to do this).

One very kludgy way to do something approximately equivalent would be to
make numerous replicate datasets based on your observed dataset but for
every element in the trait matrix have a 1% chance of replacing it with a
different base (and do this per element, rather than say having a 1% chance
of changing one base per column). If you create 100 datasets in this way,
and then run the analysis on them all concatenated, it will take longer to
run (though much less than 100x longer due to shared site patterns) but I
think the log likelihood, after dividing by 100, should be a good
approximation to what you'd get with incorporating error correctly (log
likelihoods being additive across sites), as would the tree and its branch
lengths. It's not quite right, though, as you're starting from a matrix
already with unknown sequence errors, whereas this approach actually works
better if you start from the matrix of true sequences known without error
and want to discover the effect of sequencing error on the tree.

It's tempting to think of doing something like we do with continuous traits
known with error, which is just lengthening the terminal branches (aka
increase the magnitude of the diagonals of the variance-covariance matrix
for the tree) but the issue with discrete data is that you could have
homoplastic tip errors that would make an internal branch seem longer.
Perhaps with a low error rate, though, the chance of this is small enough
that you could just assume all errors result in autapomorphies and model it
by a lengthening of the terminal branches.

Of course, an easy way to do this properly with R is to write an R wrapper
for an existing non-R program that incorporates an error model, but I don't
know of a program like that to wrap yet.

Finally, there is corHMM, which is a hidden Markov model for discrete data:
you see state 0, but there might be two different types of 0. For example,
0 could be herbaceous plants, and there might be some that can become woody
again (they still have the required genes) and so those are a kind of 0
with a nonzero 0 -> 1 transition rate, while there are other 0 that have
lost the required woody genes and they would have a 0->1 transition rate of
zero. Your problem is similar: you see an A, but it might be really an A in
the species but there's some chance it's actually at T. So you could think
of tip states A_observed, T_observed, etc. and some model relating them to
hidden states A_actual, T_actual, etc. It feels like the machinery is there
in that package to do this, but it'll take some thought. Actually, one way
to model this process directly is to not use hidden states but instead use
two rate matrices: one over the entire tree that's your GTR matrix or
whatever and one for a special region you add on to the tips that has a
transition rate based on the error model: i.e., in your case a rate matrix
that leads to a transition matrix with an A->A probability of 0.99, A->T of
0.01/3, etc. Thus, this tip matrix allows for the errors, and then the rest
of the tree is using a normal nucleotide model but with the possibility of
tip errors already put in. You could use corHMM for the rates (I believe it
allows heterogeneity in this way; if not, any discrete trait model that
allows different rate matrices in different time slices would work) to get
the likelihood for a particular tree and set of rates (only vary the rates
on the main tree part; fix the error rates to what you think they are) and
wrap the whole thing in phangorn scripts for proposing tree moves. Just
remember to trim off the extra tip lengths for the last regime when
presenting the final tree, and this should give you exactly what you want.

Hope this helps,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Thu, Jan 22, 2015 at 1:24 PM, George Shirreff <
georgeshirr...@googlemail.com> wrote:

> Hello,
>
> I was wondering if there is a way to incorporate sequencing error in
> phylogenetic analysis, if I know there is sequencing error but it's
> randomly distributed.
>
> I imagine this could be as

Re: [R-sig-phylo] chronos nlminb error

2014-12-30 Thread Brian O'Meara

The programs r8s, pathd8, and treepl can all take a phylogram and return a
chronogram, but they're not within R (though they could be wrapped). You
could also infer a chronogram from the start using Beast or MrBayes, though
the large size of the tree may make it difficult.

One simple thing to try before all this, though, is pass your tree through
a reorder() function call (using default arguments) before passing it to
chronos: (i.e.,

tree <- read.tree("full_Cipres_Data/RAxML_bestTree.tre")

*tree <- reorder(tree)*tree <- chronos(tree, lambda = 1, model =
"correlated", quiet = FALSE,
calibration = makeChronosCalib(tree),
control = chronos.control())

Sometimes the imported  tree structure, while valid, can be an issue for
ape and I find doing this reordering can fix weird errors.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Tue, Dec 30, 2014 at 4:18 AM, Franz Krah  wrote:

> Dear all,
>
> I used chronos to make a large phylogeny (~3600 species) ultrametric.
> The tree is rooted and binary.
>
> With the following code I get the error:
>
> tree <- read.tree("full_Cipres_Data/RAxML_bestTree.tre")
> tree <- chronos(tree, lambda = 1, model = "correlated", quiet = FALSE,
> calibration = makeChronosCalib(tree),
> control = chronos.control())
>
>
> Error:
>
> In nlminb(start.para, f, g, control = opt.ctrl, lower = LOW, upper = UP) :
>   NA/NaN function evaluation
>
> This error was posted before some times but I didn't find a satisfying
> answer.
>
> I tried out some things but didn't helped: altering the root in different
> ways,
> altering the model, altering lambda. The functions worked fine with smaller
> phylogenies before!
>
> Hope someone found a solution!
> Are there other ways of making my tree ultrametric?
>
> Cheers, Franz
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] binary trait influencing diversification

2014-12-19 Thread Brian O'Meara

The diversitree package has a simulator for this.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Dec 19, 2014 at 9:28 AM, Karla Shikev  wrote:
>
> Dear R-sig-phyloers,
>
> I'm looking for a function that will simulate a binary trait that
> influences the rate of speciation or extinction. For instance the trait
> would change from 0 to 1 according to some transition matrix, and the
> speciation of a branch in state 0 would be, say, twice the speciation rate
> of a branch at state 1. I tried to modify the functions I find in geiger
> and phytools, but I couldn't keep track of things during the simulation.
>
> Any suggestion would be fantastic.
>
> Karla Shikev
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] fitDiscrete across multiple datasets

2014-10-27 Thread Brian O'Meara

Well, a model with different rates (but same tree scaling) for different
characters is nested in a model with same rates (and same tree scaling) for
different characters, so you could do a likelihood ratio test to see which
model is better (though of course AIC et al. are also possible). See the
appendix of this article (http://www.genetics.org/content/177/3/1395.long)
for some discussion of this in the context of comparative methods (though
the method hasn't been used much, due to difficult implementation). Of
course, the field has been been doing partitioned and nonpartitioned models
for tree inference for years, just not for comparative methods as much.

In general, I like Ben's idea to use multiple characters to get at shared
rates; we should see more of this.

Best,
Brian


_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Oct 27, 2014 at 3:16 PM, Liam J. Revell  wrote:

> Luke's points are correct. However, if we wanted to estimate a common
> transition matrix across characters we could just substitute optim.pml for
> a phyDat object of type="USER" for ace internally; and if we wanted to
> estimate different parameter values across different data columns, we could
> just run fitTransform separately on each column of X and sum the
> log-likelihoods. I don't know whether either of these things (or the main
> exercise) are a good idea, but they are all certainly possible.
>
> All the best, Liam
>
> Liam J. Revell, Assistant Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 10/27/2014 2:57 PM, Benjamin Furman wrote:
>
>> Hello All,
>>
>>  I would also like to echo a thank you for the responses. They have
>> been helpful. As for your summary Luke, I'll have to let the experts
>> comment on that.
>>
>> Ben
>>
>> On Mon, Oct 27, 2014 at 1:49 PM, Luke Matthews
>> mailto:lmatth...@activatenetworks.net>>
>> wrote:
>>
>> Hi all,
>> Thanks Brian and Liam for these very cool responses.  It does make
>> sense that one should be able to iterate through the characters and
>> add the likelihoods, which is elegant and seems appropriate.  It
>> seems to me what this process accomplishes is:
>>
>> 1. Completely couples the estimation of delta across the characters
>> - thus we are now picking the value for delta that maximizes the
>> likelihood across all characters not allowing any variation in delta
>> across them.
>>
>> 2. Leaves completely uncoupled the transition rate estimates across
>> the different characters.  Thus, the different characters are freely
>> estimating rates of transition from 2 to 3 gene copies, etc. such
>> that the rate for one gene has no influence on the rate for another.
>>
>> Do I have that right?  Depending on Ben's hypothesized processes
>> those may be reasonable assumptions.  I can also imagine processes
>> where somewhat decoupling the delta estimation or somewhat coupling
>> the rate estimations would be reasonable.
>>
>> Best
>> Luke
>>
>> -Original Message-
>> From: Liam J. Revell [mailto:liam.rev...@umb.edu
>> <mailto:liam.rev...@umb.edu>]
>> Sent: Monday, October 27, 2014 1:26 PM
>> To: omeara.br...@gmail.com <mailto:omeara.br...@gmail.com>; Luke
>> Matthews
>> Cc: r-sig-phylo@r-project.org <mailto:r-sig-phylo@r-project.org>
>> Subject: Re: [R-sig-phylo] fitDiscrete across multiple datasets
>>
>> Hi all.
>>
>> Here is a link to code that does what Brian suggests:
>>
>> http://blog.phytools.org/2014/10/optimizing-tree-
>> transformations-for.html
>>
>> Note that (as noted in the blog post) I have arbitrarily used ace
>> internally to compute the discrete character log-likelihood, because
>> it is fast. You could instead change the code in minor ways and use
>> fitDiscrete, which is slower but should be more robust.
>>
>> The demo is fairly explicit, but let us know if it works or if I
>> have made any errors.
>>
>> All the best, Liam
>>
>> Liam J. Revell, Assistant Professor of Biology University of
>> Massachusetts Boston
>> web: http:

Re: [R-sig-phylo] fitDiscrete across multiple datasets

2014-10-27 Thread Brian O'Meara

You can calculate the likelihood of the data under a given transformation:
transform the tree with a delta of 0.3 or whatever, then calculate the
likelihood of the data under a Brownian motion model using this transformed
tree. This is the same likelihood as calculating the likelihood of the data
under a delta model directly (assuming the same delta). However, what you
can do is apply the same transformation to all your datasets and add the
likelihood. This becomes a new likelihood function. You can then optimize
this (optim, nloptr, etc.). I can rig up a working example later tonight --
off to teach now.

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Oct 27, 2014 at 9:34 AM, Luke Matthews <
lmatth...@activatenetworks.net> wrote:

> HI Ben,
> I too am curious if anyone has an R answer to this question you pose.  One
> non-R way I can think of is to put the data into MrBayes, which has a
> number of different models available.  You could probably effectively test
> some models not baked in MrBayes by systematically manipulating the
> branchlengths you submit to it.  MrBayes provides various options for how
> tightly coupled the parameter estimates should be across the different
> genes.  It would seem something similar might exist within R but I'm not
> aware of any.
> Best
> Luke
>
> Luke J. Matthews | Senior Scientific Director | Activate Networks, Inc.
>
> --
>
> Message: 2
> Date: Fri, 24 Oct 2014 16:23:58 -0400
> From: Benjamin Furman 
> To: r-sig-phylo@r-project.org
> Subject: [R-sig-phylo] fitDiscrete across multiple datasets
> Message-ID:
>  g...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hello Everyone,
>
> I have a tree and discrete data (number of gene copies, for many
> genes) and would like to use the fitDiscrete function in geiger, or
> something similar. However, I would like to estimate the parameters given
> all of the datasets, not just with the data for each gene. For instance, if
> I was using the "delta" model to vary rates across the tree, I would like
> this delta value to reflect some sort of summary value across all datasets.
> Does anyone have an idea as to how this could be accomplished or perhaps
> point me in the right direction?
>
> Thank you for any guidance,
> Ben
>
>
> --
> Benjamin Furman, B.Sc. Specialization
> Ph.D. Candidate, Evans Lab <http://benevanslab.wordpress.com>
> McMaster University
> Twitter: @Xen_Ben
> Email: benjamin.ls.fur...@gmail.com, furma...@mcmaster.ca
> website: http://benjaminfurman.wordpress.com
>
> [[alternative HTML version deleted]]
>
>
>
> --
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>
>
> End of R-sig-phylo Digest, Vol 81, Issue 12
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] tip order in phylogeny - How does it work?

2014-10-08 Thread Brian O'Meara

Look at reorder.phylo() in ape and reorder() in phylo4: your desired tip
order may be there.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Oct 8, 2014 at 4:45 AM, Juan Antonio Balbuena 
wrote:

>  Hello
> I am working on a simulation that requires adding a number of random tips
> to multiple phylogenies:
>
> treeP <- read.nexus(file= "my file.tre") # 5K trees used to build a
> consensus tree (treeP is thus a multiPhylo object)
> NT = 30 # number of taxa in phylogeny
> N.dup = 3 # number of tips (duplications) I want to add to the trees
> sites <- sort(sample(1:NT, N.dup), decreasing=TRUE) # Random tips to
> duplicate
> dup.labels <- treeP[[1]]$tip.label[sites]
> treeP <- .uncompressTipLabel(treeP)
> #
> # Add duplicates to branches following a Liam Revell recipe published in
> his blog:
> #
> for (n in 1:length(treeP)) {
>   for (i in 1:N.dup ) {
> pos <- runif(n=1)*
> treeP[[n]]$edge.length[which(treeP[[n]]$edge[,2]==sites[i])]
> edl <- runif(n=1)*
> treeP[[n]]$edge.length[which(treeP[[n]]$edge[,2]==sites[i])]
> treeP[[n]] <- bind.tip(treeP[[n]], tip.label=paste(dup.labels[i], 1,
> sep="."),
>edge.length=edl, where=sites[i], position=pos)
>   }
> }
> #
> treeP<- .compressTipLabel(treeP)
> # Check tip labels:
> treeP$tip.label
> $tip.label
>  [1] "S24"   "S25"   "S9""S15"   "S3""S21"   "S12"   "S17"
> "S19"   "S30"   "S22"   "S4""S16"   "S27"   "S10"   "S23"   "S1"
> "S2""S14"
> [20] "S6""S8""S26"   "S29"   "S13"   "S20"   "S7""S28"
> "S18"   "S5""S11"   "S18.1" "S20.1" "S16.1"
>
> When I plot any tree in treeP, S18.1, S20.1 and S16.1 appear together with
> S18, S20 and S16, as intended. So why are they at the end of the array
> above? This behaviour makes makes my life difficult because it forces me to
> rearrange a number of matrices to make them match this tip label order. I
> would appreciate if a kind soul can explain why it is that way and whether
> is possible to reorder the tip labels, placing each added tip together with
> its respective counterpart?
>
> Thanks in advance for your help.
>
> Juan Antonio
> --
>
> Dr. Juan A. Balbuena
> Cavanilles Institute of Biodiversity and Evolutionary Biology
> University of Valencia
> http://www.uv.es/~balbuena
> P.O. Box 22085
> http://www.uv.es/cophylpaco
> <http://www.uv.es/cavanilles/zoomarin/index.htm>
> 46071 Valencia, Spain
> e-mail: j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543 733
> 
> *NOTE!* For shipments by EXPRESS COURIER use the following street address:
> C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.
> 
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

[R-sig-phylo] Tree-for-all hackathon

2014-06-10 Thread Brian O'Meara

This might be of interest for various R phylogenetics programmers: there's
currently no package to access this info.

Title
Tree-for-All: A hackathon to access OpenTreeâs global phylogeny resources
Content
A global âtree of lifeâ will transform biological research in a broad range
of disciplines from ecology to bioengineering. To help facilitate that
transformation, the OpenTree project [1] now provides online access to
>4000 published phylogenies, and a newly generated tree covering more than
2.5 million species.

The next step is to build tools to enable the community to use these
resources.  To meet this aim, OpenTree, Arbor [2] and NESCentâs HIP working
groups [3] are staging a week-long hackathon September 15 to 19 at U.
Michigan, Ann Arbor.  Participants in this âTree-for-allâ will work in
small teams to develop tools that use OpenTreeâs web services to extract,
annotate, or add data in ways useful to the community.  Teams also may
focus on testing, expanding and documenting the web services.

How could a global phylogeny be useful in your research or teaching?  What
other data from OpenTree would be valuable?  How could OpenTree web
services be integrated into familiar workflows and analysis tools?   How
could we add to the database of published trees, or enrich it with
annotations?

If you can imagine using these resources, and you have the skills to work
collaboratively to turn those ideas into products (as a coder, or working
side-by-side with coders), we invite you to apply for the hackathon.  The
full call for participation (http://bit.ly/1ioPPMc) provides instructions
for how to apply, and how to share your ideas with potential teammates
(strongly encouraged prior to applying).  Applications are due July 8th.
Travel support is provided.  Women and underrepresented minorities are
especially encouraged to apply.

If you have questions, contact Karen Cranston (karen.crans...@nescent.org,
@kcranstn, OpenTree), Arlin Stoltzfus (ar...@umd.edu, HIP), Julie Allen (
juli...@illinois.edu, HIP), or Luke Harmon (lu...@uidaho.edu, Arbor).


[1] http://www.opentreeoflife.org
[2] http://www.arborworkflows.com/
[3] http://www.evoio.org/wiki/HIP (Hackathons, Interoperability,
Phylogenies)

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Phylogenetic signal, variation in rates and sample size

2014-05-05 Thread Brian O'Meara

For rate estimation between temperate and tropical, I'd suggest using
OUwie, as it's made for exactly this question: map tropical/temperate on
the tree, try the "BMS" model to estimate the rate of evolution in
temperate and tropical taxa (note that I'm one of the authors, so take this
bias into account). There are other models, too, to allow you to test
things like whether temperate and tropical taxa have different evolutionary
means, and you can compare these models to Brownian motion models,
depending on your question. Note that in theory, these methods should be
robust to random sampling of taxa.

For your first question, you could estimate the amount of phylogenetic
signal, create a much larger tree under some branching process, transform
the branch lengths to match your phylogenetic signal, simulate traits on
this transformed tree, then subsample back and estimate the phylogenetic
signal on the untransformed but pruned simulated tree. Ideally, the
parameter estimate you get from the simulated data matches the true
parameter, suggesting that sampling does not matter for phylogenetic
signal. Frankly, I'd be surprised if random sampling does matter for
signal, though some biased sampling (even from taxonomic bias, of not
separating two populations into different species until they are "different
enough") could have an effect.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, May 5, 2014 at 11:33 AM, dga...@huskers.unl.edu <
dga...@huskers.unl.edu> wrote:

> Hi Imran,
>
> I'm by no means an expert on phylogenetic signal with incomplete
> phylogenies but I would check out the GEIGER package, specifically the
> "mecca" function. I think mecca is set up to do estimates based on
> incomplete tip sampling.
>
> Cheers,
> -Dan Gates
> PhD candidate
> Smith/Pilson Lab University of Colorado/Nebraska
> http://ec2-54-186-114-109.us-west-2.compute.amazonaws.com:8080/DanielJGates
> 
> From: r-sig-phylo-boun...@r-project.org 
> on behalf of imran khaliq 
> Sent: Monday, May 5, 2014 5:15 AM
> To: r-sig-phylo@r-project.org
> Subject: [R-sig-phylo] Phylogenetic signal, variation in rates and
> sample size
>
> Dear All,
> I am trying to understand the effects of incomplete phylogenies on the
> phylogenetic signal. I am using different trees and I have at least 100
> species in each tree but still these are only 2 % of total species. So how
> can I be sure that results are generalize enough to apply on the whole
> group? Secondly, if I want to compare varaition in trait's rates in two
> groups e.g. tropical and temperate, what methods are appropriate to
> estiamte the evolutionary rates. I tried Brownian rate parameter estimation
> by tranforming the branch lengths according to the lambda values of trait
> in each group.  It gives me lower varaince for the group that has higher
> lambda value and that is confusing. In fact with the same trait values, if
> I transform the branch lengths of the tree with different lamda values it
> gives me higher varaince for higher lambda value.
> Can anyone help me?
>
> Best regards,
> Imran
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Fit BM, OU (1 theta), OU (2 theta) with the same function

2014-02-11 Thread Brian O'Meara

Make sure you're using version 1.39. Version 1.38 introduced this error; it
was fixed about a week later (just a few days ago), but that version may
not have made it out to all the cran mirrors yet.
http://cran.rstudio.comis the mirror I've been directing people to
most recently; it's updated
frequently, plus for developers it lets you see how often your packages are
downloaded.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Jan. 1, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Feb 11, 2014 at 9:09 PM, Santiago Sánchez  wrote:

> Hi Brian,
>
> Thank you for pointing these out. I've been playing around with OUwie and I
> ran into an error which I couldn't solve. This comes out when fitting any
> OU model implemented in OUwie, and it doesn't happen with BM models (they
> run fine):
>
> test <- OUwie(phy, data, model="OU1")
> Initializing...
> Finished. Begin thorough search...
> Finished. Summarizing results.
> Error in svd(m) : infinite or missing values in 'x'
>
> I would appreciate if you could give me some suggestions.
>
> Cheers,
> Santiago
>
>
>
>
>
> 2014-02-11 15:09 GMT-05:00 Brian O'Meara :
>
> > There are at least three packages to do this: OUCH, SLOUCH, and OUwie.
> > OUCH and OUwie are on CRAN; for SLOUCH, go to
> > http://freshpond.org/software/SLOUCH/ .
> >
> > Best,
> > Brian
> >
> > ___
> > Brian O'Meara
> > Assistant Professor
> > Dept. of Ecology & Evolutionary Biology
> > U. of Tennessee, Knoxville
> > http://www.brianomeara.info
> >
> > Students wanted: Applications due Jan. 1, annually
> > Postdoc collaborators wanted: Check NIMBioS' website
> > Calendar: http://www.brianomeara.info/calendars/omeara
> >
> >
> > On Tue, Feb 11, 2014 at 3:05 PM, Santiago Sánchez <
> > santiago.snc...@gmail.com> wrote:
> >
> >> Hello list,
> >>
> >> Does someone know if there is a function out there that can fit (similar
> >> to
> >> fitContinuous() in geiger and compar.ou() in ape) OU models with
> multiple
> >> optima (theta) as well as a BM model.
> >>
> >> I'm not sure if fitContinuous() allows OU with multiple theta, and, as
> far
> >> as I know there is no "compar.bm" in ape.
> >>
> >> Any help is greatly appreciated.
> >>
> >> Cheers,
> >> Santiago
> >>
> >> --
> >> Santiago Sánchez-Ramírez
> >> Department of Ecology and Evolutionary Biology, University of Toronto
> >> Department of Natural History (Mycology), Royal Ontario Museum
> >> 100 Queen's Park
> >> Toronto, ON
> >> M5S 2C6
> >> Canada
> >> Other email: santiago.sanc...@mail.utoronto.ca
> >> Tel. 416-586-8025
> >>
> >> [[alternative HTML version deleted]]
> >>
> >>
> >> ___
> >> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >> Searchable archive at
> >> http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >>
> >
> >
>
>
> --
> Santiago Sánchez-Ramírez
> Department of Ecology and Evolutionary Biology, University of Toronto
> Department of Natural History (Mycology), Royal Ontario Museum
> 100 Queen's Park
> Toronto, ON
> M5S 2C6
> Canada
> Other email: santiago.sanc...@mail.utoronto.ca
> Tel. 416-586-8025
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Fit BM, OU (1 theta), OU (2 theta) with the same function

2014-02-11 Thread Brian O'Meara

There are at least three packages to do this: OUCH, SLOUCH, and OUwie. OUCH
and OUwie are on CRAN; for SLOUCH, go to
http://freshpond.org/software/SLOUCH/ .

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Jan. 1, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Tue, Feb 11, 2014 at 3:05 PM, Santiago Sánchez  wrote:

> Hello list,
>
> Does someone know if there is a function out there that can fit (similar to
> fitContinuous() in geiger and compar.ou() in ape) OU models with multiple
> optima (theta) as well as a BM model.
>
> I'm not sure if fitContinuous() allows OU with multiple theta, and, as far
> as I know there is no "compar.bm" in ape.
>
> Any help is greatly appreciated.
>
> Cheers,
> Santiago
>
> --
> Santiago Sánchez-Ramírez
> Department of Ecology and Evolutionary Biology, University of Toronto
> Department of Natural History (Mycology), Royal Ontario Museum
> 100 Queen's Park
> Toronto, ON
> M5S 2C6
> Canada
> Other email: santiago.sanc...@mail.utoronto.ca
> Tel. 416-586-8025
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] OUwie diagn=T Error in dim(edges)

2014-01-24 Thread Brian O'Meara

It's likely that something is wrong with your tree, especially mapping of
regimes. However, another possibility is that there is a problem computing
the Hessian matrix. Without your tree and data, it would be hard to tell.

You might also try doing contour plots. They take more time, but they're a
much better estimate of uncertainty than the Hessian approach. We used to
have code to do this in OUwie until CRAN maintainers got upset (the code
required akima, which has a more restrictive license than OUwie's). You (or
anyone else) can still get the contour plotting code at
https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/OUwie.contour.R?revision=133&root=ouwie.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Jan. 1, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Jan 24, 2014 at 9:00 AM, Maria Ghazali  wrote:

> Hi,
> Please, explain me what I am doing wrong in calculating OUwie (OUwie
> version 1.37).  I want to run some additional diagnostics (with diagn=T) to
> estimate standard errors of alpha and sigma (the element solution.se).
> But an error notification appears.
> Example: data(tworegime)
> model.tworegime=OUwie(tree,trait,model=c("OUMV"),root.station=TRUE,diagn=T)
>
> #Initializing... #Finished. Begin thorough search...
> #Finished. Summarizing results.
>  #Error in dim(edges) : 'edges' is missing
> Thank you!
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] compar.ou

2013-11-12 Thread Brian O'Meara

Often if you have a trait constrained to be positive, it's appropriate to
log-transform it, which has the happy effect of making the theta for the
trait when converted back to a non-log scale constrained to be positive (as
well as probably being more appropriate for how your trait evolves).

You can have theta outside the range of observed values, but this should
only happen in something like OU1 if your tree is not ultrametric
(ultrametric = equal root to tip length, as with a chronogram of extant
species), and even then is unlikely (basically, it would only happen if you
have something like a trend).

You could do a constrained search by wrapping OUwie.fixed() in an optimizer
where you return a really bad, but finite, likelihood if you exceed one of
your bounds (or use an optimizer that allows setting bounds), or, perhaps
easier, modify the OUwie() function to return a bad likelihood for
particular values of theta. But my first suggestion is to think about
whether your data are better log-transformed (i.e, under Brownian motion,
is an increase or decrease of 10% of the present value equally likely
regardless of trait value (in which case, log transformation makes sense)).
You also may want to check that the branch lengths you are using make sense
for your problem (OU, BM, etc. models are largely (not entirely) about
transforming branch lengths on trees, so you want to make sure they are
sensible for your question to start with: do you care about rates with time
units or amount of molecular change?).

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Tue, Nov 12, 2013 at 6:39 AM, sandra goutte  wrote:

> Dear all,
>
> Following up this conversation; Using OUwie, i get reasonable values for
> theta under several models for all my traits, but one. In this case, this
> is a trait which can only take positive values and i get negative theta
> values for all the different models, including BM and OU1. the standard
> error is not big enough to consider my theta value to be around zero. (for
> example: theta = -2.279, se= 0.209). Any idea of what could be causing this
> and if it is possible to force OUwie to estimate theta within certain
> boundaries?
>
> Thank you,
> Sandra.
>
>
> 2013/10/29 Marguerite Butler 
>
> > Hi Sandra and others,
> >
> > You can also assess confidence using parametric bootstrap, a procedure
> > which we generally recommend for all users. ouch has built-in facilities
> to
> > do so  (the bootstrap() and simulate() functions in addition to update()
> ).
> >  I think there are examples in my tutorial. If not, let me know and I can
> > send you another.
> >
> > Marguerite
> >
> > On Oct 29, 2013, at 6:08 AM, Cecile Ane  wrote:
> >
> > > FYI, we have some theory to explain why alpha has large standard errors
> > and in which conditions. As Brian says, it comes with a flat likelihood
> > with respect with alpha.
> > > http://dx.doi.org/10.1214/13-AOS1105 or
> > http://www.stat.wisc.edu/~ane/publis/2013HoAne_AoS.pdf
> > >
> > > On 10/29/2013 09:46 AM, Brian O'Meara wrote:
> > >> In at least the OUwie paper we spent a lot of time doing simulations
> to
> > >> determine this empirically (this may have been examined in other
> papers,
> > >> too, though none come to mind). Alpha can be estimated, but sometimes
> > with
> > >> scarily large standard errors (but not always). This property should
> > hold
> > >> for related methods (it's a property of the likelihood surface for
> these
> > >> models). You can just plot the alpha values vs likelihood to get a
> > sense of
> > >> this surface for your dataset (ideally, estimating the other
> parameters
> > for
> > >> each fixed alpha, which is possible, though the kludge of holding the
> > other
> > >> parameters at their MLE and just varying alpha will give you a sense
> of
> > the
> > >> surface, though it's a bit non-conservative). We had code in OUwie to
> > do a
> > >> contour plot of the likelihood surface but took it out (a small
> > licensing
> > >> difference between akima (university license) and OUwie (GPL) led to
> > OUwie
> > >> being pulled from CRAN by the CRAN maintainers until we resolved the
> > >> difference).
> > >>
> > >> Brian
> > >>
> > >>

Re: [R-sig-phylo] Error in OUwie

2013-10-29 Thread Brian O'Meara

The error happens after it does the calculation for the point estimates, so
it's likely happening when it starts doing the calculations to look at the
curvature at the solution (to see if it's a maximum rather than a saddle
point and to get an estimate of standard errors).  Doing

OU1 <- OUwie(tree, data, model = "OU1", diagn=FALSE)

will turn off these extra diagnostics for now.

Note there's a dedicated OUwie support list at
https://groups.google.com/forum/#!forum/ouwie-discuss , too. People often
post to dedicated lists first before R-sig-phylo, but such lists can be
hard to find (I think that's true in this case) or are inactive. One other
advantage of R-sig-phylo posts for help is that you're more likely to hear
about other package alternatives (i.e., "Have you tried mvSLOUCH or OUCH
instead?").

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Tue, Oct 29, 2013 at 4:20 PM, Christopher D. Muir <
cdm...@biodiversity.ubc.ca> wrote:

> All,
>
> This my first time posting to R-sig-anything, so apologies for any breach
> in etiquette. I searched through the mailing list and couldnt find
> anything about this problem. A brief description of my data: I am running
> OUwie on a 528-taxon megatree generated using phylomatic. I made the tree
> dichotomously branching using multi2di, then assigned short branch
> lengths to these so that I could estimate ancestral character states
> (regimes in the OU model) to each node. OUwie works fine with my dataset
> for the BM1 and BMS models, but for all of the OU models, I get the same
> error with the singular-value decomposition function:
>
> OU1 <- OUwie(tree, data, model = "OU1")
> Initializing...
> Finished. Begin thorough search...
> Finished. Summarizing results.
> Error in svd(m) : infinite or missing values in 'x'
>
> Im using R v. 2.15.1 and the latest OUwie v 1.33.
>
> I think this has something to do with the structure of my tree and/or the
> frequency of regime shifts. When I replace my data with Brownian Motion
> data simulated with my tree, I generally get the same error. The most
> salient feature I can think of is that regime shifts are quite frequent
> within the tree and there are many short branches because of resolving
> polytomies. There are no NAs or anything unusual in the trait data.
>
>  Chris
>
> ---
> Christopher D. Muir, Biodiversity PDF
> Office: BioDiv 237
> E-mail: cdm...@biodiversity.ubc.ca
>
> Biodiversity Research Centre
> University of British Columbia
> 6270 University Blvd.
> Vancouver, BC, Canada
> V6T 1Z4
>
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] compar.ou

2013-10-29 Thread Brian O'Meara

In at least the OUwie paper we spent a lot of time doing simulations to
determine this empirically (this may have been examined in other papers,
too, though none come to mind). Alpha can be estimated, but sometimes with
scarily large standard errors (but not always). This property should hold
for related methods (it's a property of the likelihood surface for these
models). You can just plot the alpha values vs likelihood to get a sense of
this surface for your dataset (ideally, estimating the other parameters for
each fixed alpha, which is possible, though the kludge of holding the other
parameters at their MLE and just varying alpha will give you a sense of the
surface, though it's a bit non-conservative). We had code in OUwie to do a
contour plot of the likelihood surface but took it out (a small licensing
difference between akima (university license) and OUwie (GPL) led to OUwie
being pulled from CRAN by the CRAN maintainers until we resolved the
difference).

Brian


_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Oct 29, 2013 at 10:21 AM, sandra goutte  wrote:

> Thank you Marguerite. Looking at OUwie and OUCH/SLOUCH, i see that alpha is
> estimated along the other parameters, whereas in Hansen 1997 and other
> papers it is suggested that this would lead to very large standard errors.
> Is that problem resolved in these functions?
>
> Best,
> Sandra.
>
>
> 2013/10/26 Marguerite Butler 
>
> > Dear Sandra,
> >
> > You might also want to look at the papers that go along with slouch,
> ouch,
> > and ouwie. Here are some pdfs, along with some tutorials.
> >
> > On my Rcompstart, the ouch stuff starts around page 165
> >
> > Let me know if you need help with ouch.
> >
> > Marguerite
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Oct 24, 2013, at 7:03 AM, sandra goutte  wrote:
> >
> > Dear list,
> >
> > My aim is to compare the fit of models for which *theta* is allowed to
> > change at different nodes (different combinations of 1 ,2 or 3 nodes). I
> > don't really understand the calculation of the deviance, but if i'm not
> > mistaken the difference between the deviances of 2 models follows a
> > *chi*square distribution with the difference of parameters in the
> > models as
> > degree of freedom. Now, how to compare two models with the same number of
> > optimum changes but at different nodes?
> >
> > I am having a hard time understanding the details of the function with
> > regards to Hansen's (1997) paper. What is the relationship between the
> > deviance and the RSS from Hansen (1997)? Is it possible to calculate the
> > RSS from the output of compar.ou ?
> >
> > I would appreciate any advice or insight on the subject!
> > Sandra.
> >
> > --
> > PhD Student
> > Muséum National d'Histoire Naturelle
> > Département Systématique et Évolution
> > USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité
> > Reptiles & Amphibiens - Case Postale 30
> > 25 rue Cuvier
> > F-75005 Paris
> >
> > Tel :  +33 (0) 1 40 79 34 90
> > Mobile: +33 (0) 6 79 20 29 99
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> > http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >
> >
> >  
> >  Marguerite A. Butler
> >  Associate Professor
> >
> >  Department of Biology
> > University of Hawaii
> > 2538 McCarthy Mall, Edmondson Hall 216
> > Honolulu HI 96822
> >
> > Office: 808-956-4713
> > Dept: 808-956-8617
> >  Lab:  808-956-5867
> > FAX:   808-956-9812
> >  http://www.hawaii.edu/zoology/faculty/butler.html
> > http://www2.hawaii.edu/~mbutler
> >  http://www.hawaii.edu/zoology/
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
> --
> PhD Student
> Muséum National d'Histoire Naturelle
> Département Systématique et Évolution
> USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité
> Reptiles & Amphibiens - Case Postale 30
> 25 rue Cuvier
> F-75005 Paris
>
> Tel :  +33 (0) 1 40 79 34 90
> Mobile: +33 (0) 6 79 20 29 99
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] compar.ou (sandra goutte)

2013-10-25 Thread Brian O'Meara

You can also use OUwie for a variety of models (OU with different means
and/or different variances and/or different attraction values), but it
returns AIC and lnL scores but not RSS. It sounds, though, that mvSLOUCH
might be the best option in your case.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Oct 25, 2013 at 8:47 AM, Krzysztof Bartoszek wrote:

> Hi Sandra,
> I don't know about compar.ou but my mvSLOUCH package (returns RSS and
> allows you to have measurement error and missing data) and Butler & King's
> ouch package return you a lot of stastics that you can use for model
> comparison. Maybe in your case the AIC.C would be more appropriate?
>
> Best wishes
> Krzysztof Bartoszek
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Identical likelihood values for OU and EB models

2013-10-17 Thread Brian O'Meara

Check the parameter values, not just the likelihood values. EB and OU work
by transforming the tree branch lengths, and it could be that the parameter
values that are optimal for OU end up with a tree with the same set of
branch lengths as with the optimal EB parameter even though the models
themselves are different. To examine this, plot the tree using
transform.phylo with each model and its parameters. You should also look at
the edge.length vector for each of the transformed trees (more precise) but
it's useful to get a visual sense of what effect each model has on the tree.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Thu, Oct 17, 2013 at 8:40 PM, Jingchun Li  wrote:

> Sorry, just realized the values I put there are the AIC values. But the
> likelihood shows the same pattern, i.e., OU and EB models have the same
> values.
>
> Jingchun
>
>
> On Thu, Oct 17, 2013 at 7:18 PM, Jingchun Li  wrote:
>
> > Dear all,
> >
> > I am trying to fit a series of trait evolution models to a phylogeny with
> > about 50 species and some morphological data, such as body size, using
> the
> > fitContinuous() function in geiger.
> >
> > I have tested the following models: BM, OU, EB, Delta, Trend, Kappa and
> > White
> > What I found confusing is that for some of the traits, the likelihood
> > values for OU and EB are exactly the same.
> >
> > I was thinking maybe I have a very flat likelihood surface, but even that
> > the likelihoods shouldn't be exactly the same right? Plus some of those
> > traits have relatively strong phylogenetic signals.
> >
> > Here are examples of the fit results for some of the traits
> >
> >BM OU EB Delta Trend kappa white232.72300 176.75430 176.75430
> > 177.14530 223.81530 144.35620 181.70850-160.07130 -182.63660
> > -182.63660 -182.34340 -166.59150 -186.92670 -176.65170120.08064
> > 77.75367 77.75367 77.75443 112.23931 89.28702 76.99379
> > Any thoughts? Thank you very much!
> >
> > Jingchun
> >
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] trouble estimating lambda with fitDiscrete

2013-09-30 Thread Brian O'Meara

I don't necessarily see it as a bad idea. The discussion of lambda mostly
concerns continuous traits, but it's just a way to stretch the tree to see
if transformed branchlengths provide a better fit. You could see if
discrete character change is based on observed branch lengths (lambda = 1)
or if there's no phylogenetic signal (lambda = 0). Phylogenetic signal
itself may be of dubious interest, but I don't see how discrete traits are
worse for this than continuous traits.

For a discussion of phylogenetic signal and why it might not be a good
measure for some processes, see Revell et al. 2008:
http://sysbio.oxfordjournals.org/content/57/4/591.full

For a discussion of "tree stretching" as a general class of methods that
can be used for various kinds of data, see O'Meara 2012
http://www.annualreviews.org/doi/abs/10.1146/annurev-ecolsys-110411-160331[yes,
self-promotion].

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Sep 30, 2013 at 4:18 PM, Liam J. Revell  wrote:

> I'm not sure of the wisdom (or interpretation) of fitting a lambda model
> to discrete traits; however it looks like your main problem is that you are
> using geiger 1.99-x. To get the old function output you will need to use
> the CRAN archive & install from source. You should be able to fit the
> lambda model in geiger >=1.99 if you think this is a good idea - but the
> argument/value has changed from treeTransform="lambda" to
> transform="lambda". Good luck. - Liam
>
> Liam J. Revell, Assistant Professor of Biology
> University of Massachusetts Boston
> web: 
> http://faculty.umb.edu/liam.**revell/<http://faculty.umb.edu/liam.revell/>
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
>
> On 9/30/2013 12:15 PM, Katherine Brooks wrote:
>
>> Hello,
>>
>> I am trying to estimate lambda using the fitDiscrete function so that I
>> can
>> obtain a measure of phylogenetic signal in a trait.  The trait has 5
>> character states and is coded as letters A, B, C, D, E.
>>
>> When I run fitDiscrete, I do not get the appropriate output, as suggested
>> by many phylogenetics wikis I've read.  I should get output that looks
>> like
>> this:
>>
>> Finding the maximum likelihood solution
>> [050  100]
>> []
>> $Trait1
>> $Trait1$lnl
>> [1] -86.76405
>>
>> $Trait1$q
>>  [,1]
>> [1,] -0.03717188
>>
>> $Trait1$treeParam
>> [1] 0.9943398
>>
>> $Trait1$message
>> [1] "R thinks that this is the right answer."
>>
>> Perhaps I am making a mistake with the data input?  Here is what I
>> enter and the output I get:
>>
>>
>>  newphy=read.nexus("S1078.wo.**perotensis.fig.tre")
>>>
>>
>>  newphy
>>>
>>
>> Phylogenetic tree with 55 tips and 54 internal nodes.
>>
>> Tip labels:
>> Ammospermophilus_harrisii, Ammospermophilus_interpres,
>> Ammospermophilus_leucurus, Callospermophilus_lateralis,
>> Callospermophilus_madrensis, Callospermophilus_saturatus, ...
>>
>> Rooted; includes branch lengths.
>>
>>
>>  SocGrCat=read.csv(file.choose(**),row.names=1)
>>>
>>
>>  SocGrCat
>>>
>>SocGrade
>> Ammospermophilus_harrisiiA
>> Ammospermophilus_interpres   A
>> Ammospermophilus_leucurusA
>> Callospermophilus_lateralis  A
>> Callospermophilus_madrensis  A
>> Callospermophilus_saturatus  A
>> Cynomys_gunnisoniD
>> Cynomys_leucurus C
>> Cynomys_ludovicianus E
>> Cynomys_mexicanusD
>> Cynomys_parvidensD
>> Ictidomys_mexicanus  A
>> Ictidomys_tridecemlineatus   B
>> Ictidomys_parvidens  A
>> Marmota_baibacinaE
>> Marmota_bobakE
>> Marmota_broweri  E
>> Marmota_caligata E
>> Marmota_camtschatica E
>> Marmota_caudata  E
>> Marmota_flaviventris D
>> Marmota_himalayana   E
>> Marmota_marmota  E
>>

Re: [R-sig-phylo] Pairwise tree distance matrix

2013-09-19 Thread Brian O'Meara

You may look into treedist in phangorn (but just does pairs of trees, I
believe) or dist.multiphylo in distory (polynomial time, but I don't think
it's RF distance), or dist.topo in ape (but again, pairwise and not RF).

There's also RMesquite (http://rmesquite.r-forge.r-project.org/) to have R
just call Mesquite functions.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Thu, Sep 19, 2013 at 6:41 PM, Rob Lanfear  wrote:

> Hi All,
>
> I'm looking for a method to calculate a pairwise distance matrix of RF
> distances between a set of trees.
>
> Specifically, I know it's possible to do this in linear time (relative to
> the number of taxa), using an algorithm proposed in 1985[1]. This algorithm
> is implemented in various places (e.g. TreeSetVis in Mesquite), but I
> couldn't find an implementation in R.
>
> If anyone knows of an implementation, or has ideas on where best to start
> building one, please let me know.
>
> Cheers,
>
> Rob
>
>
> [1] Day,W. H. E. 1985. Optimal algorithms for comparing trees with labeled
> leaves. J. Classiï¬cation 2:7â28.
>
> --
> Rob Lanfear
> Research Fellow,
> Ecology, Evolution, and Genetics,
> Research School of Biology,
> Australian National University
>
> phone: +61 (0)2 6125 3611
>
> www.robertlanfear.com
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Problem with fitContinuous function

2013-07-17 Thread Brian O'Meara

In addition to Liam's and Graham's suggestions, one other thing that is
useful, but which we don't do enough, is to simply visualize the results.
Do plot(ouTree(phy, your_estimate)) and you can see what Graham means about
signal. Do phytools' contMap with and without the transformed tree to look
at how your traits are distributed: maybe you have one extreme value that's
driving this? Are your tree's starting branch lengths very weird?
(ultrametric trees aren't required, but often make a lot of sense for this
sort of analysis for trees of extant organisms). There may be an issue with
tree or data that becomes obvious with this sort of inspection, or you may
figure out that a new kind of model is needed (maybe your data suggest a
two mean OU model). You don't want to make post hoc hypotheses when doing
this, of course, but you can see problems this way.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Wed, Jul 17, 2013 at 1:26 PM, Slater, Graham  wrote:

> Hi Oscar,
>
> Liam is of course correct that you can change the bounds. The default
> bounds in geiger are quite wide however. It's probably worth noting that
> the fact that both OU and trend (=linear rate change through time) have ML
> parameter estimates at their optimization bounds suggests that there is
> very little 'phylogenetic signal' in your trait. This is because both OU
> and increasing rates through time result in trait values for closely
> related tips in ultrametric trees that are much more dissimilar than
> expected in high signal models, like BM or ACDC/trend with declining rates
> through time. It's definitely worth increasing the bounds for optimization
> but I'd also suggest performing some 'phylogenetic signal' tests to find
> out whether you are able to say much about the processes/rates underlying
> the evolution of your trait
>
> Cheers,
>
> Graham
> ---
> --
> Graham Slater
> Peter Buck Post-Doctoral Fellow
> Department of Paleobiology
> National Museum of Natural History
> The Smithsonian Institution [NHB, MRC 121]
> P.O. Box 37012
> Washington DC 20013-7012
>
>
> (202) 633-1316
> slat...@si.edu
> <
> https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX
> iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.&URL=mailto%3aSlaterG%40si.edu>
> www.eeb.ucla.edu/gslater
> <
> https://legacy.si.edu/owa/redir.aspx?C=HcyfurW5xkyXw2bWu-Td5xmRPtn7-c9I1lX
> iVWbeJqzfhgybLjvFBckclIN237FXoGKsGXlPAGg.&URL=http%3a%2f%2fwww.eeb.ucla.edu
> %2fgslater>
>
>
>
>
>
>
> On 7/17/13 1:11 PM, "Liam J. Revell"  wrote:
>
> >You can just change the bounds for optimization, see help(fitContinuous)
> >for details.
> >
> >E.g.:
> >
> >fitContinuous(...,bounds=list(beta=c(0,100)))
> >
> >or something. Liam
> >
> >Liam J. Revell, Assistant Professor of Biology
> >University of Massachusetts Boston
> >web: http://faculty.umb.edu/liam.revell/
> >email: liam.rev...@umb.edu
> >blog: http://blog.phytools.org
> >
> >On 7/17/2013 12:55 PM, oscar inostroza wrote:
> >> Dear list
> >>
> >> Im trying to use the function fitContinuous implemented in Geiger
> >>(Harmon
> >> et al., 2008) to adjust the different macroevolutionay models to my data
> >> set. However, when run the models "trend" and "OU" the resuls show the
> >> following error: Parameter estimates appear at bounds (beta to trend
> >>model
> >> and alpha to OU model). anyone know why this happens and how to fix it?
> >>
> >> p.s. the problem persists even when I apply a transformation to the data
> >> (e.g. log)
> >>
> >>
> >>
> >> ___
> >> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >> Searchable archive at
> >>http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >>
> >
> >___
> >R-sig-phylo mailing list - R-sig-phylo@r-project.org
> >https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >Searchable archive at
> >http://www.mail-archive.com/r-sig-phylo@r-project.org/
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Count Deep Coal in R?

2013-07-08 Thread Brian O'Meara

I think the package phybase has functions for looking at gene trees in
species trees that could be modified for this. The package has been purged
from CRAN (well, archived), but you can still install from source. I'm
CCing the package author, Liang Liu, to see if he has any ideas.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Jul 8, 2013 at 7:27 AM, Melisa Olave  wrote:

> just wondering if you found the way to count deep coal in R??
> I'm trying to do the same... with no success!
>
> thank you!
> Melisa
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] obtaining reasonable values from OUCH

2013-06-29 Thread Brian O'Meara

No, it's just univariate (I was responding to your example about doing a
single character). For multivariate OU, I think OUCH is the only game in
town.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Sat, Jun 29, 2013 at 9:11 PM, Pascal Title  wrote:

> Thanks for the reply, Brian. Does OUwie implement the multivariate model?
> >From the documentation, I don't see any mention of applying it to multiple
> traits at the same time.
>
> cheers,
> -Pascal
>
>
> On Sat, Jun 29, 2013 at 11:32 PM, Brian O'Meara  wrote:
>
> > You're at best on the border of how many parameters you can estimate
> given
> > the size of your dataset. One thing you might try is running OUwie, which
> > also implements that model but uses a different optimizer and starting
> > values, though I wouldn't be surprised if you get similar answers.
> Trying a
> > grid of values for a contour plot can tell you something about the
> > likelihood surface you're trying to optimize on; it could be very flat.
> We
> > had this function in OUwie until a license conflict with akima made us
> have
> > to cut this functionality, but it's easy enough to do.
> >
> > Best,
> > Brian
> >
> >
> > ___
> > Brian O'Meara
> > Assistant Professor
> > Dept. of Ecology & Evolutionary Biology
> > U. of Tennessee, Knoxville
> > http://www.brianomeara.info
> >
> > Students wanted: Applications due Dec. 15, annually
> > Postdoc collaborators wanted: Check NIMBioS' website
> > Calendar: http://www.brianomeara.info/calendars/omeara
> >
> >
> > On Sat, Jun 29, 2013 at 9:03 AM, Pascal Title  wrote:
> >
> >>  Hello,
> >>
> >> Given a phylogeny with 29 tips, and 5 traits that each fit a OU model
> >> better than a BM model of trait evolution (according to fitContinuous in
> >> GEIGER), I would like to fit a multivariate OU model to these 5 traits
> >> using the hansen() function in the OUCH package in R. The goal is to get
> >> multivariate rate values. I am using R v3.0.1 and version 2.8-2 of the
> >> OUCH
> >> package.
> >>
> >> Under default settings, I get an "unsuccessful convergence" error.
> >> By changing various settings, I stop getting a convergence error
> message.
> >> But the alpha and sigma2 values seem to be unrealistically large:
> >>
> >> *hansen(data = otd[colnames(data)], tree = ot, regimes =
> otd["regimes"], *
> >> *sqrt.alpha = priors, sigma = priors, method = "L-BFGS-B", *
> >> *maxit = 5000)*
> >> *
> >> *
> >> * optim  diagnostic message:  CONVERGENCE: REL_REDUCTION_OF_F <=
> >> FACTR*EPSMCH*
> >> *alpha:*
> >> *   [,1]   [,2][,3]   [,4]   [,5]*
> >> *[1,]  61.385945 17.5563426 -33.7110759   3.697183  50.622310*
> >> *[2,]  17.556343 45.9034684  -0.3705461  27.995283  -6.633346*
> >> *[3,] -33.711076 -0.3705461  42.8040546   8.072352 -40.675454*
> >> *[4,]   3.697183 27.9952829   8.0723524  25.435097 -12.115062*
> >> *[5,]  50.622310 -6.6333460 -40.6754540 -12.115062  59.536768*
> >> *
> >> *
> >> *sigma squared:*
> >> *  [,1]  [,2]  [,3]  [,4]  [,5]*
> >> *[1,] 89.262795 32.572924  9.099638 28.769264 24.711740*
> >> *[2,] 32.572924 40.506308 13.271956 19.279625 -9.633395*
> >> *[3,]  9.099638 13.271956  4.387822  5.986198 -3.965904*
> >> *[4,] 28.769264 19.279625  5.986198 11.966763  2.241925*
> >> *[5,] 24.711740 -9.633395 -3.965904  2.241925 18.995603*
> >>
> >>
> >> I've been told that sigma2/(2*alpha) should be approximately equal to
> the
> >> variance of the trait. If I try this for the first trait,
> sigma2/(2*alpha)
> >> = 0.727, and the actual variance of the first trait is 1.304. So that
> >> doesn't match, and values for alpha and sigma2 are extremely large.
> >>
> >> My question is, where do I go from here? What can I do to get more
> >> realistic values? I've heard that you can give different starting values
> >> to
> >> the sqrt.alpha and sigma arguments of the hansen function, but here
>

Re: [R-sig-phylo] obtaining reasonable values from OUCH

2013-06-29 Thread Brian O'Meara

You're at best on the border of how many parameters you can estimate given
the size of your dataset. One thing you might try is running OUwie, which
also implements that model but uses a different optimizer and starting
values, though I wouldn't be surprised if you get similar answers. Trying a
grid of values for a contour plot can tell you something about the
likelihood surface you're trying to optimize on; it could be very flat. We
had this function in OUwie until a license conflict with akima made us have
to cut this functionality, but it's easy enough to do.

Best,
Brian


_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Sat, Jun 29, 2013 at 9:03 AM, Pascal Title  wrote:

> Hello,
>
> Given a phylogeny with 29 tips, and 5 traits that each fit a OU model
> better than a BM model of trait evolution (according to fitContinuous in
> GEIGER), I would like to fit a multivariate OU model to these 5 traits
> using the hansen() function in the OUCH package in R. The goal is to get
> multivariate rate values. I am using R v3.0.1 and version 2.8-2 of the OUCH
> package.
>
> Under default settings, I get an "unsuccessful convergence" error.
> By changing various settings, I stop getting a convergence error message.
> But the alpha and sigma2 values seem to be unrealistically large:
>
> *hansen(data = otd[colnames(data)], tree = ot, regimes = otd["regimes"], *
> *sqrt.alpha = priors, sigma = priors, method = "L-BFGS-B", *
> *maxit = 5000)*
> *
> *
> * optim  diagnostic message:  CONVERGENCE: REL_REDUCTION_OF_F <=
> FACTR*EPSMCH*
> *alpha:*
> *   [,1]   [,2][,3]   [,4]   [,5]*
> *[1,]  61.385945 17.5563426 -33.7110759   3.697183  50.622310*
> *[2,]  17.556343 45.9034684  -0.3705461  27.995283  -6.633346*
> *[3,] -33.711076 -0.3705461  42.8040546   8.072352 -40.675454*
> *[4,]   3.697183 27.9952829   8.0723524  25.435097 -12.115062*
> *[5,]  50.622310 -6.6333460 -40.6754540 -12.115062  59.536768*
> *
> *
> *sigma squared:*
> *  [,1]  [,2]  [,3]  [,4]  [,5]*
> *[1,] 89.262795 32.572924  9.099638 28.769264 24.711740*
> *[2,] 32.572924 40.506308 13.271956 19.279625 -9.633395*
> *[3,]  9.099638 13.271956  4.387822  5.986198 -3.965904*
> *[4,] 28.769264 19.279625  5.986198 11.966763  2.241925*
> *[5,] 24.711740 -9.633395 -3.965904  2.241925 18.995603*
>
> I've been told that sigma2/(2*alpha) should be approximately equal to the
> variance of the trait. If I try this for the first trait, sigma2/(2*alpha)
> = 0.727, and the actual variance of the first trait is 1.304. So that
> doesn't match, and values for alpha and sigma2 are extremely large.
>
> My question is, where do I go from here? What can I do to get more
> realistic values? I've heard that you can give different starting values to
> the sqrt.alpha and sigma arguments of the hansen function, but here these
> are lower triangular matrices, so I don't know what to adjust, or what
> other starting values to give, or how to know when I've found the most
> realistic values I'm going to find.
>
> >From the following link, you can download the tree, data and script to see
> for yourself:
> https://dl.dropboxusercontent.com/u/34644229/Pascal_OUCH.zip
>
> Any advice or suggestions would be greatly appreciated!
>
> Thank you,
>
> -Pascal
>
>
>
>
> --
> Pascal Title, MSc.
> PhD student, Rabosky Lab <
> http://www-personal.umich.edu/~drabosky/Home.html>
> Dept of Ecology and Evolutionary Biology
> University of Michigan
> pti...@umich.edu
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Difficulty getting posterior probabilities into ape?

2013-06-19 Thread Brian O'Meara

Phyloch has never been on CRAN as far as I know, but you can get it from
http://www.christophheibl.de/Rpackages.html .

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Wed, Jun 19, 2013 at 11:14 PM, Tom Diggs  wrote:

> Thank you very much for the answer.  Unfortunately, that command is not
> working for me.  It apparently reads the file just like the read.nexus
> command does in my case.  The tree comes in fine, with the branch lengths,
> but no posterior values.
>
> So I guess that the read.nexus command ignores anything inside square
> brackets?  I imagine that's what the read.annotated.nexus was designed to
> deal with, but unless I'm doing something incredibly wrong, it's not
> working.
>
> I also read that phyloch was a way to get around this problem, but it's
> not in CRAN anymore either.
>
> Otherwise, should I go through and just eliminate the square brackets
> manually?  Replace them with regular parentheses?  Leave them out
> altogether?
>
> The phylogeny I'm currently dealing with isn't THAT big, and while it'd be
> a pain to do it manually, it wouldn't be prohibitive.
>
> Has anyone done that before?
>
> Thanks,
> Tom
> - Original Message - From: "Jombart, Thibaut" <
> t.jomb...@imperial.ac.uk>
> To: "Tom Diggs" ; 
> Sent: Wednesday, June 19, 2013 12:39 PM
> Subject: RE: [R-sig-phylo] Difficulty getting posterior probabilities into
> ape?
>
>
>
> Hello,
>
> Marc Suchard has recently implemented this in the package epibase - check
> the function read.annotated.nexus.
>
> All the best
>
> Thibaut.
> __**__
> From: r-sig-phylo-bounces@r-project.**org[
> r-sig-phylo-bounces@r-**project.org ]
> on behalf of Tom Diggs [jtdi...@mail.ad.uab.edu]
> Sent: 19 June 2013 18:33
> To: r-sig-phylo@r-project.org
> Subject: [R-sig-phylo] Difficulty getting posterior probabilities into ape?
>
> Hi all,
>
> I was wondering if anyone else has had this problem, and how they
> addressed it.
>
> I've been using BEAST to generate trees, and TreeAnnotator to summarize
> them.  In the output nexus file, the tree plainly has posterior
> probabilities for each node listed, but when I try to import the trees into
> R, my tree has no node labels (mytree$node.labels = NULL).
>
> I searched the archives for this listserve, and found one instance of this
> problem cropping up with MrBayes output. http://www.mail-archive.com/r-**
> sig-ph...@r-project.org/**msg02227.html<http://www.mail-archive.com/r-sig-phylo@r-project.org/msg02227.html>
>
> Unfortunately, the package "phybase", listed as a potential solution to
> the problem, is no longer in CRAN.
>
> Am I missing something obvious?  Is ape having trouble reading the node
> labels for some reason?
>
> Any help would be much appreciated.  Thanks so much.
>
> Tom
>[[alternative HTML version deleted]]
>
> __**_
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/**listinfo/r-sig-phylo<https://stat.ethz.ch/mailman/listinfo/r-sig-phylo>
> Searchable archive at http://www.mail-archive.com/r-**
> sig-ph...@r-project.org/<http://www.mail-archive.com/r-sig-phylo@r-project.org/>
>
> __**_
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/**listinfo/r-sig-phylo<https://stat.ethz.ch/mailman/listinfo/r-sig-phylo>
> Searchable archive at http://www.mail-archive.com/r-**
> sig-ph...@r-project.org/<http://www.mail-archive.com/r-sig-phylo@r-project.org/>
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] using ace (ape) with missing data

2013-06-12 Thread Brian O'Meara

What about treating this as a visualization problem rather than as a
missing data problem? Maybe some of the congruifier code now in Geiger
could help: it's made to stretch a tree based on another tree, but the
basic functionality of matching up nodes could help paint states from your
reduced tree to your full tree. You could even paint the edges that are
only in the full tree gray and leave the others red/blue/etc. so that
people aren't inadvertently misled into thinking there is data for all tips
for all characters. It might be a bit annoying to code, but it at least
means your results are not affected by decisions about how to treat missing
data.

Best,
Brian


_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Wed, Jun 12, 2013 at 11:10 AM, sandra goutte  wrote:

> Hi Luke,
>
> Thank you for your message. I know that the "Continuous" section of
> BayesTraits accepts missing data, but i haven't found any information on
> how it is really dealt with (that's why i'd rather code it myself instead
> of having a "black box"). My goal here is not to add unexisting
> information, but rather to be able to represent everything on the same
> tree. I guess i can always extract the data from the ace output and add it
> to the nodes on my larger tree, but it seems like there should be an easier
> way.
>
> If i understand well, what you propose would be something like a weigthed
> mean of the traits values of all the tree, with the phylogenetic distance
> to the species of interest as weight? i don't think that would be too
> dificult to write, i'll try that and compare the values with the pruned
> tree! :)
>
> thanks again!
> Cheers,
> Sandra.
>
>
>
>
>
> 2013/6/12 Luke Matthews 
>
> > Hi Sandra and Liam,
> >
> >
> >
> > I think a strong case could be made here to use informative priors rather
> > than the flat priors (equal state probabilities) that are the default of
> > Liam's implementation.  Consider the case of a species with a missing
> trait
> > value that is surrounded by species where we have the trait value, and
> the
> > values for the related species are all the same (let's say state 1 in a
> > character with states 0,1,and 2).  Probably the missing value is also a
> 1,
> > because we have information (the states of related species) that informs
> > our prior away from equal probabilities of each of the 3 states.  I would
> > suggest Liam's method but with weighted probabilities based on what the
> > related species have or by imputing from correlated other traits you have
> > measured.  So, the weighted probabilities might come out as 0.8,0.1,0.1
> > rather than 0.33,0.33,0.33.
> >
> >
> >
> > You mentioned your trait is continuous.  The analogous 'flat' of prior
> for
> > a continuous trait would be to assign all the missing species the mean
> > value for the entire data set.  But this is unreasonable, given that they
> > are all more likely to have a mean closer to the local mean in their part
> > of the phylogeny.  You could impute the missing continuous values by
> using
> > the known values and the branch lengths.  I think BayesTraits has a
> canned
> > implementation of this, but probably some function in R does as well.
> >  Again, you could also impute from other correlated trait data you have.
> >
> >
> >
> > Lastly, any imputation you do on the missing data is not adding any data
> > or certainty to the analysis - you only have what you measured.  Anything
> > you do with degrees of freedom should consider only the number of
> measured
> > data points.  If you treat the missing states with flat priors though,
> you
> > will essentially add more uncertainty to your estimates than is
> justified.
> >
> >
> >
> > Best
> >
> > Luke
> >
> >
> > Luke J. Matthews | Senior Scientific Director | Activate Networks, Inc.
> >
> >
> >
> >
> >
> > Message: 1
> >
> > Date: Tue, 11 Jun 2013 15:58:30 +0200
> >
> > From: sandra goutte mailto:gou...@mnhn.fr>>
> >
> > To: r-sig-phylo@r-project.org<mailto:r-sig-phylo@r-project.org>
> >
> > Subject: [R-sig-phylo] using ace (ape) with missing data
> >
> > Message-ID:
> >
> >  > 7go_fbn-gmgtt

[R-sig-phylo] R postdoc with O'Meara and Carstens

2013-04-22 Thread Brian O'Meara

This should be coming in via EvolDir soon, but may be of interest to many
of the readers of this list. You can find out more about the project,
including looking at the R code, at http://www.brianomeara.info/phrapl .
Please forward to any good candidates, and let me know if you have any
questions (or suggestions).

Thanks,
Brian

Phylogeography often compares a select handful of models to answer an
empirical question. We have just been funded to continue development of
Phrapl, a way to examine thousands of potential phylogeographic models for
fit to data. It is a bit like ModelTest, but for models of population
subdivision and migration rather than substitution rates. The aim is for
empiricists to be able to let their data tell them which processes are
important in the phylogeographic history of a group. A postdoctoral
position is available to work on improving, testing, and applying this
approach. The current code is mostly in R but using other languages to
speed up the computation internally is quite possible. The position is in
the lab of PI Brian OMeara at U. of Tennessee, Knoxville, but coordination
with Co-PI Bryan Carstens and occasional travel to Ohio State is expected.

The OMeara lab currently hosts five postdocs (including four co-mentored
through NIMBioS) as well as a co-advised grad student. The EEB department
features ten research groups whose work includes
phylogenetics/phylogeography, and the presence of the National Institute
for Mathematical and Biological Synthesis further creates a robust
ecosystem of colleagues.

Knoxville, Tennessee, is just a half hour away from the US most visited
national park, Great Smoky Mountains NP. Cost of living is low, and the
Knoxville area can cater to interests ranging from farmers markets, local
music, and handmade crafts to SEC football and NASCAR. I strive to create a
lab group that is welcoming and open. We have good diversity in terms of
gender (57% female), life stage (i.e., 43% with young children) but (so
far) poor diversity on various other metrics. Mentoring to allow postdocs
to reach their career goals is a focus. Research in the lab ranges from
empirical work on plant taxonomy to quantitative trait evolution models to
models of protein evolution.

Applications should include
1) A cover letter (include expected completion date of PhD, if appropriate,
as well as relevant skills)
2) A CV
3) A short research statement
4) Contact information for two references
5) Link(s) to repositories with examples of code you have written or
attachments including such code.

Experience in programming is strongly preferred; experience in R is
desired, but not required. If in doubt, apply: please do not self-select
yourself out from what might be a mutually beneficial position.

Review of applications will begin on May 1 and continue until filled (start
dates are flexible). Presubmission inquiries are encouraged (I can give you
a copy of the proposal, point you to the code, and answer any questions you
may have).

All qualified applicants will receive equal consideration for employment
and admissions without regard to race, color, national origin, religion,
sex, pregnancy, marital status, sexual orientation, gender identity, age,
physical or mental disability, or covered veteran status.

Eligibility and other terms and conditions of employment benefits at The
University of Tennessee are governed by laws and regulations of the State
of Tennessee, and this non-discrimination statement is intended to be
consistent with those laws and regulations.

In accordance with the requirements of Title VI of the Civil Rights Act of
1964, Title IX of the Education Amendments of 1972, Section 504 of the
Rehabilitation Act of 1973, and the Americans with Disabilities Act of
1990, The University of Tennessee affirmatively states that it does not
discriminate on the basis of race, sex, or disability in its education
programs and activities, and this policy extends to employment by the
University.

Inquiries and charges of violation of Title VI (race, color, national
origin), Title IX (sex), Section 504 (disability), ADA (disability), Age
Discrimination in Employment Act (age), sexualorientation, or veteran
status should be directed to the Office of Equity and Diversity (OED), 1840
Melrose Avenue, Knoxville, TN 37996-3560, telephone (865) 974-2498 (V/TTY
available) or 974-2440. Requests for accommodation of a disability should
be directed to the ADA Coordinator at the Office of Equity and Diversity.

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-pro

Re: [R-sig-phylo] question about analysing trait differences

2013-04-21 Thread Brian O'Meara

The problem is reconstructing overlap down the tree (it's possible to do,
but whether it's possible to do well is another question). One thing you
could do to avoid this is to use the *other* method from Felsenstein's 1985
independent contrasts paper, taking pairs of species independent of other
pairs. A way to do this:

1) Get a clade of size two (aka a "cherry" sensu Semple and Steel). Every
tree has at least one.
2) Difference in overlap and diference in mass for the pair of taxa making
up this clade becomes one point (perhaps standardize for time, and
positivize it such that you subtract them in the order that leaves the mass
difference positive).
3) Prune these two taxa off the tree.
4) Go back to step 1. When you're done, assuming no polytomies, you'll have
zero or one species left. [it's possible to do with polytomies, too: if
assumed to be hard, then just do a pair of taxa from a terminal polytomy
and leave the rest for the next step. If assumed to be soft, then just take
two from a terminal polytomy and discard the remaining taxa]

Contrasts gives you Ntax - 1 contrasts. This gives you floor(Ntax/2) but
avoids having to do reconstructions. I have code around to do this but
can't find it at the moment, but it's not hard to write -- post again if
you need help with it and no better ideas have been proposed.

Hope this helps,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Sun, Apr 21, 2013 at 10:26 AM, Langevelde, Frank van <
frank.vanlangeve...@wur.nl> wrote:

> Dear all
> I would like to test whether there is a relationship between two pairwise
> trait differences, or to be more specific: overlap in geographical ranges
> between pairs of species (as dependent variable in regression) related to
> the body mass differences of the pairs (as independent variable). As I
> expect that these traits are not phylogeneticaly independent, I would like
> to add phylogeny in the analysis. I thought about using independent
> contrasts (as these are also calculated pairwise), but I do not know how to
> analyse. One suggestion would be to use the distance between the pairs of
> species (e.g. distance to the closest shared node in the tree) and include
> these as the second independent variable. Is this possible or does anyone
> has a better suggestion?
>
> Thanks for your help!
>
> Kind regards
>
> Frank van Langevelde
>
> Resource Ecology group
> Wageningen University
> The Netherlands
> http://www.resecol.wur.nl/staff/frank/index_FrankvL.htm
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Estimating ancestral states of a trait under "selection"

2013-04-19 Thread Brian O'Meara

We have a method in R, TreEvo, that can allow you to do things like have a
model where the optimum shifts through time according to some factor (like
external data on climate) or where there are constraints that change
through time: it's up on R-forge, and you can install the package, but it
hasn't been published yet and so could have problems that won't appear
until peers have looked at it more thoroughly. If you're fine waiting to
publish with it until it's been vetted and is in press, feel free to play
with it.

Of course, another thing you could do is throw a prior on the root state
based on your knowledge of the climate then and then do a Bayesian
reconstruction. I don't know of any packages that do this at the moment,
but it wouldn't be hard to write.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Apr 19, 2013 at 4:00 PM, Eliot Miller  wrote:

> Hello all,
>
> I've read some past posts and tried to figure this one out myself. It
> appears that the short answer is if there are not some extinct lineages in
> the tree with known trait values, it's difficult to do.
>
> I have a data set of the current climatic niches of a continental radiation
> of taxa. If I reconstruct the ancestral state of the root using ace() with
> REML, I get a semi-believable answer, though obviously with wide CIs. My
> gut instinct is that the absolute value that is returned is slightly less
> than it should be.
>
> The issue is that I know the climate of the continent has changed radically
> (but at least consistently in one direction!) during the course of the
> radiation of the clade. An educated guess would put the magnitude of the
> change at 1200 to 550 mm rain per year over 60 million years. I'd guess
> that most lineages have felt this aridification as a slow and steady
> "selective" pressure to "figure it out" in increasingly dry conditions
> (there's probably also been a lot of extinction but let's skip that point
> for now).
>
> So, while I don't have extinct lineages with known trait values, I do have
> what I'd be comfortable modeling as a continuous selective pressure
> throughout the course of the radiation. I'm interested to see how doing
> this would change the actual absolute value returned by the ancestral state
> reconstruction. Any tips on how this might be best effected?
>
> Final note, my tree is from uncorrected molecular distances, so it's not
> ultrametric. The analysis I'm trying to run here is tangential to my main
> focus, so I'm ok with "making" the tree ultrametric with something like
> chronopl() just out of curiosity to see what comes out. Someday I'll have a
> time calibrated tree and then I can re-run this analysis.
>
> Best wishes,
> Eliot
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] R Dendrograms for Subclones in a Somatic Cell Population: Time Interval Sampling.

2013-02-28 Thread Brian O'Meara

An old parsimony-based approach to this is known as stratocladistics. There
are no R implementations, as far as I know, but you could wrap phangorn to
do this, I imagine, though it does require writing a new function.
Pseudocode:

#my.taxa.vector is character vector of tips
best.phy <- rtree(length(my.taxa.vector), tip.label=my.taxa.vector)
best.trees <- c(best.phy)
best.score <- parsimony(best.phy, data) + strato(best.phy, times)
for (i in sequence(nsteps)) {
  new.phy <- rSPR(best.trees[[1]])
  new.score <- pars(new.phy, data) + strato(new.phy, times)
  if (new.score == best.score) {
best.trees <- c((new.phy), best.trees)
  }
  if (new.score < best.score) {
best.trees <- c(new.phy)
best.score <- new.score
  }
}

this would do a greedy spr search: you'd want to restart from different
trees and such. The only tricky thing is figuring out the new strato
function: try a set of branch lengths and take as the best score the one
that implies the least amount of stratographic debt (ape's node.height
function could be useful for this; one thing that makes these easy is that
times and node heights must be integers (number of time points from root),
so even searching exhaustively is feasible, though almost
surely unnecessary). Some of these branch lengths could be zero,
indicating, in the case of a terminal branch length, a sample that is a
direct ancestor of another sample. The tree isn't quite rooted but it is
polarized, with nodes sampled further back in time pushed down the tree.
Writing the strato function really wouldn't be that bad to do.

Another approach is to do a likelihood search assuming a clock, but with
tips constrained to occur at time of sampling rather than being coeval.
Heibl and Cusimano's Lagopus package (not on CRAN, go to
http://www.christophheibl.de/mdt/mdtinr.html) calls PAML and multidivtime
to estimate a tree with age constraints. Multidivtime can use constraints
such that the tips are not coeval, but I'm not certain that Lagopus can
pass this information (it can certainly do node constraints, just not sure
about tip constraints). If it can, or could be modified to do so, this
would give you a tree with samples constrained to be at the right times and
with possibly zero length branches for truly ancestral samples. You could
then collapse these using di2multi in ape.

Someone else may know of other ways to attack this problem.

Hope this helps,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Thu, Feb 28, 2013 at 3:21 PM,  wrote:

> Estimating Ongoing Evolution by Repeated Sampling with "Long" Time
> Intervals.
>
> Is there a way to construct dendrograms similar to those used in
> phylogenetics but
> with 2 main differences:
>  (1) Instead of observing at one time, small samples from a very large
> population are taken at regular intervals, so that some observed cells
> could
> easily correspond to an internal node rather than a leaf.
>  (2) There is no obvious outgroup; root should if possible be
> estimated by
> presuming that observations from later time points are on average farther
> from
> root.
>
> More specifically, consider a large, heterogeneous, unstably evolving in
> vitro cell
> culture apparently not subject to a Hayflick limit. In our feasibility
> study, a
> sample of 20 cells were tested at t=0 for about 100 different numerical
> aspects of
> their karyotype (for each cell an ordered vector of 100 numbers is
> measured from
> the genome; the individual numbers all have the same order of magnitude).
>
> About 15 cell generations later the observation is repeated and similarly
> four more
> times for a total of 120 cells over a time span of about 60 cell
> generations. I
> would like to estimate the behavior of the major subclones  Are some
> spinning off
> new karyotypes? Which ones, if any, are in the process of taking over? Are
> some
> being outcompeted? And so on.
>
> Various difference matrices and binary dendrograms with the cells as
> leaves are
> easily constructed and are suggestive. For example at timepoint 5 one
> karyotype
> which was prominent, with lots of duplicates, for timepoints 1-4
> disappears from
> the samples. But the dendrograms themselves dont really use the fact that
> observations were made at six consective times rather than simultaneously;
> and they
> require me to make a guess about where root is. There must be a better way
> to use
> the data. I assume people who work,

Re: [R-sig-phylo] Analysis with Multiple cores on Mac Workstation

2012-12-08 Thread Brian O'Meara

I agree with Daniel that going in parallel is probably overkill in this
case. However, if you do want to get into parallelization with R, a good
place to start is the CRAN task view on high performance and parallel
computing: http://cran.r-project.org/web/views/HighPerformanceComputing.html.
Like all task views, it provides an overview of the packages in that
domain and an easy way to install all of them at once (see
http://cran.r-project.org/web/views/ ). The built-in parallel package
(since R 2.14) makes it easy to use all your cores using mclapply(). If you
tend to think in for loops (as most of us do) rather than apply functions,
the foreach package combined with doMC is another easy way: there's a good
vignette for this at
http://cran.r-project.org/web/packages/doMC/vignettes/gettingstartedMC.pdf.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Sat, Dec 8, 2012 at 7:01 AM, Daniel Barker  wrote:

> Dear Jose,
>
> Is this a problem in practice? The calculation of branch lengths Brian
> describes do not sound very time-consuming to run. If you're dealing with
> an enormous tree or enormous sample, they would take some time - but
> presumably, the greater bottleneck would be obtaining the sample of trees
> in the first place.
>
> Anything can become time-consuming if repeated, e.g. if you're dealing
> with thousands of separate samples of trees. If things are taking too long
> in that situation, the best solution would be to run serial processes, not
> multithreaded - but many of them, submitted to a batch queuing system
> running on the computer. Examples include SLURM, GridEngine, LSF, Unix
> 'batch' (comes as part of OS X but I've never managed to set up the load
> threshold appropriately on OS X), or perhaps Xgrid (I don't know it).
>
> Except in very unusual situations, this kind of 'task farming', if
> properly set up, will always be at least as efficient as 'true'
> parallelisation like multithreading - often faster. It also doesn't
> require any special programming, and parallel programming can be fairly
> painful.
>
> An even simpler approach is to divide the task into n approximately
> equally time-consuming scripts (if you have n CPU cores) and launch them
> all at the same time - which uses all your cores, and doesn't even require
> a batch queuing system.
>
>
> Best wishes,
>
> Daniel
>
> On 08/12/2012 02:43, "José Hidasi"  wrote:
>
> >Dear all,
> >
> >I want to create a consensus tree with branch lengths (Brian O'Meara's
> >function on post "*[R-sig-phylo] Why no branch lengths on consensus
> >trees?*) using
> >a Mac Workstation. However, if I only type the function on it, R will not
> >use all cores for running the analysis. I would like to know if there is a
> >function or any way to divide the analysis within the cores, or to use all
> >cores for running the program.
> >
> >I know this is not the best forum to ask something like that (using
> >multiple cores), but I imagined that someone might have the solution for
> >that as some of you work with large databases.
> >
> >Best,
> >José Hidasi
> >
> >
> >
> >*
> >This is Brian O'Meara's function, that i am using to create the consensus
> >tree, on the post "*[R-sig-phylo] Why no branch lengths on consensus
> >trees?":*
> >
> >"I have a function to create a consensus tree with branch lengths. You
> >feed it a given topology (often a consensus topology, made with ape), then
> >a list of trees, and tell it what you want the branch lengths to
> >represent. It could be the proportion of input trees with that edge (good
> >for summarizing bootstrap or Bayes proportions) or the mean, median, or sd
> >of branch lengths for those trees that have that edge. Consensus branch
> >lengths in units of proportion of matching trees has obvious utility.
> >As Daniel says, the average branch lengths across a set of trees is
> >more difficult to see a use case for, but you could imagine doing
> >something
> >like taking the ratogram output from r8s on a set of trees and summarizing
> >the rate average and rate sd on a given, "best", tree as two sets of
> >branch
> >lengths on that tree.
> >
> >I've put the function source at
> >
> https://r-forge.r-project.org/scm/viewvc.ph

Re: [R-sig-phylo] Why no branch lengths on consensus trees?

2012-11-21 Thread Brian O'Meara

I have a function to create a consensus tree with branch lengths. You feed
it a given topology (often a consensus topology, made with ape), then a
list of trees, and tell it what you want the branch lengths to represent.
It could be the proportion of input trees with that edge (good for
summarizing bootstrap or Bayes proportions) or the mean, median, or sd of
branch lengths for those trees that have that edge. Consensus branch
lengths in units of proportion of matching trees has obvious utility. As
Daniel says, the average branch lengths across a set of trees is more
difficult to see a use case for, but you could imagine doing something like
taking the ratogram output from r8s on a set of trees and summarizing the
rate average and rate sd on a given, "best", tree as two sets of branch
lengths on that tree.

I've put the function source at
https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/consensusBrlen.R?revision=110&root=omearalab.
 You can source the file for the function (consensusBrlen() ) and
other
functions it needs. It also uses phylobase. Note that this is alpha-quality
code -- it's been checked a bit, but verify it's doing what you want.

Here's an example of how to use it

 library(ape)

library(phylobase)

phy.a<-rcoal(15)

phy.b<-phy.a

phy.b$edge.length<-phy.b$edge.length+runif(length(phy.b$edge.length), 0,
0.1)

phy.c<-rcoal(15)

phy.list<-list(phy.a, phy.b, phy.c)

phy.consensus<-consensusBrlen(phy.a, list(phy.a, phy.b, phy.c),
type="mean_brlen")

Best,
Brian

PS: Note that I am actively looking for grad students: info at
http://www.brianomeara.info/lab . Guaranteed five years support, subject to
decent performance.

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Nov 21, 2012 at 11:09 AM, Daniel Barker wrote:

> Dear Scott,
>
> What should branch lengths on a consensus tree represent?
>
> They cannot be expected substitutions per residue. This would imply no
> evolution at points where uncertain branching patterns have been reduced
> to a multi-furcation - which is not what the multi-furcation is meant to
> imply. (Rather: there was evolution, but we aren't very certain about the
> branching pattern.)
>
> But, MrBayes does provide average lengths of some kind.
>
> Best wishes,
>
> Daniel
>
> On 21/11/2012 15:13, "Scott Chamberlain"  wrote:
>
> >When making a consensus tree using ape::consensus the branch lengths are
> >lost. Is there a way to not lose the branch lengths? Or to add them
> >somehow
> >to the consensus tree after making it.
> >
> >library(ape)
> >>
> >>cat("owls(((Strix_aluco:4.2,Asio_otus:4.1):4.1,Athene_noctua:7.3):6.3,Tyt
> >>o_alba:13.5);", file = "ex1.tre", sep = "\n")>
> >>cat("owls(((Strix_aluco:1.2,Asio_otus:4.5):3.1,Athene_noctua:7.3):6.3,Tyt
> >>o_alba:13.5);", file = "ex2.tre", sep = "\n")>
> >>cat("owls(((Strix_aluco:3.2,Asio_otus:4.7):8.1,Athene_noctua:7.3):6.3,Tyt
> >>o_alba:13.5);", file = "ex3.tre", sep = "\n")> tree1 <-
> >>read.tree("ex1.tre")> tree2 <- read.tree("ex2.tre")> tree3 <-
> >>read.tree("ex3.tre")> trees <- c(tree1, tree2, tree3)> trees_con <-
> >>consensus(trees)> trees_con
> >Phylogenetic tree with 4 tips and 3 internal nodes.
> >Tip labels:[1] "Strix_aluco"   "Asio_otus" "Athene_noctua" "Tyto_alba"
> >Rooted; no branch lengths.
> >
> >
> >Thanks, Scott Chamberlain
> >
> >   [[alternative HTML version deleted]]
> >
> >___
> >R-sig-phylo mailing list
> >R-sig-phylo@r-project.org
> >https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >
>
>
> --
> Daniel Barker
> http://bio.st-andrews.ac.uk/staff/db60.htm
> The University of St Andrews is a charity registered in Scotland : No
> SC013532
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] estimating mutation rates

2012-11-08 Thread Brian O'Meara

If you want the branch-specific substitution rate, you could estimate the
branch lengths on a topology constrained to match the chronogram, then
divide each of the molecular tree's branch lengths by the corresponding
chronogram branch lengths to get these rates. If you want a global rate,
you might constrain the molecular tree to have the same proportional branch
lengths as the chronogram. Look at pml() in phangorn for calculating the
likelihoods.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Thu, Nov 8, 2012 at 12:49 PM, john d  wrote:

> Hi there,
>
> I have a well-supported phylogeny (with branch lengths proportional to
> absolute time) and I'm interested in estimating the mutation rate of
> an independently-obtained molecular dataset. Any suggestions?
>
> thanks!
>
> John
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] variation in rates over time, unexpected message when using Brownie.lite

2012-09-18 Thread Brian O'Meara

That's a fairly small tree to be doing this with -- an old rule of thumb (I
think from Pagel) is 20 tips per parameter. Common issues are zero length
(or effectively zero length) branch lengths (what is min(phy$edge.length)?)
and polytomies (methodologically not a problem, but many packages don't
deal well with these).

Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Sep 18, 2012 at 4:08 PM, Agus Camacho wrote:

> Dear all, I tried to test something similar to Jason, but relating the
> change in rates to the acquisition of a trait, instead to a specific date.
>
> For that, I used a phylogeny with ten taxa, without outgroup.
> I used exactly the same script gently proposed by Liam but got the
> following error message:
> "Error in solve.default(model1$hessian) :Lapack routine dgesv: system is
> exactly singular"
>
> I've been looking through the internet but was not able to identify what
> causes this error. Any hint about that?
>
> BTW, anybody knows how to directly answer a message found in the digest of
> the mail list, without leaving the thread?
>
> Below, I pasted the complete thread dealing with this topic.
>
> Cheers,
> Agus
>
> Message: 9
> Date: Tue, 18 Sep 2012 00:13:00 -0400
> From: "Brian O'Meara" 
> To: Jason S 
> Cc: "R-sig-phylo@r-project.org" 
> Subject: Re: [R-sig-phylo] variation in rates over time
> Message-ID:
> <
> cakywhkq-jjroqzdvfutphi4xopxam+aaacu64nldb3inxoz...@mail.gmail.com>
> Content-Type: text/plain
>
> I agree with the suggestions so far. I just wanted to point out a few more
> alternatives:
>
> You could use the geiger package to estimate the best scaling for the
> tworatetree transformation to do this (should be equivalent to the earlier
> solutions, though it would require running optimization).
>
> You could also use the OUwie package with a tree that has been given simmap
> mappings using phytools. The advantage of this is that you could evaluate
> Brownian models but you could also look at various Ornstein-Uhlenbeck
> models (though note that while there is information about different
> Brownian rates before and after a time slice, info about different alphas
> (strength of pull parameter) and thetas (the attractor, aka "optimal
> value") is rapidly (but not immediately) lost under an OU process).
>
> For completeness, especially for citations, note that O'Meara et al. (2006)
> and Thomas et al. (2006) independently arrived at essentially the same
> method, so it is worth reading both papers.
>
> Best,
> Brian
>
> ___
> Brian O'Meara
> Assistant Professor
> Dept. of Ecology & Evolutionary Biology
> U. of Tennessee, Knoxville
> http://www.brianomeara.info
>
> Students wanted: Applications due Dec. 15, annually
> Postdoc collaborators wanted: Check NIMBioS' website
> Calendar: http://www.brianomeara.info/calendars/omeara
>
>
> On Mon, Sep 17, 2012 at 9:11 PM, Jason S  wrote:
>
> >
> >
> > Thanks, guys. That's exactly what I needed.
> >
> >
> > 
> >  From: Liam J. Revell 
> > To: Matt Pennell 
> >
> > -project.org>
> > Sent: Monday, September 17, 2012 9:22 PM
> > Subject: Re: [R-sig-phylo] variation in rates over time
> >
> > Hi Jason. Matt is absolutely correct. You can do this with phytools.
> > Say, for instance, you have an ultrametric phylogeny with branches in
> > millions of years (tree) and data vector containing the trait values for
> > species (x) and you want to test the hypothesis that the last 3.4 my has
> > a different rate of evolution than the rest of the tree, you could do
> > this as follows:
> >
> > library(phytools) # load phytools
> > tree<-make.era.map(tree,c(0,max(nodeHeights(tree))-3.4))
> > plotSimmap(tree,pts=F,lwd=3) # visualize
> > fit<-brownie.lite(tree,x) # fit model
> >
> > That's it. Good luck. Liam
> >
> > Liam J. Revell, Assistant Professor of Biology
> > University of Massachusetts Boston
> > web: http://faculty.umb.edu/liam.revell/
> > email: liam.rev...@umb.edu
> > blog: http://phytools.blogspot.com
> >
> > On 9/17/2012 7:46 PM, Matt Pennell wrote:
> > >
> >  Jason,
> > >
> > > I think the best way to do this is with

Re: [R-sig-phylo] variation in rates over time

2012-09-17 Thread Brian O'Meara

I agree with the suggestions so far. I just wanted to point out a few more
alternatives:

You could use the geiger package to estimate the best scaling for the
tworatetree transformation to do this (should be equivalent to the earlier
solutions, though it would require running optimization).

You could also use the OUwie package with a tree that has been given simmap
mappings using phytools. The advantage of this is that you could evaluate
Brownian models but you could also look at various Ornstein-Uhlenbeck
models (though note that while there is information about different
Brownian rates before and after a time slice, info about different alphas
(strength of pull parameter) and thetas (the attractor, aka "optimal
value") is rapidly (but not immediately) lost under an OU process).

For completeness, especially for citations, note that O'Meara et al. (2006)
and Thomas et al. (2006) independently arrived at essentially the same
method, so it is worth reading both papers.

Best,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Mon, Sep 17, 2012 at 9:11 PM, Jason S  wrote:

>
>
> Thanks, guys. That's exactly what I needed.
>
>
> 
>  From: Liam J. Revell 
> To: Matt Pennell 
>
> -project.org>
> Sent: Monday, September 17, 2012 9:22 PM
> Subject: Re: [R-sig-phylo] variation in rates over time
>
> Hi Jason. Matt is absolutely correct. You can do this with phytools.
> Say, for instance, you have an ultrametric phylogeny with branches in
> millions of years (tree) and data vector containing the trait values for
> species (x) and you want to test the hypothesis that the last 3.4 my has
> a different rate of evolution than the rest of the tree, you could do
> this as follows:
>
> library(phytools) # load phytools
> tree<-make.era.map(tree,c(0,max(nodeHeights(tree))-3.4))
> plotSimmap(tree,pts=F,lwd=3) # visualize
> fit<-brownie.lite(tree,x) # fit model
>
> That's it. Good luck. Liam
>
> Liam J. Revell, Assistant Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://phytools.blogspot.com
>
> On 9/17/2012 7:46 PM, Matt Pennell wrote:
> >
>  Jason,
> >
> > I think the best way to do this is with the approach of O'Meara et al.
> 2006
> > Evolution "Brownie".
> >
> > Liam Revell has implemented this in R in his package phytools. You can
> > modify the steps taken in this tutorial here
> >
> http://phytools.blogspot.com/2011/07/running-brownielite-for-arbitrarily.html
> > perhaps
> > in conjunction with the function make.era.map()
> >
> http://phytools.blogspot.com/2011/10/new-function-to-plot-eras-on-tree.html
> > though
> > I admittedly have not tried this myself.
> >
> > Perhaps Liam or someone else has a better explanation but I hope this is
> at
> > least somewhat helpful.
> >
> > cheers,
> > matt
> >
>
> >
> >>
> >>
> >>
>  Hello,
> >>
> >> I see that there are several interesting alternatives to test for
> >> different rates among clades. However, I was wondering if there is a
> method
> >> to test for varying rates over time. I'm aware of Pagel's delta and the
> EB
> >> model, but I was thinking more in terms of testing if there is a
> different
> >> rate for the entire tree after a specified point in time. For instance,
> if
> >> a snail predator colonizes an island 3.4 Mya, is there evidence for an
> >> increased rate of evolution in the prey after that point in time?
> Something
> >> like two lambdas, one for before and one for after that point in time.
> >>
> >> Thanks!
> >>
> >> Jason
> >>  [[alternative HTML version deleted]]
> >>
> >> ___
> >> R-sig-phylo mailing list
> >> R-sig-phylo@r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >>
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list
> > R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

[R-sig-phylo] R resource on cluster

2012-08-30 Thread Brian O'Meara

I'm not sure if people know about this yet, so I thought I'd point it
out. iPlant Discovery Environment [1] is a web GUI that hosts various
R functions and other phylogenetics programs: ape, picante, geiger,
ouch, plus raxml, phylip, phyml, and others. Note that with the R
packages, it only exposes a few main functions, like ace() for ape,
rather than all the functions. It is similar to the cipres web portal
in that you can choose what to run and it sends jobs off to a
supercomputer, but there isn't much overlap in the programs. It can be
nice for running a large number of jobs, especially slow ones you
don't want on your desktop. It requires a login but is free. iPlant is
sponsored by NSF for building a cyberinfrastructure for plants, but it
appears to be open to all organisms. New tools can be added by users
[2], so you can add your favorite but slow package.

Right now, the interface is a GUI, so it doesn't feel like R, despite
having some R in the back end. Barb Banbury [3], a postdoc in my lab,
is working on a dedicated R interface to iPlant tools using their API
-- stay tuned for more news on that.

Best,
Brian

[1]: http://www.iplantcollaborative.org/discover/discovery-environment
[2]:
https://pods.iplantcollaborative.org/wiki/display/DEmanual/Integrating+an+Application+into+the+Discovery+Environment
[3]: http://barbbanbury.info

___________
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] diversitree with multicore

2012-03-21 Thread Brian O'Meara

The multicore package has been rolled into the built-in R package parallel
in R >=2.14.0, with slight changes. There's a task view on parallelization
that might help:
http://cran.r-project.org/web/views/HighPerformanceComputing.html . I've
found the foreach package helpful for parallelization, too. It's especially
good if you still think in terms of for loops (as I do, sadly, though I'm
working on it).

Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Mar 21, 2012 at 11:58 AM, Rafael Rubio de Casas wrote:

> Dear all,
> I have been trying to come up with easy ways to split diversitree runs
> among processors when using multiple trees. It seems that the multicore
> package would be a good way of doing it, but I have failed to use
> multicore with diversitree functions. Essentially, I cannot figure out
> how to tell multicore that the function takes two set of objects, one
> containing the likelihood functions and another one containing the
> parameters. Does anybody have experience and/or advice doing this? Any
> guidance would be most welcome.
> Other alternatives for splitting diversitree jobs among processors would
> also be highly appreciated.
> Thanks in advance,
> Rafa
> --
> National Evolutionary Synthesis Center
> *NESCent <http://www.nescent.org/>*
> 2024 W. Main Street, Suite A200
> Durham, NC27705
> r...@nescent.org <mailto:r...@duke.edu>
> 919.668.9107
>
>[[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Estimating MRCA dates

2012-02-02 Thread Brian O'Meara

Ape can do this, function chronopl. In general, one good way to find out
which package can do something is by looking at the task view for
phylogenetics: http://cran.r-project.org/web/views/Phylogenetics.html .
This will also allow you to install all the phylogenetics packages (at
least those on CRAN) with just a couple of commands. An updated version
should be coming out later today (and if someone notices other things
missing, please let me know).

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?

On Thu, Feb 2, 2012 at 8:45 AM, Sean Downey  wrote:

> Greetings!
>
> I'm new to the list, thanks to Emanuel for bringing it to my attention. I
> have a general question for the group. Several years ago I used r8s for
> estimating mrca dates for a tree using an internal node date as a prior. I
> believe BEAST also does this. But is there an R package for this kind of
> relaxed-clock rate smoothing? I'm beginning a new project and want to make
> sure I'm using up-to-date methods.
>
> Thanks,
> Sean
>
> --
> Sean S. Downey
> Institute of Archaeology
> University College London
> 31-34 Gordon Square
> London
> WC1H 0PY
> s.dow...@ucl.ac.uk <mailto:s.dow...@ucl.ac.uk>
> http://www.homepages.ucl.ac.**uk/~tcrnssd/<http://www.homepages.ucl.ac.uk/~tcrnssd/><
> http://www.homepages.ucl.ac.**uk/%7Etcrnssd/<http://www.homepages.ucl.ac.uk/%7Etcrnssd/>
> >
> UK Cell: +44 (0)7980 284 574
>
> __**_
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/**listinfo/r-sig-phylo<https://stat.ethz.ch/mailman/listinfo/r-sig-phylo>
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Convert instantaneous transition rate to time.

2011-11-22 Thread Brian O'Meara

Another way (influenced by some of Elchanan Mossel's work, though it also
relates to Cecile's idea of looking for saturation) is to look at
information about state at the root. If the data are very informative about
it, one state will have most of the relative likelihood. If the rate * time
is high, there is less information about the state at the root, and the
proportion of relative likelihoods for different states will converge to
the equilibrium state frequencies (assuming a time-symmetric model). For
example, for a short tree for DNA data, the ratio of relative likelihoods
for A, T, G, and C at the root might be c(0.99, 0.01, 0, 0) [most weight on
A] but on a longer tree with less information the reconstruction might be
c(0.27, 0.25, 0.24, 0.24). If the equilibrium frequency is c(0.25, 0.25,
0.25, 0.25), this suggests the second tree has less signal about the root
state than the first.

What I'm not sure about is how to actually measure this distance. Let the
relative likelihoods (normalized to sum to one) of the reconstructed root
state be the vector root.empirical, and let the equilibrium frequencies be
root.equilibrium. You could do:

1) Shannon_entropy( root.empirical ): the flatter the distribution, the
more entropy, the less we know about the root state. The thing I worry
about here is what if the equilibrium frequency isn't flat? Then a
root.empirical that is far from root.equilibrium (indicating some
information) but is flat is judged as less informative than a
root.empirical that exactly equals the root.equilibrium vector, which
doesn't seem right. [note that "Shannon_entropy()" is pseudocode, though
I'm sure somewhere on CRAN there's a function of this sort]

2) Euclidean distance between root.empirical and root.equilibrium: but does
this put too much weight on what is happening at suboptimal states?

3) Distance between max(root.empirical) and
root.equilibrium[which.max(root.empirical)]: but then something that should
be maximally informative (max(root.empirical)==1) might not be depending on
root.equilibrium. I.e., if root.equilibrium=c(0.4, 0.1, 0.25, 0.25), the
"information" about the root from root.empirical.1=c(1,0,0,0) is 1 - 0.4 =
0.6, whereas the "information" from root.empirical.2=c(0, 1, 0, 0) is 1 -
0.1 = 0.9, even though for each the weight for the root state is as strong
as can be.

Perhaps someone else has an idea about a better way to measure this. In
practice, I bet all these measures are pretty highly correlated and so
which one is used might not have much practical importance, though this
would be worth investigating if actual conclusions depend on the results.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?
Calendar: http://www.brianomeara.info/calendars/omeara

On Tue, Nov 22, 2011 at 1:09 PM, Cecile Ane  wrote:

> Hi Simon,
>
> One option could be to look at the expected number of substitutions over
> time t, and find the smallest time t for which you expect at least 1
> substitution. The idea here is that the first substitution is the one that
> most disrupts the signal.
> Technically, if Q is your rate matrix and p is the stationary distribution
> of states associated with Q, then the average number of substitutions over
> time t is t * r where r is the average jumping rate:
> r = sum over all states p(i) Q(i,i) in absolute value.
> So the time to the first substitution is 1/r.
>
> Cecile.
>
>
> On 11/22/11 09:01, Simon Greenhill wrote:
>
>> Hi all,
>>
>> What is the best way to convert an instantaneous transition rate (such as
>> that given by geiger's `fitDiscrete` method) into a measure of stability
>> over time?
>>
>> So, I have a set of traits with a small number of states. I want to fit
>> these onto a set of trees with branches proportional to substitutions. The
>> instantaneous transition rate for these traits give me the relative
>> stability, but I want to be able to quantify this over time e.g. what
>> proportion of signal is remaining after time t? how could I represent this
>> as a decay curve of signal over time?
>>
>> Any ideas/hints would be much appreciated!
>>
>> Simon
>>
>
> __**_
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/**listinfo/r-sig-phylo<https://stat.ethz.ch/mailman/listinfo/r-sig-phylo>
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Unexpected behavior in birth death simulations

2011-05-18 Thread Brian O'Meara

As a general approach, when this sort of thing happens, I'll modify and load
a copy of the function that has been changed to produce a lot of debugging
info (lots of dput and print statements, for example). Then what happens at
each simulation step to identify where the problem occurs. [Better
developers would use R's built-in debugging functions like browser(),
debug(), and trace()].

Brian


_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?


On Wed, May 18, 2011 at 3:59 PM, Carl Boettiger  wrote:

> Hi Tanja,list
>
> Yes, everything is going extinct, but as you say, that shouldn't happen
> when
> mu = 0.
> >
> sim.bd.age(age=2,numbsim=3,lambda=2,mu=0,frac,mrca=FALSE,complete=FALSE)
> [[1]]
> [1] 0
>
> [[2]]
> [1] 0
>
> [[3]]
> [1] 0
>
> >
> It's a rather malicious error in that, as you point out, it's a possible
> outcome when mu=0, so at first it appears nothing is wrong, but it happens
> every time that everything is extinct even for mu << lambda, or mu=0.
>
> It's curious that this same problem effects both TreeSim and Geiger, and
> that this problem appears isolated to certain architectures of computers.
> My guess is that it may be related to 32 bit architecture and the way ape
> is
> handling the tree, but really not sure how to debug this one.
>
> -Carl
>
>
> On Wed, May 18, 2011 at 12:26 PM, Stadler Tanja
> wrote:
>
> >
> > Hi Carl,
> >
> > the TreeSim return (arrays of 0s) could be because TreeSim returns 0 when
> > the tree goes extinct:
> >
> > treearray   Array of numbsim trees with the time since origin / most
> > recent common ancestor being age. If tree goes extinct (only possible
> when
> > mrca = FALSE), returns 0. If only one extant and no extinct tips, returns
> 1.
> >
> > Maybe set  seed and compare runs with complete=FALSE and complete=TRUE?
> >
> > Cheers Tanja
> >
> >
> >
> >
> > I get a similar problem when running the equivalent command in TreeSim:
> >
> >
> >  age<-2
> >lambda <- 2.0
> >mu <- 0.5
> >frac <-0.6
> >numbsim<-3
> >
> >sim.bd.age(age,numbsim,lambda,mu,frac,mrca=FALSE,complete=FALSE)
> > [[1]]
> > [1] 0
> >
> > [[2]]
> > [1] 0
> >
> > [[3]]
> > [1] 0
> >
> >
> >
> > --
> > Tanja Stadler (nee Gernhard)
> >
> > ETH Zürich
> > Institute of Integrative Biology
> > Universitätsstrasse 16
> > 8092 Zürich
> >
> > eMail: tanja.stad...@env.ethz.ch<mailto:tanja.stad...@env.ethz.ch>
> > Phone: +41 44 632 45 48
> > Office: CHN H76.1
> >
> > http://www.tb.ethz.ch/people/tstadler
> >
> >
>
>
> --
> Carl Boettiger
> UC Davis
> http://www.carlboettiger.info/
>
>[[alternative HTML version deleted]]
>
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] reading nexus file from treebase?

2011-03-25 Thread Brian O'Meara

TITLE and LINK are used by Mesquite but not many other Nexus-reading
programs (TITLE appears as a future reserved command name in the Maddison et
al. (1997) nexus specification but I don't see LINK there).

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?


On Fri, Mar 25, 2011 at 11:28 AM, Hilmar Lapp  wrote:

> Thanks for tracking this down, Emmanuel. I'm forwarding this to the
> TreeBASE developers list for consideration.
>
> BTW the phylobase implementation uses NCL (NEXUS Class Library, in C++) for
> parsing NEXUS. Given the difference, probably ape does not?
>
>-hilmar
>
>
> On Mar 25, 2011, at 12:00 AM, Emmanuel Paradis wrote:
>
>  Scott,
>>
>> readNexus (phylobase) can read your tree but not read.nexus (ape). The
>> problem is the two lines inserted within the TREES block:
>>
>> BEGIN TREES;
>> TITLE Tb10793; <<<<<<
>> LINK TAXA = M4787; <<<<<<
>> TREE Fig._3c = [&R]
>>
>> If you delete them, it's OK. FigTree also cannot read this file.
>>
>> Cheers,
>>
>> Emmanuel
>>
>> François Michonneau wrote on 25/03/2011 00:35:
>>
>>> Hi Scott,
>>>  Which version of phylobase are you using and which architecture? I can
>>> read the file on my machine.
>>>  Cheers,
>>>  -- FranÃ§ois
>>> On Thu, Mar 24, 2011 at 12:39, Scott Chamberlain <
>>> myrmecocys...@gmail.com>wrote:
>>>
>>>> Hello,
>>>>
>>>> I can't get read.nexus (ape) or readNexus (phylobase) to read nexus
>>>> files
>>>> downloaded from treebase with URLs parsed from xml files. I can't
>>>> manually
>>>> edit each file as I want to read a lot of these files. Is there an easy
>>>> fix?
>>>> One of the files is copied below.
>>>>
>>>> Thanks!
>>>> Scott Chamberlain
>>>> Rice University, EEB Dept.
>>>>
>>>>
>>>>
>>>>
>>>> #NEXUS
>>>>
>>>> [!This data set was downloaded from TreeBASE, a relational database of
>>>> phylogenetic knowledge. TreeBASE has been supported by the NSF, Harvard
>>>> University, Yale University, SDSC and UC Davis. Please do not remove
>>>> this
>>>> acknowledgment from the Nexus file.
>>>>
>>>>
>>>> Downloaded on March 24, 2011; 16:32 GMT
>>>>
>>>> TreeBASE (cc) 1994-2008
>>>>
>>>> Study reference:
>>>> Brown R., & Yang Z. 2010. Bayesian Dating of Shallow Phylogenies with a
>>>> Relaxed Clock.
>>>> Systematic Biology, 59(2): 119-131.
>>>>
>>>> TreeBASE Study URI:
>>>> http://purl.org/phylo/treebase/phylows/study/TB2:S10165]
>>>>
>>>> BEGIN TAXA;
>>>> TITLE M4787;
>>>> DIMENSIONS NTAX=16;
>>>> TAXLABELS
>>>> Chalcides_coeruleopunctatus_E2806.20
>>>> Chalcides_coeruleopunctatus_E2806.22
>>>> Chalcides_manueli_E2506.1
>>>> Chalcides_mionecton_mionecton_E2506.10
>>>> Chalcides_mionecton_mionecton_E2506.12
>>>> Chalcides_mionecton_trifasciatus_E2506.18
>>>> Chalcides_polylepis_E14124.1
>>>> Chalcides_polylepis_E14124.2
>>>> Chalcides_polylepis_E2506.21
>>>> Chalcides_sexlineatus_bistriatus_E2806.6
>>>> Chalcides_sexlineatus_sexlineatus_E2806.8
>>>> Chalcides_simonyi_E3007.2
>>>> Chalcides_sphenopsiformis_E8121.26
>>>> Chalcides_sphenopsiformis_E8121.27
>>>> Chalcides_viridanus_E2806.10
>>>> Chalcides_viridanus_E2806.14
>>>> ;
>>>> END;
>>>>
>>>> BEGIN TREES;
>>>> TITLE Tb10793;
>>>> LINK TAXA = M4787;
>>>> TREE Fig._3c = [&R]
>>>>
>>>> ((Chalcides_sphenopsiformis_E8121.26,Chalcides_sphenopsiformis_E8121.27),(((Chalcides_viridanus_E2806.10,Chalcides_viridanus_E2806.14),((Chalcides_sexlineatus_bistriatus_E2806.6,Chalcides_sexlineatus_sexlineatus_E2806.8),(Chalcides_coeruleopunctatus_E2806.22,Chalcides_coeruleopunctatus_E2806.20))),(Chalcides_simonyi_E3007.2,((Chalcides_mionecton_trifa

[R-sig-phylo] Make all trees

2011-02-16 Thread Brian O'Meara

I'm trying to find a function that generates all rooted trees, both
bifurcating and multifurcating, for a set of tips. Klaus Schliep's package
phangorn has the allTrees fn, which generates all the bifurcating trees
(rooted or unrooted as an option), but not all the multifurcating trees.
Does anyone know where to find the more general function? I can imagine
several ways to write such a function, but reuse is better than reinvention.


Thanks,
Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] when not to use an ultrametric tree?

2011-02-10 Thread Brian O'Meara

If your question were about rates in terms of changes per unit time, then a
tree with branches proportional to time (which would be ultrametric for
coeval species) would be fairly important. But if your question were about
changes per unit of genetic change (or something else, like per generation,
per speciation event, etc.) then a tree with branch lengths in those units
would be most appropriate, I think.

One thought is that if you have multiple gene expression levels (expression
for gene 1 in all the species, expression for gene 2, etc.) you might be
able to use more complex models like the OU, if you are willing to assume
that the gene expression levels are all evolving under the same process
(i.e., that all the genes in species A are being pulled towards an
expression level twice that of the other species). It's like models of
nucleotide substitution: you wouldn't want to fit a GTR model if you had one
site for four species, but if you thought the same GTR model applied to all
of a million sites across four species, there is probably plenty of
information to estimate all the parameters (assuming reasonable amounts of
variation between species). Brownie can do this sort of simultaneous
character analysis for discrete trait models but not continuous ones yet,
but perhaps some other program can link continuous characters in this way
(or someone wants to modify Brownie?).

Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?

On Thu, Feb 10, 2011 at 10:42 PM,  wrote:

>
> Hi All,
>
> I have what I hope is not too basic a question.
>
> Under what circumstances is it OK to not use an ultrametric tree for
> comparative analyses?
>
> I'm going to compare the fit of 1- and 2-rate brownian motion models to
> gene expression data on 4 species for which I have a good but not
> ultrametric tree.  The set of genes that I'll analyze have been called as
> differentially-expressed from the fit of a standard linear model.  The
> question is whether the differential expression levels in these 4 species
> are well-explained by the genetic distance (i.e. branch lengths) between
> the species OR whether there might be evidence for something else going on
> (pressumably selection).  Due to the low number of species in our dataset I
> won't attempt to fit OU models, despite these being more conceptually
> appropriate for the question at hand.
>
> I think the framing of our question makes it acceptable to use a
> non-ultrametric tree (and Brownie, I think, can use such a tree), but I'm
> somewhat unsure as to whether or not this would be proper.
>
> Any thoughts?
> Thanks!
>
> Dan Fulop.
> Postdoctoral Scholar
> UC Davis, Maloof Lab
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Ultrametricize?

2011-02-09 Thread Brian O'Meara

Sorry for how this sounds, but using data is probably the best way if your
end goal is to do phylogenetically-controlled analyses (rather than just the
focused goal on minimizing branch length differences between the two trees):
download sequences and use them to infer branch lengths, perhaps with your
fixed topology. There are various ways to do this in R (phangorn and ape
come to mind).

Another option is to use info from http://timetree.org/ : for a pair of
species, it can give you their divergence time based on published estimates.
I don't know of anything that automates this process. [Note: it would be
fairly easy to write an R script to do this given TimeTree's RESTful
interface, but it probably violates TimeTree's terms of use, so I would talk
to that site's authors first. It'd be darn useful, if it is allowed.]

Finally, you can use some approaches that are supposed to create an
ultrametric tree while minimizing changes in evolutionary rate over the tree
(which will have the effect of tending to minimize changes in relative
branch lengths, though it is not exactly the same as doing this -- if the
way to have most similar branch lengths is to have evolutionary rates
changing radically on a few branches, something like penalized likelihood
will instead prefer a tree that implies rates changing more gradually across
several branches). See chronopl and chronoMPL in ape for such
implementations.

Hope this helps,
Brian

On Wed, Feb 9, 2011 at 2:16 PM, Scott Chamberlain
wrote:

> Dear R users,
>
>
> I am curious the best way to ultrametricize a tree to retain the relative
> branch lengths among species. For example, in the below code I the table
> pairwisedists shows that the branch length distances differ between the
> non-ultrametric tree and the ultrametric tree. That is, distances between
> species A and B and A and C are the same in one tree, but in the other tree
> they are different. I want to use pairwise distances among each species pair
> for phylogenetically controlled analyses, but I am not sure which method is
> best for ultrametricizing trees.
>
>
>
> library(ape)
> library(reshape2)
>
> tree <- read.tree(text =
> "((Apis_mellifera:2,(Aedes_aegypti:1,(Drosophila_melanogaster:1,Oxyna_parietina:1):0.5):0.5):1,(((Homo_sapiens:2.5,(Lepus_europaeus:1,Mus_musculus:1):2):0.25,(((Gazella_dama:1,Gazella_dorcas:1):0.5,Odocoileus_virginanus:1):0.5,Equus_caballus:1):1):0.25,((Gambusia_holbrooki:1,Hoplosthethus_atlanticus:1):1,(((Oncorhynchus_mykiss:2,Oncorhynchus_clarki:2,Oncorhynchus_gorbuscha:2,Oncorhynchus_keta:2):1,Salvelinus_fontinalis:1):1,(Salmo_trutta:1,Salmo_salar:1):1):1):1):1):1;")
>
> tree2 <- compute.brlen(tree, method = "Grafen")
>
> # I used the 64 bit version, but in the 32 bit version the by column names
> are Var1 and Var2 for some reason
> pairwisedists <- merge(melt(cophenetic(tree)), melt(cophenetic(tree2)), by
> = c("X1", "X2"))
>
>
>
>
> Sincerely,
>
> Scott Chamberlain
> Rice University, EEB Dept.
>
>
>[[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Model-Selection vs. Finding Models that "Fit Well"

2011-01-20 Thread Brian O'Meara

I think considering model adequacy is something that would be useful to do
and is not done much now. One general way to do this is to simulate under
your chosen model and see if the real data "look" very different from the
simulated data. For example, I might try a one rate vs. a two rate Brownian
motion model and find the latter fits better. If the actual true model is an
OU model with two very distinct peaks and strong selection, which is not in
my model set, I'll find that my simulated data under the two rate Brownian
model may look very different from my actual data, which will be fairly
bimodal. Aha, my model is inadequate. [but then what -- keep adding new
models, just report that your model is inadequate, ...?]

Of course, you need a method for evaluating how similar the data "look".
There's been some work on this in models for tree inference using posterior
predictive performance (work by Jonathan Bollback and Jeremy Brown come to
mind) or using other approaches (some of Peter Waddell's work), but it
hasn't really taken off yet. It'd be easy to implement such approaches in R
for comparative methods given capabilities in ape, Geiger, and other
packages.

Brian

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?

On Thu, Jan 20, 2011 at 12:27 PM, David Bapst  wrote:

> Hello all,
> I'd like to pose a question to this group, as a bit of topical
> discussion. I apologize in advance if I should mangle a concept.
>
> In many model-based PCMs and some other analyses (such as paleoTS), we
> fit models to data by finding the ML estimates of the parameters
> associated, calculate the maximum support of each model and than
> compare between models with differing parameters using an information
> criterion (AICc being probably the most used). Akaike weights can be
> calculated if we want to consider the relative fit between our models.
> This is contrary to traditional statistics, where alternative
> hypotheses are tested against some null hypothesis. Obviously, the
> later approach has proven to be thorny because rejecting some null
> hypotheses are very difficult (such as a random walk) and some
> situations truly lack a clear null model.
>
> Recently, I have heard the opinion expressed from workers of disparate
> fields (philosophy, ecology, etc.) that model-choosing methods may
> choose the best model, but with no idea of whether any of the models
> considered "fit well" to the data or not. In other words, we may have
> fit models A-D, and the best model may have been model C, but none of
> the models compared could describe the 'true' process underlying the
> data at all.
>
> This view gives me mixed feelings. Certainly, if we are using a
> model-selection approach, we should attempt the range of models that
> make sense for our data, and should particularly include that set of
> simplistic models that we may accept as the most observed process
> (Brownian motion and Ornstein-Uhlenbeck with one optima, perhaps, in
> analyses of trait evolution). Of course, we cannot include models that
> we haven't even considered or are analytically intractable. That's a
> fundamental limitation of science, however, not model-selection based
> analyses.
>
> This counter-argument did not seem to satisfy the others, who still
> wanted a measure of absolute fit, "like an R-squared". Now, perhaps
> I'm confused, but isn't R-squared technically a relative measure of
> fit between a linear model and a random scatter? I suppose the maximum
> support for a model is a measure of absolute fit, but it's not useful
> or interpretable unless I'm comparing it to the support for some other
> model.
>
> So, it seems like the desire for a measure of absolute fit is not
> well-founded, but maybe I'm wrong. Is there something more we can do
> to show how that the models we've picked aren't arbitrary? Opinions?
> -Dave Bapst, UChicago Geosci
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

[R-sig-phylo] Chambers award

2010-09-14 Thread Brian O'Meara

Many of the major contributions to R for phylogenetics have come from grad
students. Such folks may be interested in applying for this award (note that
"student" includes those who have gotten degrees in the past few years):

http://stat-computing.org/awards/jmc/announcement.html

The Statistical Computing Section of the American Statistical Association
announces the competition for the John M. Chambers Statistical Software
Award. In 1998 the Association for Computing Machinery presented its
Software System Award to John Chambers for the design and development of S.
Dr. Chambers generously donated his award to the Statistical Computing
Section to endow an annual prize for statistical software written by, or in
collaboration with, an undergraduate or graduate student. The prize carries
with it a cash award of $1000, plus a substantial allowance for travel to
the annual Joint Statistical Meetings (JSM) where the award will be
presented.

Teams of up to 3 people can participate in the competition, with the cash
award being split among team members. The travel allowance will be given to
just one individual in the team, who will be presented the award at JSM. To
be eligible, the team must have designed and implemented a piece of
statistical software. The individual within the team indicated to receive
the travel allowance must have begun the development while a student, and
must either currently be a student, or have completed all requirements for
her/his last degree after January 1, 2006. To apply for the award, teams
must provide the following materials:

To be eligible, the team must have designed and implemented a piece of
statistical software. The individual within the team indicated to receive
the travel allowance must have begun the development while a student, and
must either currently be a student, or have completed all requirements for
her/his last degree after January 1, 2008. To apply for the award, teams
must provide the following materials:

   - Current CV's of all team members.
   - A letter from a faculty mentor at the academic institution of the
   individual indicated to receive the travel award. The letter should confirm
   that the individual had substantial participation in the development of the
   software, certify her/his student status when the software began to be
   developed (and either the current student status or the date of degree
   completion), and briefly discuss the importance of the software to
   statistical practice.
   - A brief, one to two page description of the software, summarizing what
   it does, how it does it, and why it is an important contribution. If the
   team member competing for the travel allowance has continued developing the
   software after finishing her/his studies, the description should indicate
   what was developed when the individual was a student and what has been added
   since.
   - Access to the software by the award committee for their use on inputs
   of their choosing. Access to the software can consist of an executable file,
   Web-based access, macro code, or other appropriate form. Access should be
   accompanied by enough information to allow the judges to effectively use and
   evaluate the software. This information can be provided in a variety of
   ways, including but not limited to a user manual (paper or electronic), a
   paper, a URL, online help to the system, and source code. In particular, the
   team must be prepared to provide complete source code for inspection by the
   committee if requested.

All materials must be in English. We prefer that electronic text be
submitted in Postscript or PDF. The entries will be judged on a variety of
dimensions, including the importance and relevance for statistical practice
of the tasks performed by the software, ease of use, clarity of description,
elegance and availability for use by the statistical community. Preference
will be given to those entries that are grounded in software design rather
than calculation. The decision of the award committee is final.

All application materials must be received by 5:00pm EST, Monday, February
22, 2010 at the address below. The winner will be announced in May and the
award will be given at the 2010 Joint Statistical Meetings.

Chambers Software Award
c/o Fei Chen
Avaya Labs
233 Mt Airy Rd.
Basking Ridge, NJ 07920
f...@avaya.com

_______
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Fst - DNA sequence data

2010-08-24 Thread Brian O'Meara

You should look at the CRAN task view for genetics:
http://cran.r-project.org/web/views/Genetics.html . I haven't used R
much in that domain, but there seems to be a lot of functionality.
Also look into Bioconductor.

Best,
Brian

On Mon, Aug 23, 2010 at 8:18 AM, blue jay  wrote:
> Hi, (sorry for cross posting!)
>
> I want to analyse DNA sequence data (mtDNA) in R as in calculate Fst,
> Heterozygosity and such summary statistics. Package Adagenet converts the
> DNA sequence into retaining only retaining the polymorphic sites and then
> calcuates Fst.. but is there any other way to do this? I mean analyse the
> DNA sequence as it is.. and calculate the statistics?
>
>
> Thanks!
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list
> R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>



-- 
___
Brian O'Meara
Assistant Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

[R-sig-phylo] Phylogenetics task view

2010-07-12 Thread Brian O'Meara

Hi, all. Please let me know (off-list, usually) if you have any  
suggestions for the current phylogenetics task view -- and thanks to  
all who have made suggestions in the past. For those who don't know  
about task views, they're short overviews of all the packages  
regarding a particular subject in R, arranged by topic. Besides  
helping people find useful packages, they also help with the  
installation of packages: typing

install.packages("ctv")
library("ctv")
install.views("Phylogenetics")

will install ape, geiger, picante, and nearly 20 additional  
phylogenetics packages automatically (and you can update them all  
later using update.views("Phylogenetics") ). This can be especially  
helpful when you have a class of students who all need to install  
several packages.

Note to package developers: packages that aren't in CRAN but are  
available can (and should) be described in a task view, but they can't  
be installed as part of the install.views() feature -- another  
incentive to put things on CRAN.

Thanks,
Brian


------
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville



--
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Difference in alpha value estimates between geiger and OUCH - data

2010-04-20 Thread Brian O'Meara

At times when I've seen this sort of issue (not in this particular  
case, but optimization in general) I've tried many starting points and  
plotted lines connecting each starting point with each ending point  
(plotting only two parameters at a time). Ideally, wherever you start  
in parameter space you get to the same end point, and this allows you  
to visually inspect for islands sensitive to starting position (if all  
starts converge to a few end points) or issues of flat surfaces (for  
example, if you can't estimate alpha and sigma^2 separately, all end  
points might fall on a line with a fixed alpha/sigma^2 ratio). You  
could use color or height to look at likelihoods of each solution.


You might try embedding the hansen function in a loop (or  
vectorization) starting with random starting points (or points on a  
grid) and take the replicate with the best likelihood. This won't help  
you if the issue is non-identifiability of the parameters (the star  
case Carl described), but it may help if the issue is a problem in the  
search not finding the true optimum (unless you get into numerical  
precision issues).


Best,
Brian

On Apr 20, 2010, at 2:10 PM, Carl Boettiger wrote:


Hi Alejandro,

I think you've raised an excellent question and I'd also be  
interested in
the thoughts of others on this.  I believe the short answer is that  
you
can't; your data may lack the information to estimate alpha and  
sigma^2

independently and you should therefore only attempt to estimate their
ratio.  For an extreme example, image a star phylogeny.  On such a  
tree the
OU model is described entirely by two parameters, mean (theta) and  
variance

(sigma^2/2 alpha), so there's no way to estimate all three parameters.

Geiger and ouch take different approaches to optimization.  Though  
both use
the optim package function Geiger uses the L-BFGS-B method which  
requires
bounds on parameters.  OUCH implements Nelder-Meade by default,  
which takes
intitial conditions rather than bounds.  OUCH can use any of the  
methods

provided by optim(), including L-BFGS-B, so you could specify: out <-
hansen(data, tree, regimes, sqrt.alpha, sigma, method="L-BFGS-B").   
Wasn't
obvious to me how OUCH generates the bounds in that case.  Geiger  
tries a
couple different bounds, some in the near-brownian limit and some in  
the
strong selection limit.  I think the takehome message is to be  
suspicious of

any result that seems sensitive to the numerical method / starting
conditions.  Sorry if this didn't make sense, I'm sure the package
developers and others can shed more light on this.

Cheers,
Carl



On Tue, Apr 20, 2010 at 10:40 AM, Alejandro Gonzalez <
alejandro.gonza...@ebd.csic.es> wrote:


Hello,

Sorry for not sending in any data to replicate the problem I posted  
in the
list. I now include as attachment a small database and phylogenetic  
tree in

Nexus format.
Carl Boettiger pointed out correctly that the cases in which I  
found a

large difference between alpha estimates probably also had very large
sigma^2 estimates. That is what was going on, compared with the  
sigma^2 of

the cases where the difference between alpha estimates was low, the
"problematic" cases did present high sigma^2 values. As he also  
suggested
I've modified the starting values of sqrt.alpha and sigma in the  
hansen
function. Low starting values do result in estimates which are  
closer to
those obtained with geiger. Is there any rule of thumb regarding  
how the
starting values should be modified? How do these modifications  
affect the

estimate of alpha and sigma^2?

Thank you again for the assistance.


Best wishes


Alejandro



__

Alejandro Gonzalez Voyer

Post-doc

NEW ADDRESS

Estación Biológica de Doñana
Consejo Superior de Investigaciones Científicas (CSIC)
Av Américo Vespucio s/n
41092 Sevilla
Spain

Tel: + 34 - 954 466700, ext 1749

E-mail: alejandro.gonza...@ebd.csic.es



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo





--
Carl Boettiger
Population Biology, UC Davis
http://two.ucdavis.edu/~cboettig

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Different estimates of alpha with geiger and OUCH

2010-04-20 Thread Brian O'Meara

Hi, Alejandro. It's typically easier to identify solutions if you have  
a small dataset that replicates the issue which people can access  
(post in the text of the email, host on a website, etc.). You could  
always modify the species names if you're worried about getting  
scooped, though I imagine that's unlikely.


Best,
Brian

On Apr 20, 2010, at 11:39 AM, Alejandro Gonzalez wrote:


Hello,

I have a dataset with 7 traits for which I want to estimate alpha  
values. I compared value estimates with the fitContinuous function  
in geiger with those obtained with the hansen function in OUCH. For  
some traits the values are similar, not identical but quite close,  
however for other traits I get very high estimates of alpha in OUCH,  
while alpha values for those same traits estimated in geiger are  
generally low. I put some examples of the results below. The traits  
have been log-transformed, and I have measures for males and  
females. I'm not sure it is an issue with the traits used as these  
are not very different one from the other, they are all measures of  
morphological structures in 34 different cichlid species. Any ideas  
as to why the estimates would vary so much? Could it be a problem  
with the likelihood surface or convergence parameters? If it has to  
do with the likelihood surface, why does it appear to have a strong  
effect in the OUCH estimate and less so in geiger?



Alpha estimate with geiger  alpha 
estimate with OUCH

Minor differences:

Trait 1 2.44
2.80
Trait 2 1.98
3.80

Large differences:

Trait 1 1.55
111.82
Trait 2 0.5 
108.14



Cheers,


Alejandro


__

Alejandro Gonzalez Voyer

Post-doc

NEW ADDRESS

Estación Biológica de Doñana
Consejo Superior de Investigaciones Científicas (CSIC)
Av Américo Vespucio s/n
41092 Sevilla
Spain

Tel: + 34 - 954 466700, ext 1749

E-mail: alejandro.gonza...@ebd.csic.es


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


----------
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Plot paleontological data with epoch names

2010-04-05 Thread Brian O'Meara

You might look into paleoPhylo (article: http://www.bioone.org/doi/abs/10.1666/0094-8373-35.3.460 
 package: http://r-forge.r-project.org/projects/paleophylo/ ): it has  
drawing and other tools targeted for paleontologists in general,  
including putting timelines on plots.


Best,
Brian

On Apr 5, 2010, at 12:54 AM, Andy Rominger wrote:


Hello,

I'm sorry this isn't really a phylogenetics question, but I think  
this list

might have more paleo subscribers than others.  Sorry to those of you
completely uninterested!

When working with data on, for example, genus diversity from the  
Cambrian to
modern, I'd like to produce a plot that has not only time across the  
x-axis,
but also the epoch names/"symbols"...e.g. Cm   O   S   D   ... etc.   
I've
seen some papers with figures like this that look like they were  
made in R
(like Alroy et al's paper from 2008 in science).  So before diving  
in and
writing my own function, I thought I'd see if anyone might now of an  
already
existing plotting function.  Thanks again, and sorry again to those  
pure

phylogeneticists.

Thanks,
Andy

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


----------
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] estimate likelihood of data for given model and parameters

2010-03-19 Thread Brian O'Meara

I have code for this in my TreEvo package (on R-forge -- see  
brownieneglnL in )  though note that the package itself is not all cleaned up into  
install.packages()-ready form. You could also dig into geiger's  
fitContinuous function and use that code (look for "foo" in  
fitContinuous.R) -- OUCH must have similar code. Odd if there's  
nothing more direct.


Best,
Brian

On Mar 19, 2010, at 9:47 PM, Carl Boettiger wrote:


Dear r-sig-phylo,

I'd like to be able to estimate the likelihood of a model with a given
Brownian rate parameter under a given data set.  For OU models, this  
can be
accomplished through the ouch package with the flag fit=FALSE.   
However
there does not seem to be a corresponding option for Brownian motion  
in
ouch/geiger/ape algorithms for fitting a Brownian motion model?   
Perhaps

I've overlooked something?

-Carl

--
Carl Boettiger
Population Biology, UC Davis
http://two.ucdavis.edu/~cboettig

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



Brian C. O'Meara
Asst. Prof., Dept. of Ecology and Evolutionary Biology
University of Tennessee, Knoxville
http://www.brianomeara.info
865-408-TREE (8733)

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

[R-sig-phylo] plot.phylo leftwards in ape

2010-03-18 Thread Brian O'Meara

I'm trying to plot two trees opposite each other using ape  
(essentially as in Fig. 4.18 in Paradis' Analysis of Phylogenetics and  
Evolution with R, with some modifications). I was having trouble  
seeing the rootmost line on the plot on the right, so decided to try  
using x.lim as an argument for plot.phylo. Apparently, this doesn't  
work properly with direction="leftwards" (it seems to override the  
direction, so it plots it rightwards with bad label placement). A  
quick glance through the code didn't suggest to me how to fix it. This  
will reproduce the behavior:

library(ape)
data(bird.orders)
plot.phylo(bird.orders,direction="r") #looks okay
plot.phylo(bird.orders,direction="r", x.lim=38) #still okay
plot.phylo(bird.orders,direction="l") #still okay, points left
plot.phylo(bird.orders,direction="l", x.lim=38) #problem

It's possible the solution to my original potting issue lies elsewhere  
(I'll keep poking around at it), but I wanted to point out the  
potential x.lim+direction=l bug in any case. I'm using ape 2.5.

Thanks,
Brian


--
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] apTreeshape

2010-03-11 Thread Brian O'Meara

You might consider putting it on R-Forge (<https://r-forge.r-project.org/ 
>).  Phylobase, task views, vegan, and many other packages are there,  
and it makes it 1) easy to work together on code (subversion  
repository) 2) easy to for users to install your package while it's in  
development (install.packages("packagename",repos="http://R-Forge.R-project.org 
") ), even for people not comfortable doing an install from source, 3)  
easy to track bugs and requests, and 4) easy to put back on CRAN.


Best,
Brian

On Mar 10, 2010, at 2:41 PM, michael.b...@imag.fr wrote:


Hi all,
I am one of the contributor to the apTreeshape package.

I know the apTreeshape package is currently moribund, and has been  
removed to the cran archive. Can anyone confirm its permanent  
decease?


I hope not, we are currently working for his resurrection.


Are there any efforts to move some of its functions into other  
packages - in particular the implementations of the Delta shift  
test and Aldous' Branching topology models?


For the moment, you can download the package at
http://cran.r-project.org/src/contrib/Archive/apTreeshape/


Hope this helps,

Feel free to send me an email if you still experience problems with  
the package.

Michael Blum


--
--
Michael BLUM
Chargé de recherche (CR2 CNRS) (Research Associate)
Tel: +33 (0)4 56 52 00 65
Fax: +33 (0)4 56 52 00 55
michael.b...@imag.fr
http://membres-timc.imag.fr/Michael.Blum/
--




All the best,

Rich





==
Rich Grenyer, Ph.D.

Research Fellow in Biodiversity Informatics
Centre For Population Biology & Division of Biology
Imperial College London at Silwood Park
Ascot, Berkshire
SL5 7PY
United Kingdom

e: gren...@imperial.ac.uk
t: +44 (0)20 7594 2474
w: http://www.imperial.ac.uk/people/grenyer

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


----------
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

[R-sig-phylo] Phylogenetics task view

2010-03-04 Thread Brian O'Meara

There's now a phylogenetics task view available on CRAN -- basically,  
it's a brief description of what methods are available in R and in  
which packages. It was originally created as part of the NESCent- 
sponsored comparative methods in R hackathon but has recently been  
updated. You can direct suggestions for improvements to me. Thanks to  
all who helped build it in the past and to Ben Bolker and Achim  
Zeileis for helping to get it on CRAN.

Best,
Brian


------
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Bayesian dating in R vs. Multidivtime

2010-01-26 Thread Brian O'Meara

Emmanuel, you should check Lagopus (<http://www.christophheibl.de/mdt/mdtinr.html 
>), which seems to connect R with multidivtime (based on  
documentation -- I haven't used it).


Best,
Brian


On Jan 26, 2010, at 11:19 AM, Emmanuel Paradis wrote:


Dear all,

I'm planning to do some molecular dating using Bayesian methods, ie,  
what people usually do with Multidivtime. After reading the doc of  
Multidivtime, I'm thinking to do my own implementation in R since it  
is straightforward to specify priors, define and run the Markov  
chain, and calculate posteriors. I have two questions:


1. Has someone attempted a similar implementation?

2. Has someone written an interface between R and Multidivtime?

Many thanks in advance for any reply.

Best,

Emmanuel
--
Emmanuel Paradis
IRD, Montpellier, France
 ph: +33 (0)4 67 16 64 47
fax: +33 (0)4 67 16 64 40
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


----------
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology & Evolutionary Biology
U. of Tennessee, Knoxville

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Re: [R-sig-phylo] Phylogenetic reference trees

2009-08-20 Thread Brian O'Meara

There's not yet a good single source. TreeBase.org has some
topologies, but not all studies go in there yet. The largest one I
know of there has 1569 taxa . The PhyLoTA browser () has trees for many groups and also allows you to download aligned
data to make your own. You can download NCBI's taxonomy, which is a
sort of tree, though not highly-resolved or with branch lengths, but
at least pretty comprehensive. You can also download the tree used by
the Tree of Life project (). There may be a tree available from ,
but I haven't been able to find it in a form other than a pdf image.

Having a place to download the single best current estimate of the
full tree of life would be great, and there are a few research groups
working towards this goal, but it's not ready yet, as far as I know.

Hope this helps,
Brian

Brian C. O'Meara
Asst. Prof., Dept. of Ecology and Evolutionary Biology
University of Tennessee, Knoxville
http://www.brianomeara.info
865-408-TREE (8733)

On Aug 20, 2009, at 7:31 AM, saikari keitele wrote:

Hi,

I'm trying to use the picante r package.

My aim is to use it to construct phylogenetic trees and calculate
diversity
statistics from species occurrence data in different geographic
regions.

So, I have a list of species names with abundance information
(number of

occurrences) and would like to use Phylomatic to construct a tree.

The Phylomatic documentation mentions that a reference tree is
needed and

different examples of megatrees (
http://svn.phylodiversity.net/tot/megatrees/) are given for this.
However,
they apply only (I think) to trees and plants. My data includes
other kinds
of organisms. Do you know of any other downloadable reference trees
that

include the whole taxonomy?

Thank you very much.

Saikari

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

1 2 >

1 - 100 of 110 matches

Mail list logo