Re: [R-sig-phylo] collapsing sets of nodes based on label values

2020-05-22 Thread Brian O'Meara
You might need to reindex each time, something like (pseudocode):

bad.nodes <- function_to_get_node_numbers_with_support_less_than_90(phy)
while(length(bad.nodes>0)) {
   phy <- collapse.to.star(phy, bad.nodes[1])
   bad.nodes <- function_to_get_node_numbers_with_support_less_than_90(phy)
}

But this will collapse the bad node and all its descendants. I don't see a
function that just collapses at a node and leaves descendants intact, but
this may be what you're looking for (the stuff that ape::collapse.singles
does is similar, and could perhaps be modified).

One pedantic point -- we often talk about node support, collapsing nodes,
etc., but it's really about the branch: the node doesn't have 83% bootstrap
support but the bipartition created by the edge subtending it does. For
example, if the tree is (A,(B,(C,(D,E, if the support for the CDE clade
is 83%, it's actually the proportion of trees that have a split between AB
| CDE, so the branch separating the CDE clade from everything else. The
node itself there has one branch coming in (that comes from the rest of the
tree containing A and B) and two branches coming out (one going to C, one
going to DE). We could summarize trees by actual node support: how many
times we have a node that has an edge going to AB, an edge going to C, and
an edge going to DE, but we typically don't. [For example, the branch could
separate AB | CDE, and the node is AB | C | DE, but one could also have a
node that is AB | CD | E for the same bipartition on the subtending edge].
I only bring it up here because if you end up having to write code to do
collapsing, it can be important to remember that what is actually being
collapsed is the edge subtending the labeled node because it is the the
edge with low support.

Best,
Brian

___
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Fri, May 22, 2020 at 9:42 AM Karla Shikev  wrote:

> Dear all,
>
> I'm surprised I could not find a simple function to do this in any of the
> packages I know. I just want to take a tree with node labels (e.g.
> bootstrap values from a RAxML analysis) and collapse all nodes with support
> values below, say, 90%. I tried to do this recursively using
> collapse.to.star in phytools, but node numbers change after the first
> collapsed node and my indexing gets completely off.
>
> with best regards,
>
> Karla
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Alllowing and considering "missing data" as values for reconstruction of ancestral character states

2020-04-14 Thread Brian O'Meara
We corresponded with Carolina offline, but here I'm including a generalized
solution of the quick solution we just sent her in case it's useful for
others (avoiding the issue lampooned in https://xkcd.com/979/).

The data was a delimited file with a species column and columns for each
character, with column labels "species" and then the character numbers. So,
loaded into R after calling the relevant libraries:

library(ape)
library(corHMM)
library(geiger)

phy <- read.nexus("cladeb_tree.nex")
data <- read.delim("cladeb_sexch.txt", stringsAsFactors = FALSE)

corHMM can handle polymorphic data, but it needs to be in format 0&3, not
0+3, which was how it was coded initially. So change that:

for (i in 2:ncol(data)) {
  data[,i] <- gsub('\\+', '&', data[,i])
}

Then reconstruct ancestral states and show the output in a pdf. Loop over
each character, deleting taxa from the matrix that have an NA and deleting
taxa from the matrix and from the tree that aren't on both:

trait_indices <- c(2:ncol(data))
pdf(file="CarolinaPlots.pdf")
for (i in seq_along(trait_indices)) {
  col_num <- trait_indices[i]
  data_local <- data[,c(1,col_num)]
  data_local <- data_local[!is.na(data_local[,2]),]
  data_local <- data_local[data_local$species%in%phy$tip.label,]
  bad_tips <- phy$tip.label[!phy$tip.label%in%data_local$species]
  phy_local <- phy
  if(length(bad_tips)>0) {
phy_local <- ape::drop.tip(phy, bad_tips)
  }
  recon <- rayDISC(phy_local, data_local, model="ARD")
  plotRECON(recon$phy,recon$states,title=paste0(colnames(data)[col_num], ":
", paste(names(table(data_local[,2])), "=", table(data_local[,2]),
collapse=", ")))
}
dev.off()

Hopefully a few general tips there. Slightly better style would be to
preface most functions with the package name: use corHMM::rayDISC() rather
than rayDISC -- this is becoming more expected in packages.

Best,
Brian

___
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Mon, Apr 13, 2020 at 4:18 PM Carolina Santos Vieira <
carolsantosvie...@gmail.com> wrote:

> Hi Liam and everyone
>
> The numbers from this message "1000 trees with a mapped discrete character
> with states: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
> 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55" are the same numbers from
> the nodes on my tree. As I replied to Brian, I assumed that this was a list
> of which states were mapped during make.simmap simulation. This warning
> message only returns when applying the ARD model.
> This character in particular is binary, and has no missing data or
> non-applicable coding.
>
> As I mentioned previously, a big part of my dataset contains polymorphic
> and non-applicable coding, and characters with 3 states (0,1,2). For those,
> when the script runs, the likelihoods are very high, and plots don't agree
> with coding.
>
> Thank you so much for your replies.
> Best,
> Carolina Vieira
> Bióloga - Mestre em Ecologia e Conservação
> Doutoranda em Biologia Animal (PPGBAN - UFRGS)
>
>
>
> <https://www.avast.com/sig-email?utm_medium=email_source=link_campaign=sig-email_content=webmail>
>  Livre
> de vírus. www.avast.com
> <https://www.avast.com/sig-email?utm_medium=email_source=link_campaign=sig-email_content=webmail>.
>
> <#m_2399085183096779712_m_1993609250326834374_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>
> Em seg., 13 de abr. de 2020 às 16:57, Liam J. Revell 
> escreveu:
>
>> Hi Carolina.
>>
>> I agree with Brian that a Q matrix in which the elements of a row are
>> all zeros is not likely to be you problem. It just means that the ML
>> estimated transition rate from 1->0, in your case, is zero.
>>
>> I agree with Brian that the message "1000 trees with a mapped discrete
>> character with states: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
>> 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55" is
>> surprising. Does your discrete character really have 27 states?
>>
>> All the best, Liam
>>
>> Liam J. Revell
>> Associate Professor, University of Massachusetts Boston
>> Profesor Asistente, Universidad Católica de la Ssma Concepción
>> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>>
>> Academic Director UMass Boston Chile Abroad:
>> https://www.umb.edu/academics/caps/international/biology_chile
>>
>> On 4/13/2020 3:23 PM, Brian O'Meara wrote:
>> > **[EXTERNAL SENDER]
>> >
>> > With raydisc, you need one column with th

Re: [R-sig-phylo] Alllowing and considering "missing data" as values for reconstruction of ancestral character states

2020-04-13 Thread Brian O'Meara
With raydisc, you need one column with the species name and a column with
the data; so something like data[,c(1,2)] might be needed. If you want, you
could email the files (tree, data, script to run) to Jeremy (
jmbea...@uark.edu) and me to check out.

For the simmap results, a rate of zero can be the estimate if it seems that
there are no transitions from state 0 to 1. But the number of discrete
states at the end ("1000 trees with a mapped discrete character with
states: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53, 54, 55") seems unexpected to me.

Best,
Brian


_______
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Mon, Apr 13, 2020 at 2:32 PM Carolina Santos Vieira <
carolsantosvie...@gmail.com> wrote:

> Hi Liam and all users
>
> My morphological data varies from binary and multisate characters,
> including polymorphism and non-applicable coding for many of those
> characters. I've made a lot of progress after Liam's advices, but lately I
> have been getting an error from make.simmap.
>
> By trying to run a simulation with make.simmap under ARD model on
> characters with states that the probability is zero on occuring in a given
> terminal,  after calling summary(), this error is displayed:
>
> Error in colMeans(x$count) : 'x' must be an array of at least two
> dimensions
>
> When changing the input of the same data in a two column array, the error
> still persists. The same data was analyzed with make.simmap under "ER" and
> "SYM" models, and they run just fine. I am also comparing the results using
> fitDiscrete (geiger) and ace (ape) functions and they display results for
> "ARD" model without a glitch. According to output from fitDiscrete
> function, ARD model is indicated as the "best" model comparing AIC
> likelihoods.
> I've noticed that in this particular character, the "Q" matrix extracted
> from make.simmap simmulation under "ARD" is the following:
>
> Q =
>   01
> 0 -1.916782 1.916782
> 1  0.00 0.00
>
> Are those "zero" probabilities the cause of the error? And by that I mean,
> should I take from this evidence that "ARD" is not an appropriate model to
> this character?
>
> Also, results from characters with "non-applicable" coding (NA in matrix)
> for some terminal are coming out very different from what I would expect.
> Is this coding allowed for ancestral reconstruction?
> I am sorry for the many questions, and I would be very glad if any
> suggestions come to mind.
>
> Best wishes,
> Carolina Vieira.
>
> -
> This is the error rundown:
>
> > anc.tree_ard<-make.simmap(tree,ch434,model="ARD")
> make.simmap is sampling character histories conditioned on the transition
> matrix
>
> Q =
>   01
> 0 -1.916782 1.916782
> 1  0.00 0.00
> (estimated using likelihood);
> and (mean) root node prior probabilities
> pi =
>   0   1
> 0.5 0.5
> Done.
> > plot(anc.tree_ard,node.numbers=T,cores,fsize=0.5,ftype="i")
> > add.simmap.legend(colors=cores,prompt=FALSE,x=0.5*par()$usr[1],
> +   y=-30*par()$usr[3],fsize=0.8)
> > title(main="ACSR \"ARD\" model (make.simmap)")
> > anc.tree_ard<-make.simmap(tree,ch434,model="ARD",nsim=1000)
> make.simmap is sampling character histories conditioned on the transition
> matrix
>
> Q =
>   01
> 0 -1.916782 1.916782
> 1  0.00 0.00
> (estimated using likelihood);
> and (mean) root node prior probabilities
> pi =
>   0   1
> 0.5 0.5
> Done.
> > anc.tree_ard
> 1000 phylogenetic trees with mapped discrete characters
> > simmap_ard<-summary(anc.tree_ard)
> > simmap_ard
> 1000 trees with a mapped discrete character with states:
>  29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
> 47, 48, 49, 50, 51, 52, 53, 54, 55
>
> Error in colMeans(x$count) :  'x' must be an array of at least two
> dimensions
>
>
>
> Carolina Vieira
> Bióloga - Mestre em Ecologia e Conservação
> Doutoranda em Biologia Animal (PPGBAN - UFRGS)
>
>
> <
> https://www.avast.com/sig-email?utm_medium=email_source=link_campaign=sig-email_content=webmail
> >
> Livre
> de vírus. www.avast.com
> <
> https://www.avast.com/sig-email?utm_medium=email_source=link_campaign=sig-email_content=webmail
> >.
> <#D

Re: [R-sig-phylo] ancestral state transitions clustered in time?

2020-04-09 Thread Brian O'Meara
You can use something like corHMM (https://doi.org/10.1093/sysbio/syt034)
and perhaps its rayDISC function to find parts of the tree where the
transition rates between states are different (for your question, where
rates might be higher). Basically, it allows a hidden "trait" (which isn't
necessarily a single trait, just some thing or things that change on the
tree) to affect the transition rates. A good example is
https://academic.oup.com/view-large/figure/115233499/syt034f3.jpeg -- you
can see where transition rates are higher. The nice thing (though note my
conflict of interest, as a coauthor) is that it allows the rates to
actually change over the tree; different approaches that use a single rate
matrix over the whole tree and then reconstruct changes (using things such
as stochastic character mapping) are essentially using a model to infer
things in order to find deviations from a model. One downside of the corHMM
model is that the hidden changes evolve under a continuous time Markov
model -- fine if you think whatever factors affect character rate
(biogeographic changes, presence of certain other species, new
morphological traits, etc.) evolve in this way (which is true for nearly
all of them) but not if a single factor appears suddenly (i.e., radical
change in environment for all species once an asteroid his the planet).
CorHMM might have a timeslice function to allow this, but I'm not sure.

If you have a lot of characters, a different approach could be to estimate
branch lengths under something like a Lewis MKV model, which basically
gives you amount of change averaged across all characters, and then compare
the ratio of those branch lengths to the chronogram branch lengths for
those same edges. Those branches with higher than the average ratio have
more character change per unit of time than other branches. There's
probably a paper or papers that do this, but I don't recall any at the
moment, my apologies.

If the question is about two rate regimes (i.e., before or after a massive
event) or gradual change of rates, you could do tree transformations: use
Geiger's tree transformation function with discrete characters to find the
transformations that best fit the data -- maybe more change earlier than
later, a discrete shift in rate 30 MY ago, etc.

Some standard caveats on these and other potential solutions: ancestral
state reconstruction is very hard to do well: N tips have N-1 ancestral
states to estimate, plus rate parameters. That's a lot, which is why
methods that fit overall rates could be better than reconstructing many
changes. But it's possible to over-interpret these, too: you can in theory
from corHMM get an estimate of the support for every hidden state and rates
for every node, but I wouldn't read a lot into every single node. Also,
there's still the challenge of Maddison & FitzJohn (2015):
https://doi.org/10.1093/sysbio/syu070. It still scares me, personally, and
the issues raised affect our interpretation of many methods that look at
rates, such as the ones here.

Best,
Brian


_______
Brian O'Meara

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Thu, Apr 9, 2020 at 2:55 PM Elizabeth Miller  wrote:

> I am interested in testing whether ancestral state changes (e.g. habitat
> transitions) are clustered in time and/or in specific lineages rather than
> randomly distributed on phylogeny. Does a method already exist for testing
> this? I am aware of BayesTraits for testing correlation among states, but
> not a method for clustered distribution of state changes with time. What
> immediately comes to mind is simulating an expected temporal distribution
> of state changes if changes were random and comparing to observed temporal
> distribution, but I just want to check if something more sophisticated
> exists.
>
> Thank you!
>
> --
> Elizabeth Miller, Ph.D.
> NSF Postdoctoral Research Fellow
> School of Aquatic and Fishery Sciences
> University of Washington
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Phylogenetic varying slopes and intercepts model

2019-09-11 Thread Brian O'Meara
Also worth noting is that people often use taxonomic levels if they don't
have a tree. Trees are more available than might be expected (though still
far less available than they should be), so you can get a tree and use
phylogenetic comparative methods.

Sources of trees:

https://tree.opentreeoflife.org/
https://www.treebase.org/
https://phylo.cs.nmsu.edu/
http://datelife.org/ (though the datelife R package
<https://github.com/phylotastic/datelife> may be more useful, depending on
scale)
https://datadryad.org/

Plus people make individual trees available: large trees for birds, sharks
and rays, plants, etc. that may not be deposited into a formal repository
but are on a website, plus trees stored as supplemental info in papers. So
if you're using taxonomy as a predictor because there doesn't seem to be a
good tree, there might actually be one with a bit of digging.

Best,
Brian

___
Brian O'Meara, http://brianomeara.info, especially Calendar
<http://brianomeara.info/calendar.html>, CV
<http://brianomeara.info/cv.html>, and Feedback
<http://brianomeara.info/feedback.html>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Wed, Sep 11, 2019 at 1:33 AM Joe Felsenstein 
wrote:

> Oscar Inostraza --
>
> Is the variable  x also evolving on the tree?  If so you need to use
> standard phylogenetically-informed comparative methods to estimate the
> variances and covariances of changes in both characters.
>
> You may not be able to assume that  y   responds instantly to  x.
>
> J.F.
> 
> Joe Felsenstein, j...@gs.washington.edu
> Department of Genome Sciences, University of Washington, Box 355065,
> Seattle, WA. 98195-5065 USA
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] model averaging using brownie.lite

2019-09-05 Thread Brian O'Meara
 I bet the first
tree has more meaningful data: the pectinate tree with those branch lengths
will likely have all but one of the taxa having nearly identical states. So
I think tree shape and branch lengths should matter for this. I've done
some preliminary analyses on this, building on Beaulieu et al. (2018) and
Jhwueng et al. (2014,  https://doi.org/10.1515/sagmb-2013-0048, also note
COI), but nothing definitive yet.

It's also worth looking at Ho and Ané (2014,
https://doi.org/10./2041-210X.12285) who talk about AIC in the context
of OU shifts, but who get into sample size with shifts in a modified BIC
that uses taxa in different regimes as sample size (but again, univariate,
so maybe it's actually matrix size).

I also probably am missing important work by others -- my apologies if so.
If you know of any, please let me know (and probably Karla, too!).

So, in summary yeah, what Liam said: number of taxa, but it might be
more complex.

Best,
Brian

___
Brian O'Meara, http://brianomeara.info, especially Calendar
<http://brianomeara.info/calendar.html>, CV
<http://brianomeara.info/cv.html>, and Feedback
<http://brianomeara.info/feedback.html>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
He/Him/His



On Thu, Sep 5, 2019 at 10:00 AM Liam Revell  wrote:

> Dear Karla.
>
> In my opinion, it is probably correct to use the number of tips on the
> tree as the sample size for AICc when estimating the Brownian rate: as
> the number of independent pieces of information is n-1, just like with
> an ordinary variance. For other parameters in phylogenetic comparative
> analyses, the effective sample size may be different, however.
>
> All the best, Liam
>
> Liam J. Revell
> Associate Professor, University of Massachusetts Boston
> Profesor Asistente, Universidad Católica de la Ssma Concepción
> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>
> Academic Director UMass Boston Chile Abroad (starting 2019):
> https://www.umb.edu/academics/caps/international/biology_chile
>
> On 9/5/2019 9:49 AM, Karla Shikev wrote:
> > [EXTERNAL SENDER]
> >
> > Thanks so much, Liam! Just one quick follow-up question: what do you
> > suggest should be the sample size for transforming AIC into AICc? the
> > number of tips on the tree?
> >
> > Karla
> >
> > On Thu, Sep 5, 2019 at 10:27 AM Liam Revell  wrote:
> >
> >> Dear Karla.
> >>
> >> You could try & create your own logLik method for the object class
> >> "brownie.lite" as follows:
> >>
> >> ## method
> >> logLik.brownie.lite<-function(object,...){
> >>  lik<-setNames(
> >>  c(object$logL1,object$logL.multiple),
> >>  c("single-rate","multi-rate"))
> >>  attr(lik,"df")<-c(object$k1,object$k2)
> >>  lik
> >> }
> >> ## fit model
> >> fit<-brownie.lite(tree,x)
> >> ## use it
> >> logLik(fit)
> >> AIC(fit)
> >>
> >> All the best, Liam
> >>
> >> Liam J. Revell
> >> Associate Professor, University of Massachusetts Boston
> >> Profesor Asistente, Universidad Católica de la Ssma Concepción
> >> web: http://faculty.umb.edu/liam.revell/,
> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.phytools.orgdata=02%7C01%7Cliam.revell%40umb.edu%7C04607945a9f74968c14e08d73207f67f%7Cb97188711ee94425953c1ace1373eb38%7C0%7C0%7C637032882107478464sdata=ofsem4h4SNk6g6QFUwD%2BJKO3TsTArNfH9%2BAyYDEjCvY%3Dreserved=0
> >>
> >> Academic Director UMass Boston Chile Abroad (starting 2019):
> >> https://www.umb.edu/academics/caps/international/biology_chile
> >>
> >> On 9/5/2019 9:13 AM, Karla Shikev wrote:
> >>> [EXTERNAL SENDER]
> >>>
> >>> Dear all,
> >>>
> >>> I've been trying to use brownie.lite to implement the tutorial
> available
> >>> here (
> >>
> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Ftreethinkers.org%2Ftutorials%2Fmorphological-evolution-in-r%2Fdata=02%7C01%7Cliam.revell%40umb.edu%7C04607945a9f74968c14e08d73207f67f%7Cb97188711ee94425953c1ace1373eb38%7C0%7C0%7C637032882107478464sdata=64k6WMtazzmyn0SLRrx2wEA%2F2wkk3%2B%2F3dBS0HtjlUT8%3Dreserved=0
> )
> >> to
> >>> calculate model-averaged rates of evolution and for model selection (1
> >>> versus 2 rates). However, the current version of phytools 0.6-99 won't
> 

Re: [R-sig-phylo] model averaging for discrete character evolution

2019-08-07 Thread Brian O'Meara
I think this is ok in theory: say you have 10% of the weight on the ER
model, 90% on the ARD model, so your states at the nodes are based 10% on
the probabilities from ER, 90% from ARD. [Worth noting, of course, that
with N tips with one character each, estimating N-1 ancestral states, plus
the rates, plus the model weights, is asking a bit much of the data -- it
can be worth asking if there's another way to answer the biological
question]. The risk of model averaging is if some of the models are truly
terrible: you might get extreme parameter values because there's not enough
info to estimate them well. It won't be apparent with discrete ancestral
states (since there's a finite, reasonable state space), but you could have
an asymmetrical rate near infinity, for example, if you fit an ARD model
with not much data, and infinity times a weight for that model still means
a big number.

Model averaging for parameter estimates between the rates is ok, btw:

symmetrical model:

rate_0_to_1 = s
rate_1_to_0 = s

asymmetrical model:

rate_0_to_1 = a
rate_1_to_0 = b

weighted params:

rate_0_to_1 = s * weight_sym + a * weight_asym
rate_1_to_0 = s * weight_sym + b * weight_asym

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Wed, Aug 7, 2019 at 11:33 AM Jacob Berv 
wrote:

> Dear R-sig phylo
>
> I’ve been running a few discrete character Mk type models using phytool's
> SIMMAP — and I had the idea that it might be useful to try model averaging
> across posterior probabilities for node states.
>
> Might this make sense to do, to avoid problems associated with model
> ranking via AIC? Ie, average the node state probabilities based on AIC
> weights? Is there some fundamental problem with this?
>
> I could imagine generating some code to generate all possible transition
> models given a set of N states, and then rather than ranking with AIC,
> model averaging for parameter estimates (though now that I think about it,
> not sure how one might reasonably average a symmetrical rate and an
> asymmetrical rate).
>
> Best,
> Jake
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] patristic distance doubts and meanings

2019-02-06 Thread Brian O'Meara
Hi, Bruno. It's also possible that there are other ways of getting at the
biological question you're going after, since others also examine how
phylogenetic relatedness affects overlap (my apologies for the redundancy
if you know about this already, just a teachable moment, as they say). An
early, key review of this is Webb et al., (2002):
https://doi.org/10.1146/annurev.ecolsys.33.010802.150448 and a more recent
review is Cavender-Bares et al. (2009):
https://doi.org/10./j.1461-0248.2009.01314.x. The literature blossoms
from there, but those should give a feel for the approaches.

Here's some packages used for things like overlap in species in areas
[stolen from the task view
<https://cloud.r-project.org/web/views/Phylogenetics.html>]:

*Community/Microbial Ecology *: picante
<https://cloud.r-project.org/web/packages/picante/index.html>, vegan
<https://cloud.r-project.org/web/packages/vegan/index.html>, SYNCSA
<https://cloud.r-project.org/web/packages/SYNCSA/index.html>, phylotools
<https://cloud.r-project.org/web/packages/phylotools/index.html>, PCPS
<https://cloud.r-project.org/web/packages/PCPS/index.html>, caper
<https://cloud.r-project.org/web/packages/caper/index.html>, DAMOCLES
<https://cloud.r-project.org/web/packages/DAMOCLES/index.html>, and cati
<https://cloud.r-project.org/web/packages/cati/index.html> integrate
several tools for using phylogenetics with community ecology. HMPTrees
<https://cloud.r-project.org/web/packages/HMPTrees/index.html> and GUniFrac
<https://cloud.r-project.org/web/packages/GUniFrac/index.html> provide
tools for comparing microbial communities. betapart
<https://cloud.r-project.org/web/packages/betapart/index.html> allows
computing pair-wise dissimilarities (distance matrices) and multiple-site
dissimilarities, separating the turnover and nestedness-resultant
components of taxonomic (incidence and abundance based), functional and
phylogenetic beta diversity. adiv
<https://cloud.r-project.org/web/packages/adiv/index.html> can calculate
various indices of biodiversity including species, functional and
phylogenetic diversity, as well as alpha, beta, and gamma diversities.
entropart <https://cloud.r-project.org/web/packages/entropart/index.html> can
measure and partition diversity based on Tsallis entropy as well as
calculate alpha, beta, and gamma diversities. ecospat
<https://cloud.r-project.org/web/packages/ecospat/index.html> can also
examine phylogenetic diversity. metacoder
<https://cloud.r-project.org/web/packages/metacoder/index.html> is an R
package for handling large taxonomic data sets, like those generated from
modern high-throughput sequencing, like metabarcoding.


Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Wed, Feb 6, 2019 at 3:07 PM Eliot Miller  wrote:

> The amount of time two taxa have been evolving independently from one
> another is, barring some complex introgression, independent of other
> species in the tree. Pruning the tree from however many species to 700
> won't affect the distances between those 700. After pruning, use
> cophenetic() to nearly instantly calculate the relevant distances between
> taxa. All the values you need are in the matrix. Access the evolutionary
> distance between any two taxa by pulling the relevant element of the
> resulting cophenetic matrix, e.g. copheneticMatrix["sp1", "sp2"]. If you
> are interested in the time since two taxa shared a common ancestor,
> assuming the tree is ultrametric and in units of time, that value is the
> distance between those two taxa divided by 2.
>
> Cheers,
> Eliot
>
> On Mon, Feb 4, 2019 at 5:30 PM Bruno Garcia Luize 
> wrote:
>
> > Thank you Liam,
> >
> > For sure it is a lot of computational time. I'm only interested in a
> subset
> > of species, lets say close to 700 species in a regional species pool. I
> did
> > the computations using fastDist and it take almost 7 hs to complete in a
> > windows based 16 Gb RAM. The function was realy useful for my propose.
> > Another doubt that came is if I prune the bigger tree with my species
> list
> > and apply the ape::cophenetic the distances will remain the same or the
> > prunning will change the tree edges lengths and consequently the
> patristic
> > distances. But, by the way I see that fastDist did the work I want.
> >
> > Best regards,
> >
> > 

Re: [R-sig-phylo] Mesquite->ape node numbers and/or parsimony ancestral reconstruction

2019-01-31 Thread Brian O'Meara
It's possible that Mesquite (and within R) that discrete traits may be
interpreted by the software as continuous, depending on how they're stored
-- so it could do a reconstruction of a character with tip states of 0, 1,
and 2 and get ancestral values of 0.1, 1.75, etc. using squared change
parsimony or the equivalent. This isn't something one should do (for one
thing, if the trait is unordered, the ancestor of a clade with taxa in
state 0 and 2 having a state of 1 doesn't make sense) but it is possible
someone could run squared change parsimony on discrete traits if the
software mistakenly treats them as continuous.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Thu, Jan 31, 2019 at 2:53 PM Liam Revell  wrote:

> Roland.
>
> You have not done 'squared change parsimony' in that case, but just
> parsimony. The problem (among various) is that for a discrete character
> there is not always a single most parsimonious set of ancestral states
> and there are often many equally parsimonious reconstructions. That's
> why some scientists have developed criteria (such as 'ACCTRAN' and
> 'DELTRAN') for consistently selecting one or two among the usually many
> equally parsimonious reconstructions. One should not be deceived by
> this, however. Just because the ACCTRAN and DELTRAN criteria exist
> doesn't mean that these MP solutions for ancestral states are any more
> probable than any other equally parsimonious solution (and under some
> circumstances all parsimonious solutions can be misleading).
>
> I believe parsimony ancestral state reconstruction is implemented in the
> phangorn function ancestral.pars.
>
> All the best, Liam
>
> Liam J. Revell
> Associate Professor, University of Massachusetts Boston
> Profesor Asistente, Universidad Católica de la Ssma Concepción
> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
>
> On 1/31/2019 2:18 PM, Roland Sookias wrote:
> > Dear Liam,
> >
> > Thanks very much indeed - a really nice blog post. However, the issue is
> > that I am using discrete states, which are not ordered. (and to make
> things
> > more exciting there are also polymorphisms and missing data - although
> they
> > can be worked around I guess)
> >
> > Is there any way to get ML to yield the same as SCP would (in Mesquite at
> > least) for this kind of data?
> >
> > Best wishes
> >
> > Roland
> >
> > On Thu, Jan 31, 2019 at 8:03 PM Liam Revell  wrote:
> >
> >> Dear Roland.
> >>
> >> I didn't really solve your problem of reading a Mesquite result into R;
> >> however I just posted about the relationship between unweighted &
> >> weighted square-change-parsimony & ML ancestral character estimation on
> >> my blog:
> >> http://blog.phytools.org/2019/01/unweighted-weighted-square-change.html
> .
> >> (In summary, you should be able to do what you want within R.)
> >>
> >> All the best, Liam
> >>
> >> Liam J. Revell
> >> Associate Professor, University of Massachusetts Boston
> >> Profesor Asistente, Universidad Católica de la Ssma Concepción
> >> web: http://faculty.umb.edu/liam.revell/, http://www.phytools.org
> >>
> >> On 1/31/2019 12:48 PM, Roland Sookias wrote:
> >>> Dear all,
> >>>
> >>> I want to get squared change parsimony ancestral state reconstructions
> >> for
> >>> a matrix into R.
> >>>
> >>> I have tried the asr_max_parsimony() function from castor, but the
> >> results
> >>> are not the same as in Mesquite for example. To take an example, a
> >>> two-taxon clade with a unique state for these taxa optimized as having
> a
> >>> different state (shared by near relatives) at the ancestral node. This
> >>> doesn't happen in Mesquite and is *not* what I want.
> >>>
> >>> If there were some way to get the results from Mesquite into R that
> would
> >>> also be OK, but I cannot figure out the node numbering system in
> >> Mesquite,
> >>> to match them up with nodes as numbered by ape.
> >>>
> >>> Any help would be *very* greatly appreciated.
> >>>
> >>> Best wishes,
> >>>
&g

Re: [R-sig-phylo] OU for non-ultrametric trees

2018-12-06 Thread Brian O'Meara
You can run in OUwie if you set the root.age (we should probably make that
automatic, but haven't yet). See
https://www.rdocumentation.org/packages/OUwie/versions/1.50/topics/OUwie. I
believe the past issue with non-ultrametric trees comes from functions that
handles the creation of the VCV matrix inappropriately by assuming an
ultrametric tree. Using non-ultrametric trees may be possible in other OU
packages (ouch, slouch, phylolm, MVmorph, etc.) but I don't know for sure.

It's also worth checking for why your tree is non-ultrametric. If it has a
mixture of extinct and modern taxa, that's cool. If it's because you think
that the branch lengths reflect the amount of change that'd drive the
continuous character change (i.e., they mean number of generations for taxa
with different generation times, so they don't have the same root to tip
length) that's ok, too. If it's just that those are raw branch lengths
(parsimony, likelihood) without making the tree a chronogram, that's bad --
the main thing OU/BM models are doing is effectively stretching and
smooshing branch lengths (for OU, also adjusting the expected means) so if
the starting branch lengths don't mean something that's relevant, the
parameters you get from stretching also don't mean anything. [I imagine
you're doing it well, just a teachable moment for new students on the list].

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Thu, Dec 6, 2018 at 7:35 AM Danielle Miller 
wrote:

> Hi all,
>
> I have a non-rooted non-ultrametric tree and a corresponding set of a
> single trait values for each one of the tips.
> I’m interested whether it follows a BM model or an OU.
>
> Reading previous comments in the archive, I understood that running an OU
> process is inadequate in a case of non-ultrametric trees, however I did not
> fully understand why.
> As I use ouch package, what are the consequences of running an unrooted
> tree? And a non-ultrametric one?
>
> What would be the best way assessing this issue? Diversitree should be
> more reliable?
>
> I will appreciate any explanation and help,
> Danielle Miller
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-25 Thread Brian O'Meara
Hi, Agus. The variance-covariance matrix comes from the tree and the
evolutionary model, not the data. Each entry between taxa A and B in the
VCV is how much covariance I should expect between data for taxa A and B
simulated up that tree using that model. I don't want to be *that guy*, but
O'Meara et al. (2006)
https://onlinelibrary.wiley.com/doi/10./j.0014-3820.2006.tb01171.x has
a fairly accessible explanation of this (largely b/c I was just learning
about VCVs when working on that paper). Hansen and Martins (1996)
https://onlinelibrary.wiley.com/doi/10./j.1558-5646.1996.tb03914.x have
a much more detailed description of how you get these covariance matrices
from microevolutionary processes.

Typically, ape::vcv() is how you get a variance covariance for a phylogeny,
assuming Brownian motion and no measurement error. It just basically takes
the history two taxa share to create the covariance (or variance, if the
two taxa are the same taxon). A different approach, which seems to be what
you're doing, would be to simulate up a tree many times, and then for each
pair of taxa (including the pair of a taxon with itself, the diagonal of
the VCV), calculate the covariance. These approaches should get the same
results, though the shared history on the tree approach is faster.

Best,
Brian


___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Sat, Aug 25, 2018 at 1:16 PM Agus Camacho  wrote:

> Dear list users,
>
> I am trying to make an easy R demonstration to teach the
> variance-covariance matrix to students. However, After consulting the
> internet and books, I found myself facing three difficulties to understand
> the math and code behind this important matrix. As this list is answered by
> several authors of books of phylocomp methods, thought this might make an
> useful general discussion.
>
> Here we go,
>
> 1) I dont know how to generate a phyloVCV matrix in R (Liams kindly
> described some options here
> <
> http://blog.phytools.org/2013/12/three-different-ways-to-calculate-among.html
> >
> but I cannot tell for sure what is X made of. It would seem a dataframe of
> some variables measured across species. But then, I get errors when I
> write:
>
>  tree <- pbtree(n = 10, scale = 1)
>  tree$tip.label <- sprintf("sp%s",seq(1:n))
>  x <- fastBM(tree)
> y <- fastBM(tree)
>   X=data.frame(x,y)
>  rownames(X)=tree$tip.label
>  ## Revell (2009)
>  A<-matrix(1,nrow(X),1)%*%apply(X,2,fastAnc,tree=tree)[1,]
>  V1<-t(X-A)%*%solve(vcv(tree))%*%(X-A)/(nrow(X)-1)
>## Butler et al. (2000)
>Z<-solve(t(chol(vcv(tree%*%(X-A)
>  V2<-t(Z)%*%Z/(nrow(X)-1)
>
>## pics
>Y<-apply(X,2,pic,phy=tree)
>  V3<-t(Y)%*%Y/nrow(Y)
>
> 2) The phyloVCV matrix has n x n coordinates defined by the n species, and
> it represents covariances among observations made across the n species,
> right?. Still, I do no know whether these covariances are calculated over
> a) X vs Y values for each pair of species coordinates in the matrix, across
> the n species, or b) directly over the vector of n residuals of Y, after
> correlating Y vs X, across all pairs of species coordinates. I think it may
> be a) because, by definition, variance cannot be calculated for a single
> value. I am not sure though, since it seems the whole point of PGLS is to
> control phylosignal within the residuals of a regression procedure, prior
> to actually making it.
>
> 3) If I create two perfeclty correlated variables with independent
> observations and calculate a covariance or correlation matrix for them, I
> do not get a diagonal matrix, with zeros at the off diagonals (ex. here
> <
> https://www.dropbox.com/s/y8g3tkzk509pz58/vcvexamplewithrandomvariables.xlsx?dl=0
> >),
> why expect then a diagonal matrix for the case of independence among the
> observations?
>
> Thanks in advance and sorry if I missed anything obvious here!
> Agus
> Dr. Agustín Camacho Guerrero. Universidade de São Paulo.
> http://www.agustincamacho.com
> Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de
> Fisiologia,
> Instituto de Biociências, USP.Rua do Matão, trav. 14, nº 321, Cidade
> Universitária,
> São Paulo - SP, CEP: 05508-090, Brasil.
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.o

Re: [R-sig-phylo] keeping states discrete when plotting HiSSE results

2018-07-09 Thread Brian O'Meara
Hi, Bonnie. It's not treating binary characters as continuous, but rather
showing the uncertainty in what the state is: an intermediate color means
it's not really sure. There's substantial uncertainty with ancestral
states, and our goal was to communicate that rather than the point estimate
of whatever state is best at a time lest people over interpret it [which
reminds me to finish testing the justifiably controversial OUwie ancestral
state recon that was asked about before on R-sig-phylo]. However, the way
plot.hisse.states works is basically plotting a tree of rates and then a
narrower tree of states on top of that, so I think you could just plot a
third tree of states on top of all that. You can get the internal
reconstructions and terminal states as binary characters by doing

states.tips <- hisse:::ConvertManyToBinaryState(hisse.results, "tip.mat")
states.internal <- hisse:::ConvertManyToBinaryState(hisse.results,
"node.mat")

and then use phytools to plot a tree with discrete traits using these on
top of the hisse plot. Here's the relevant code we use for plotting: you'll
want to change contMapGivenAnc to a function to give discrete change
mappings.

state.tree <- contMapGivenAnc(tree=tree.to.plot, x=states.tips, plot=FALSE,
anc.states=states.internal, lims=state.lims, ...)
state.colors <- colorRampPalette(state.colors,
space="Lab")(length(rate.tree$cols))
state.tree$cols[]<- state.colors
plot(state.tree, outline=FALSE, lwd=edge.width.state, legend=FALSE,
type=type, fsize=fsize, ...)

Hope this helps,
Brian


___________
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)



On Tue, Jul 3, 2018 at 9:44 AM Bonnie Blaimer 
wrote:

> Hi all,
>
> I am using the HiSSE package and function plot.hisse.states on a list of
> several HiSSE objects to generate a heatmap phylogeny of model-averaged
> results.
>
> I'm wondering if there is a way to plot the character state reconstructions
> as being discrete changes, and if so, could someone provide guidance on
> how? The default seems to treat the binary characters as continuous and
> shows a somewhat gradual transition between the two different colors
> specified for the two states.
>
> Many thanks!
> Bonnie
>
> --
> *Bonnie B Blaimer*
> Assistant Professor
> Department of Entomology & Plant Pathology
> 3204D Gardner Hall, Campus Box 7613
> North Carolina State University
> Raleigh, NC 27695-7613
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] diversification rates with two traits present simultaneously

2018-06-19 Thread Brian O'Meara
Rich FitzJohn has programmed something of what Will suggested in
diversitree: look at details in ?make.musse.multitrait. I'm not sure if
there was ever a paper published on this. We do a similar thing to Will's
idea in http://rspb.royalsocietypublishing.org/content/283/1830/20152304
(with the added complication of varying over combinations of traits, since
converting six binary traits to a 2^6 state musse model would have been a
bit ambitious given the data size [or, frankly, given any possible sampling
from the tree of life]). However, neither approach deals with hidden
sources of rate heterogeneity. Jeremy Beaulieu and Daniel Caetano are
always adding stuff to hisse that deals with hidden rates (see
preprint on a hidden
biogeography model at
https://www.biorxiv.org/content/early/2018/04/17/302729), but I don't think
anything has been peer-reviewed and published yet for this use case (I
think the core hisse model is also in RevBayes, and that may have added
options making a musse version possible). If your tree is big enough,
separating it into clades and seeing if the pattern (i.e., when converting
00 -> 1, 01 -> 2, 10 -> 3, 11 -> 4 for musse, and you find that state 3
(aka binary states 10) has highest diversification rate) is consistent
across the clades could be a good way of seeing if there are some kinds of
heterogeneity that matter.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)



On Tue, Jun 19, 2018 at 2:03 AM Gilles Benjamin Leduc  wrote:

> Hello,
>
> Personally, I would advise you not to use a tree, for your 2 traits times
> 2 (+/-) makes 4 combinations, meaning a tree that is dichotomic cannot
> reveal anything!  Can you quantify your "diversification" without "rate"?
> How do you obtain your  "diversification rates"? If it is something you can
> count, you may try an old fashion Khi-square test… you would then know if
> there is a correlation…
>
> Have a nice day,
>
> Benjamin
>
>
> Le Lundi 18 Juin 2018 23:46 GMT, Elizabeth Christina Miller <
> ecmil...@email.arizona.edu> a écrit:
>
> > Hello,
> >
> > I am wondering what comparative method(s) is appropriate for testing if
> > diversification rates are highest when two traits are present together,
> > rather than one alone? Specifically, if I have two binary traits, let's
> say
> > freshwater/marine and temperate/tropical, what is the best way to test if
> > diversification rates are highest in tropical+freshwater groups, as
> opposed
> > to tropical+marine or temperate+freshwater? I think MuSSE can do this,
> but
> > my tree is large so I want to avoid issues associated with rate
> > heterogeneity. Are there any alternatives?
> >
> >
> >
> > --
> > Elizabeth Miller
> > PhD Candidate: Wiens Lab
> > Ecology and Evolutionary Biology
> > University of Arizona
> >
> >   [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] estimate ancestral state with OUwie models

2018-06-11 Thread Brian O'Meara
Prompted by Bruno's email, and similar requests by some students at the
Arnold & Felsenstein Evolutionary Quantitative Genetics course, we've
started adding ancestral state reconstruction to OUwie. It still needs
debugging and testing, but should be ready fairly soon. I do believe bayou
does ancestral state estimation, too, but I don't think all the models you
want will be in the set. Could be adequate to answering the biological
question, though.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)



On Mon, Jun 11, 2018 at 6:01 PM David Bapst  wrote:

> Just to follow off what Lucas said, but please note you cannot rescale
> branches of a phylogeny using an OU model when the tree is
> non-ultrametric (such as when it contains extinct, fossil taxa as
> tips). Slater (2014, MEE) discusses this more in a brief correction to
> Slater (2013).
>
> I don't know if anyone in this conversation has a non-ultrametric
> tree, but I wanted to make that clear for anyone who stumbles on this
> thread n the future using a google search.
> -Dave
>
>
>
> On Sun, Jun 10, 2018 at 12:25 PM, Lucas Jardim 
> wrote:
> > Hi Bruno,
> >
> > You can transform the branches of your phylogeny using the estimated
> > parameters of OU models. Then, if those models describe the observed data
> > adequatly, the transformed tree should model the observed data as a
> > Brownian motion model. So you can use an ancestral state reconstruction
> > based on Brownian motion model. However, I do not know if that is the
> best
> > approach as optimum values would not be included into the reconstruction
> > process.
> >
> > Best,
> > --
> > Lucas Jardim
> > Doutor em Ecologia e Evolução
> > Bolsista do INCT-EECBio (Ecologia, Evolução e Conservação da
> > Biodiversidade)
> > Instituto de Ciências Biológicas
> > Laboratório de Ecologia Teórica e Síntese
> > Universidade Federal de Goiás
> > http://dinizfilho.wix.com/dinizfilholab
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>
>
>
> --
> David W. Bapst, PhD
> Asst Research Professor, Geology & Geophysics, Texas A & M University
> https://github.com/dwbapst/paleotree
> Google Calendar: https://goo.gl/EpiM4J
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Problem with negative ages in OUwie.boot?

2018-05-02 Thread Brian O'Meara
On Wed, May 2, 2018 at 2:53 PM, David Bapst  wrote:

> Given that your tree appears to be non-ultrametric enough to cause
> branching.times to throw some nonsensical node ages, if it is supposed
> to be ultrametric. I recommend checking it carefully to figure out why
> the tips seem to not quite be at the same distance from the root.
>
>
 Sometimes this happens with tree import from a file -- it could be a
newick tree with branch lengths precise to the hundredths but a lot of the
R  ultrametric tests by default use higher precision (1e-08, iirc).

Best,
Brian

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] grafting chronograms onto backbone phylogeny

2018-04-26 Thread Brian O'Meara
Look at congruify.phylo in geiger (and the associated paper: Eastman JM, LJ
Harmon, and DC Tank. 2013. Congruification: support for time scaling large
phylogenetic trees. Methods in Ecology and Evolution).

You could also look into SDM in ape (converting trees [not data] to
distance matrices, first): relevant paper Criscuolo, A., Berry, V.,
Douzery, E. J. P. , and Gascuel, O. (2006) SDM: A fast distance-based
approach for (super)tree building in phylogenomics. Systematic Biology, 55,
740–755.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Thu, Apr 26, 2018 at 5:07 PM, Chris Law <cj...@ucsc.edu> wrote:

> Hi all,
>
> I am trying to graft 3 chronograms onto a family level backbone. Does
> anybody have any suggestions on what is the best way to do this? Or is
> going through the tree files in textwrangler the only way to do this.
>
> Thanks!
>
>
> *Chris Law​ **|*
> * ​*PhD Student
>  *|*
> * ​*University of California, Santa Cruz
> ​
>
>
> Coastal Biology Building
> ​
> 130 McAllister Way
> Santa Cruz, CA 95060
> cj...@ucsc.edu *|* research.pbsci.ucsc.edu/eeb/cjlaw/
> Small Mammal Research in the Forest Internship
> <http://research.pbsci.ucsc.edu/envs/smurf>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] quantitative state dependent diversification of another quantitative trait

2018-04-04 Thread Brian O'Meara
I think you want SLOUCH: Hansen et al. 2008:
https://onlinelibrary.wiley.com/doi/pdf/10./j.1558-5646.2008.00412.x.
One trait evolves under Brownian motion, another trait follows it.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Apr 4, 2018 at 2:43 PM, Zach Culumber <zculum...@gmail.com> wrote:

> Hi everyone,
>
> We are interested in examining whether shifts in one continuous trait have
> preceded shifts in another continuous trait through evolutionary time for a
> phylogeny of ~80 species.  To do this we were thinking of calculating the
> change in the value of each trait for each tip to its most recent ancestor
> (i.e., node) and then for each node to its most recent ancestral node.  We
> have all the tip values and conducted ASR to get the node values.  However,
> we are unaware of any functions or packages that would automate calculation
> of all these distances for teach tip and node to its most recent ancestor.
> Does anyone have any suggestions of functions or packages that might
> possibly do this?  Alternatively, are there better existing options for
> exploring this question? I.e., something similar to QuaSSE but for analysis
> between two quantitative traits rather than traits and species
> diversification?  Something like Bayes Traits or PGLS would allow tests of
> correlated evolution, but not the order of evolutionary shifts.
>
> Thank you for any insights!
>
> Zach
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Interpretation of standard errors of parameter estimates in OUwie models

2018-04-04 Thread Brian O'Meara
Apparently the plot did not come through correctly. I'm attaching a PDF of
it.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Apr 4, 2018 at 4:24 PM, Brian O'Meara <omeara.br...@gmail.com>
wrote:

> Hi, Rafael et al.
>
> ?OUwie has some information on the standard errors:
>
> The Hessian matrix is used as a means to estimate the approximate standard
>> errors of the model parameters and to assess whether they are the maximum
>> likelihood estimates. The variance-covariance matrix of the estimated
>> values of alpha and sigma^2 are computed as the inverse of the Hessian
>> matrix and the standard errors are the square roots of the diagonals of
>> this matrix. The Hessian is a matrix of second-order derivatives and is
>> approximated in the R package numDeriv. So, if changes in the value of a
>> parameter results in sharp changes in the slope around the maximum of the
>> log-likelihood function, the second-order derivative will be large, the
>> standard error will be small, and the parameter estimate is considered
>> stable. On the other hand, if the second-order derivative is nearly zero,
>> then the change in the slope around the maximum is also nearly zero,
>> indicating that the parameter value can be moved in any direction without
>> greatly affecting the log-likelihood. In such situations, the standard
>> error of the parameter will be large.
>>
>> For models that allow alpha and sigma^2 to vary (i.e., OUMV, OUMA, and
>> OUMVA), the complexity of the model can often times be greater than the
>> information that is contained within the data. As a result one or many
>> parameters are poorly estimated, which can cause the function to return a
>> log-likelihood that is suboptimal. This has great potential for poor model
>> choice and incorrect biological interpretations. An eigendecomposition of
>> the Hessian can provide an indication of whether the search returned the
>> maximum likelihood estimates. If all the eigenvalues of the Hessian are
>> positive, then the Hessian is positive definite, and all parameter
>> estimates are considered reliable. However, if there are both positive and
>> negative eigenvalues, then the objective function is at a saddlepoint and
>> one or several parameters cannot be estimated adequately. One solution is
>> to just fit a simpler model. Another is to actually identify the offending
>> parameters. This can be done through the examination of the eigenvectors.
>> The row order corresponds to the entries in index.matrix, the columns
>> correspond to the order of values in eigval, and the larger the value of
>> the row entry the greater the association between the corresponding
>> parameter and the eigenvalue. Thus, the largest values in the columns
>> associated with negative eigenvalues are the parameters that are causing
>> the objective function to be at a saddlepoint.
>>
>
>  You can get better (by which I mean more accurate) estimates by
> parametric bootstrapping (simulation).
>
> Doing a quick summary of your OUM results (code at end):
>
> Trait: contr
> nonfor: 0.12 (-0.12, 0.36)
> inter: 0.12 (-0.12, 0.36)
> fores: 0.13 (-0.13, 0.39)
> Phylogenetic half life: 0.06
> Tree height: 28.34
>
> Trait: cshd
> nonfor: 0.24 (-0.23, 0.72)
> inter: 0.33 (-0.32, 0.98)
> fores: 0.2 (-0.19, 0.6)
> Phylogenetic half life: 6.35
> Tree height: 28.34
>
> Trait: dors
> nonfor: 0.04 (-0.04, 0.12)
> inter: 0.15 (-0.14, 0.44)
> fores: 0 (0, 0.01)
> Phylogenetic half life: 3.68
> Tree height: 28.34
>
> Trait: vent
> nonfor: 0.05 (-0.05, 0.14)
> inter: 0.32 (-0.3, 0.93)
> fores: -0.03 (0.03, -0.08)
> Phylogenetic half life: 5.43
> Tree height: 28.34
>
> It does seem that you don't have a lot of ability to tell optima apart.
> The alpha value is pretty low, though not tremendously (the half life is a
> good way to look at that). But the raw data also suggest it's not three
> very clear clumps (beeswarm plot, regimes on the x axis, trait 1 value on
> the y). It's not phylogenetically corrected, but still a good gut check:
>
>
>
> I agree that for your data, it may be t

Re: [R-sig-phylo] Interpretation of standard errors of parameter estimates in OUwie models

2018-04-04 Thread Brian O'Meara
cat(paste0("\n",regimes[regime.index], ": ", result.vector[1], "
(", result.vector[2], ", ", result.vector[3], ")"))
if (trait.index==1) {
observations[[regime.index]] <-
bla$data[which(bla$data[,1]==regime.index),2]
}
}
cat(paste0("\nPhylogenetic half life: ", round(log(2)/data["alpha",
traits[trait.index]], 2)))
cat(paste0("\nTree height: ",
round(max(ape::branching.times(bla$phy)),2)))
}

points.to.plot <- bla$data[,1:2]
colnames(points.to.plot) <- c("regime", "trait1")
beeswarm(trait1~regime, data=points.to.plot, col="black", pch=20)




___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Apr 4, 2018 at 4:07 PM, Jacob Berv <jakeberv.r.sig.ph...@gmail.com>
wrote:

> Even if those models may fit the data better, they may not necessarily
> help Rafael determine whether or not the parameter estimates of interest
> are different across regimes (though perhaps BMS might be informative).
>
> Rafael, maybe you could try fixing the ancestral regimes to match most
> likely states for each node from your SIMMAP summary? I wonder if that
> might decrease your ‘uncertainty’ in parameter estimates.
>
> I don’t have a great answer for your main question though, which is a good
> one.
>
> Jake
>
> > On Apr 4, 2018, at 8:59 PM, William Gearty <wgea...@stanford.edu> wrote:
> >
> > Hi Rafael,
> >
> > Have you tried running the BM1, BMS, or OU1 models? If so, how do the
> AICc
> > values for those compare to those of the OUM model? If not, you should
> make
> > sure you run those.
> > If you find that the these models fit your data better, this could
> indicate
> > that there isn't a large different across regimes and an OUM model isn't
> > necessarily suitable.
> >
> > Best,
> > Will
> >
> > On Wed, Apr 4, 2018 at 12:30 PM, Rafael S Marcondes <
> raf.marcon...@gmail.com
> >> wrote:
> >
> >> Dear all,
> >>
> >> I'm writing (again!) to ask for help interpreting standard errors of
> >> parameter estimates in OUwie models.
> >>
> >> I'm using OUwie to examine how the evolution of bird plumage color
> varies
> >> across habitat types (my selective regimes) in a tree of 229 tips. I was
> >> hoping to be able to make inferences based on OUMV and OUMVA models,
> but I
> >> was getting nonsensical theta estimates from those. So I've basically
> given
> >> up on them for now.
> >>
> >> But even looking at theta estimates from OUM models, I'm getting really
> >> large standard errors, often overlapping the estimates from other
> selective
> >> regimes. So I was wondering what that means exactly. How are these erros
> >> calculated? How much do high errors it limit the biological inferences I
> >> can make? I'm more interested in the relative thetas across regimes
> than on
> >> the exact values (testing the prediction that birds in darker habitats
> tend
> >> to adapt to darker plumages than birds in more illuminated habitats).
> >>
> >> I have attached a table averaging parameter estimates and errors from
> >> models fitted across a posterior distribution of 100 simmaps for four
> >> traits; and one exemplar fitted model from one trait in one of those
> >> simmaps.
> >>
> >> Thanks a lot for any help,
> >>
> >>
> >> *--*
> >> *Rafael Sobral Marcondes*
> >> PhD Candidate (Systematics, Ecology and Evolution/Ornithology)
> >>
> >> Museum of Natural Science <http://sites01.lsu.edu/wp/mns/>
> >> Louisiana State University
> >> 119 Foster Hall
> >> Baton Rouge, LA 70803, USA
> >>
> >> Twitter: @brown_birds <https://twitter.com/brown_birds>
> >>
> >>
> >> ___
> >> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> >> Searchable archive at http:

Re: [R-sig-phylo] correlation of binary and multistate

2018-03-21 Thread Brian O'Meara
You can do this with corHMM by making an appropriate model with rayDISC.
For example, if character A has states 0,1 and character B has states
0,1,2, you then recode this into a single multistate character:

00 -> 0
01 -> 1
02 -> 2
10 -> 3
11 -> 4
12 -> 5

and then create a rate matrix that matches your hypotheses (i.e., force the
00->10 rate [the 0->3 rate in the new coding] to be the same as the 01->11
and 02->12 rates (so that the state of the second character doesn't affect
the rate of the first character) or and compare (or model average) with a
model where they are free to vary).

It wouldn't be hard to make a looping thing to do this automatically
(Jeremy Beaulieu has for two and three binary trait correlation in the same
package with corDISC but not for multistate yet) but as far as I know no
one has (pull requests welcome!).

There may also be this functionality in phytools: I remember some Pagel
correlation work was added, but I don't know about its limitations with
multistate.

Best,
Brian

PS: Of course, with all this stuff, I now worry about the Maddison and
FitzJohn issue: https://academic.oup.com/sysbio/article/64/1/127/2847997.


_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)


On Wed, Mar 21, 2018 at 3:19 PM, John Denton <jden...@amnh.org> wrote:

> Hi all,
>
> Does anybody know of a package that does character correlations between
> binary and multistate characters (and even better with the possibility of
> missing data)?
>
> Thanks!
>
> ~John
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] How to test stratification of sampling across tree?

2017-03-15 Thread Brian O'Meara
One thing you could do is take all nonoverlapping pairs of taxa
(Felsenstein's other technique in the contrasts paper): that is, for a tree
(A,(B,(C,(D,E, you can look at D-E and B-C, *or* D-E and A-B, *or* D-E
and A-C, *or* A-B and C-D, etc. (so, still leaving out one taxon each time,
but only using each edge once) and then compare the states for pair of
taxa. If state 0 has freq p, and state 1 has freq q, you should see p^2 0-0
pairs, 2pq 0-1 pairs, and q^2 1-1 pairs with truly random sampling; if it's
maximally stratified, you should see only two of these pairs (i.e., if 1 is
less common, you should see only 0-1 and 0-0 pairs). You could rig up a
test statistic (prob comparing two multinomial models) from this.

I have some code that purportedly gets all independent such pairs in R at
https://r-forge.r-project.org/scm/viewvc.php/pkg/R/independentTaxa.R?view=markup=366=omearalab.
However, I haven't rigorously tested it (this was in the dark ages before
we all used testthat, too), so feel free to take, hack, republish, etc.,
but test first [anyone else should feel free to take this, too -- it could
naturally go into phytools, ape, or phangorn, for example, assuming it
actually works].

Best,
Brian


___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Tue, Mar 14, 2017 at 1:37 PM, Ross Mounce <ross.mou...@gmail.com> wrote:

> So I tried a 12-taxon fully pectinate tree with Blomberg's K as calculated
> by picante::Kcalc()
>
> library(picante)
> library(ape)
> aa<-"(A,(B,(C,(D,(E,(F,(G,(H,(I,(J,(K,L)));"
> t1<-read.tree(text=aa)
> t4 <- compute.brlen(t1,method="Grafen",1)
> tipvals <- c(0,1,0,1,0,1,0,1,0,1,0,1)
> Kcalc(tipvals,t4)
>
> K = 0.3487135
>
>
> There are possible 924 permutations of 6 'painted' tips from 0011
> to 1100 (each of those two extreme distributions gives the maximum
> value of K for this particular tree & number [6] of painted tips: 2.37703)
>
> There are 336 discrete integer value results of K for this tree, and
> painting 6 tips:
>
> Min. 1st Qu.  MedianMean 3rd Qu.Max.
>  0.3438  0.3676  0.4489  0.5332  0.5945  2.3770
>
>
> A histogram of all 924 possible values of K (for 6 painted tips) shows that
> Blomberg's K in terms of value distribution is extremely positively skewed
> (it has a skewness of 2.671139), which is great if you're looking to test
> phylogenetic signal without false positives, but not so great if you're
> trying to assess "evenness" at the other tail of the distribution. In the
> case of this exact tree, because I've enumerated all possible permutations
> of 6 painted tips, I can calculate the 5% significance threshold as values
> of K that are 0.3480158 or less.  It seems some normalisation procedure
> might be needed before safely using Blomberg's K to assess the significance
> of evenness, if one is not going to exhaustively examine/enumerate all
> possible values (which I can't do for a 1000+ tip tree).
>
> Does that make sense?
>
> Certainly interesting...
>
>
> Ross
>
>
>
>
>
>
>
>
> On 14 March 2017 at 14:53, Ross Mounce <ross.mou...@gmail.com> wrote:
>
> > Thanks Dave,
> >
> > I'll try Blomberg's K with small simulated fully-bifurcating trees of
> > simple shape (e.g. fully pectinate), where I can easily paint the tips
> > myself in what I believe to be a "maximally stratified manner" e.g.
> > 010101010 to see if Blomberg's K does actually reach minimum (i.e.
> 0.0
> > ?) for such a distribution. If it does, great! This is the measure I
> need.
> >
> > I still wonder though, for a complex tree structure in terms of
> > balance/shape somewhere intermediate between fully balanced and fully
> > pectinate; how does one arrive empirically at _the_ most optimal
> > stratified/even sampling ('painting') of tips if say only 25% of tips
> > are/can be 'painted'. I guess a lot depends on how one defines what 'even
> > sampling' on a phylogeny actually is, does it include branch lengths et
> > cetera...
> >
> > I'll give it a try anyway,
> >
> > Thanks again,
> >
> > Ross
> >

Re: [R-sig-phylo] Use of a Phylocom tree for PGLS

2016-09-18 Thread Brian O'Meara
There also are several empirical plant chronograms available for reuse:
check TreeBase, OpenTree, and Dryad. Or you could use something like phlawd
or supersmart (just out in SystBio) to make a phylogeny for your taxa.
Phylomatic is convenient and will have all your taxa placed in some way,
but it'd make your conclusions more robust to just download a tree, make
sure the taxa match yours (look at the taxize package), prune out the taxa
that don't match (treedata() in Geiger), and rerun. If the conclusions are
qualitatively the same, you satisfy the reviewer; if not, you may have
saved yourself from an error. Either way the paper is improved. I haven't
reviewed your paper, but if the central point is using a phylogeny to look
at signal (rather than have this be just one small part of a paper focused
on something else), then doing even more to get a better tree might be
worth it -- the tree is THE thing used in every analysis, may as well get
it right (or at least, less wrong). If the paper is mostly on something
else, though, this might be a good enough solution.

Two notes for disclosure: I've been on papers that make empirical trees, so
I suppose there's a minor COI there (I might get one more citation). I'm
also part of a project, phylotastic, that is working to make getting trees
easier. We're doing a soft release soon, but there's nothing I'd suggest
you try using yet (we know people are going to form a first impression and
then decide to come back based on that first impression -- we're not quite
ready for our debut yet, though it is all open). My part is to make it
easier to get dates on trees: there's an R package for this (datelife, on
github) and a website, but I'm still debugging -- don't use yet for
something that matters, but it should make all this easier in the future.
But it does give me an incentive to say, "get a better tree", so take that
into consideration.

Best,
Brian


_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Sat, Sep 17, 2016 at 11:38 PM, Jacob Berv <jakeberv.r.sig.ph...@gmail.com
> wrote:

> Personally, I don’t think I’d have a problem with this approach
> (especially given the paper Liam cited) given that you are using a
> phylogeny (a model) to test a hypothesis, which is, after all, all we can
> ever do. You can always do more, so any threshold of phylogenetic tree
> “quality” is going to be somewhat arbitrary anyway. I don’t mean to sound
> defeatist — obviously you should do the best job that you can, and it
> sounds like you have done that given that the goal of your research is not
> to reconstruct a new phylogeny of the clade you’re studying. Perhaps you
> can try to explain this in your rebuttal.
>
> Or alternatively, use Liam’s awesome code (below) to generate all possible
> phylogenetic trees give your polytomies, and run your PGLS on all of them
> to see if its sensitive to topology. You could probably loop this up
> somehow without much effort.
> http://blog.phytools.org/2016/08/resolving-one-or-more-
> multifurcations.html?m=1 <http://blog.phytools.org/
> 2016/08/resolving-one-or-more-multifurcations.html?m=1>
>
> Jake
>
>
>
> > On Sep 17, 2016, at 6:46 PM, Oscar Valverde <os.valverd...@gmail.com>
> wrote:
> >
> > Dear colleages
> >
> > I am working on an phylogenetic signal and PGLS analysis using a database
> > with values for ~600 plant species. To construct my phylogeny I used the
> > backbone Phylomatic supertree (http://phylodiversity.net/phylomatic/)
> and
> > added branch lengths with the based on a fossil calibration for
> angiosperms
> > using the bladj function in phylocom, and resolved polytomies with the
> > multi2di function in phylotools.
> >
> > Now that I am trying to publish the paper, some reviewers indicated that
> > such tree is not suitable for statistical analysis because the level of
> > resolution of the tree is too low (to the family level maybe?) and the
> > uncertainty is too high to get any reliable result with respect to PGLS
> or
> > phylogenetic signal of the traits. Instead, they suggest I should build
> my
> > own tree based on sequences. Of course this is a major project to
> undertake
> > and in my opinion far

Re: [R-sig-phylo] Standard errors of theta estimates in OU models

2016-09-15 Thread Brian O'Meara
That's a really cool question. I don't know. I'm pretty sure that
practically, the answer is no (esp when you consider all the issues -- any
feasible OU is too simple a model to match reality, tree errors, data
errors, etc. -- that also come into an analysis). However, I could imagine
that in a perfect world you could look at expected standard errors created
using parametric bootstrapping from the MLE of the parameters and see if
the actual standard errors are different -- maybe it could reflect
plasticity or some other factor. But I really don't know. You might look at
some of the methods that can deal with intraspecific and interspecific
processes to get at this sort of biological question (the Felsenstein 2008
Contrasts revisited paper might be interesting as a start, though it's not
OU and comes at this from a different direction), but it's fun to think
about ways to go from what is treated as annoying noise to a meaningful
signal.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Thu, Sep 15, 2016 at 5:37 PM, Xiaojing Wei <weixx...@umn.edu> wrote:

> Dear R-sig-phylo users,
>
> I wonder if the standard errors of the theta estimates in OU models have
> any biologically meaningful interpretations. For instance, could it give
> some indication of plasticity in traits, or the optimal level of plasticity
> in traits? Or does it simply reflect estimation uncertainty of the
> phylogeny or the OU model?
>
> Thanks a lot!
>
> --
> Xiaojing Wei
> PhD student
> Dept. of Ecology, Evolution, and Behavior
> University of Minnesota
> Rm. 211 Ecology Bld., 1987 Upper Buford Cir.
> St. Paul, 55108
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Query about Phylogenetics in R

2016-08-31 Thread Brian O'Meara
A similar solution is to create a pdf:

pdf(file="AwesomeTree.pdf", width=10, height=10)
plot(phy)
dev.off()

As you increase the width and height, the labels become proportionally
smaller. You can probably mess around with cex options, too, but this is
the approach I normally use.

Best,
Brian

_______
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Wed, Aug 31, 2016 at 5:25 AM, Vojtěch Zeisek <vo...@trapa.cz> wrote:

> Hello,
> if onlythe readibility of the labels is the issue, what about using svg()
> function? It produces a SVG file which You can open in almost any vector
> editor (like Inkscape) andmove the labels around. Chech ?svg for possible
> settings and play with them little bit. They vary among operating systems.
> So You can do something like
> svg(...) # Any parameters inside, at least output file name
> # All the tree ploting commands...
> dev.off() # To save the SVG file to the disc
> Check also parameters of function tip.labels() - You can specify font size
> or replace the text with some symbol.
> (Sorry for being so brief, I don't have exact examples handy)
> Yours,
> V.
>
>
> 31. srpna 2016 11:11:33 CEST, "Bhuller, Ravneet" <
> ravneet.bhulle...@imperial.ac.uk> napsal:
> >Dear All,
> >
> >I am trying to construct a phylogenetic tree (neighbour joining) using
> >either APE or PHANGORN in R. But, since I have got 220 strains of
> >bacteria in my data, the resulted tree is not very clear. The branch
> >labels are so much overlapping that they cannot be read at all. Is
> >there any way, I can get a neat tree with clearly read labels? Any
> >guidance will be very much appreciated.
> >
> >Regards,
> >
> >Rav
> --
> Vojtěch Zeisek
> http://trapa.cz/cs
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-
> sig-ph...@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Measurement error for tips with only 1 measured specimen (OUwie)

2016-08-17 Thread Brian O'Meara
Another possibility is to do a regression to predict variance for species
with a single observation. Or even do a phylogenetic regression so species
nearer the ones with missing data matter more.

But all this stuff is minor tweaks: it's great you're incorporating
measurement error at all, and I hope your results are robust to any of
these suggestions for tweaks.

Best,
Brian

___
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Director for Postdoctoral Activities, National Institute for
Mathematical & Biological Synthesis <http://www.nimbios.org> (NIMBioS)
Communication Director, Society of Systematic Biologists

On Tue, Aug 16, 2016 at 11:53 PM, Liam J. Revell <liam.rev...@umb.edu>
wrote:

> Hi Santiago.
>
> This is identical to my suggestion, except that the pooled variance is a
> weighted mean in which weights (for better or worse) are proportional to
> the sample size of each species. If the variances are indeed homogeneous,
> this should be preferred because it gives greater weight to species whose
> variance we should know well. If not, then it risks giving high weight to a
> species with a well-estimated, but peculiarly high or low variance.
> Computing the straight mean, as you suggest, comes with exactly the
> opposite set of shortcomings since species with relatively small sample
> sizes may have very poorly estimated variances.
>
> All the best, Liam
>
> Liam J. Revell, Associate Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 8/16/2016 10:46 PM, Santiago Claramunt wrote:
>
>> Hi Rafael,
>>
>> Your method would underestimate the error associated with values derived
>> from single specimens because those values would have the highest errors,
>> not average errors.
>> What I have done in such cases is to estimate an average standard
>> deviation across species and use that average standard deviation as the
>> standard error of the species with single specimens.
>> I don't know of a formal description of this solution but it is mentioned
>> in one of my papers: http://rspb.royalsocietypublis
>> hing.org/content/279/1733/1567
>>
>> Best,
>>
>> Santiago
>>
>> Research Associate
>> Department of Ornithology
>> American Museum of Natural History
>> https://sites.google.com/site/sclaramuntuy/
>>
>>
>> On Aug 16, 2016, at 4:34 PM, Rafael S. Marcondes <raf.marcon...@gmail.com>
>>> wrote:
>>>
>>> Hi all,
>>>
>>> I’m using OUwie to fit multi-optima OU models and I have a question about
>>> incorporating measurement error into my analyses.
>>>
>>> I’m running my models with known measurement error (mserr=‘known’) and
>>> using the standard error (std.error()) as an estimate of it, as
>>> recommended
>>> by Ives et al (2007). However, for some (a minority) of my tips, I was
>>> only
>>> able to measure 1 specimen, so I have no standard error for them. So I’m
>>> not sure about how to deal with those. At first I thought about just
>>> setting their measurement error as 0, but then I figured that would
>>> introduce false confidence. So what I’m doing now is I’m setting
>>> measurement error for those tips as the mean of the errors of all the
>>> tips
>>> for which I did measure more than one specimen. I got that idea also from
>>> Ives et al when they mention averaging the error across species (jn the
>>> third-to-last paragraph), but that was in a different context. I can’t
>>> find
>>> any references that report dealing with the same problem, even though I
>>> assume it must not be an uncommon one. So I’m wondering if mine is really
>>> the best way to do it and, or if anyone has alternative suggestions?
>>>
>>> i hope I’ve made my problem clear, and thanks in advance for any
>>> suggestions.
>>>
>>>
>>> *--*
>>> *Rafael Sobral Marcondes*
>>>
>>> PhD Candidate (Systematics, Ecology and Evolution/Ornithology)
>>> Museum of Natural Science <http://sites01.lsu.edu/wp/mns/>
>>> Louisiana State University
>>> 119 Foster Hall
>>

Re: [R-sig-phylo] Chronos function in ape

2016-07-06 Thread Brian O'Meara
Geiger has congruify functions that can use pathd8 to make a tree
ultrametric (perhaps using an external chronogram to date it).  There's
some code in there to use r8s, as well, but it takes some digging.

Best,
Brian

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info



On Wed, Jul 6, 2016 at 1:02 PM, Marco Fracassetti <
marco.fracasse...@unibas.ch> wrote:

> Hi to all of you,
>
> I want to calibrated a relationship tree done with Treemix (Pickrell &
> Pritchard 2012) done with SNPs frequency. The branch length of the tree
> reflect the amount of genetic drift.
> The chronos function from the package ape want tree whose branch lengths
> are in number of substitution per sites.
> Is It correct to t use the chronos function ?
> There is another R function that could transform a relationship tree in
> ultrametric tree?
>
> Thanks in advance for the help,
>
> Marco
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] [function like ratebystate, but for discrete characters?]

2016-05-04 Thread Brian O'Meara
Under the hood, the methods basically treat (0,0), (0,1), (1,1), and (1,0)
as states 1, 2, 3, 4. The difference is that the rate matrix is (often)
constrained so that you can't change both traits instantaneously: the 00
<-> 11 rates are forced to be zero, as are the 01 <-> 10 rates. You can
change this assumption. I typically think it makes sense to keep this for
most traits: even if biologically you think 11 is unstable, if you think it
is an intermediate, then I'd keep it. There could be traits where 11 just
can't exist: then I might do a three state model of 00, 01, and 10. But I
agree with Liam; for your questions, these sorts of approaches are the
natural ones to use.

However, if you haven't already, read Maddison & FitzJohn 2015:
http://sysbio.oxfordjournals.org/content/64/1/127.long . It has some
important things to worry about regarding all these tests. My personal rule
of thumb at this point is that if you see traits changing many times on the
tree, it's ok to use methods like Pagel (1994), but still be worried; if
you have few changes, back away slowly and think about other potential
questions. This might change as we get new approaches, but I doubt we'll
ever get to the point where we can make strong conclusions from a few
changes.

Best,
Brian

_______
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info



On Tue, May 3, 2016 at 8:49 PM, Michael Foisy <
michael.fo...@mail.utoronto.ca> wrote:

>
> Hi Liam,
>
> Thank you for your quick reply!  Been playing around with
> fitPagel{phytools} and corDISC{corHMM}.  That makes sense;  state-dependent
> model, so just look at the transition rates between characters!
>
> This might be a silly question, but I'll ask it anyways.  For the traits
> that I'm looking at, it doesn't make biological sense to keep Trait A, if
> you gain Trait B.  Given this, one might expect the distribution of traits
> at the tips to be A or B, but rarely both A and B.  And so, what I'm really
> interested in is something that might look (in reality) more like (1, 0)
> --> (0, 1), and less like (1, 0) --> (1, 1) --> (0, 1), because the middle
> state is not evolutionarily stable.  So, I guess what I'm wondering, is:
> if Trait A facilitates the evolution of Trait B (1,0) -> (1,1), and Trait A
> is always lost quickly after the evolution of Trait B (1,1) -> (0,1), then
> would this be an issue for fitPagel and corDISC models?  I know they
> estimate all these transition rates, so maybe I'm just overthinking this.
> Or, might it be better to treat the two characters as one, multistate
> character? (they, along with trait C, serve the same function; they're just
> phenotypically distinct).  J!
>  ust want to make sure I'm using the right models for the traits I'm
> looking at!
>
>
> I hope that was clear... Thank you very  much!
> , Michael Foisy
>
> MSc. Student
> University of Toronto
> michael.fo...@mail.utoronto.ca
> 
> From: Liam J. Revell <liam.rev...@umb.edu>
> Sent: Tuesday, May 3, 2016 8:11:39 PM
> To: Michael Foisy; r-sig-phylo@r-project.org
> Subject: Re: [R-sig-phylo] [function like ratebystate, but for discrete
> characters?]
>
> Hi Michael.
>
> This is essentially the method of Pagel (1994). Though often interpreted
> as a test for correlated evolution of two binary characters, the model
> that is actually being fit is one in which the rate of transition 0->1
> (or vice versa) in character A is affected by the state of binary
> character B; or, conversely, if the rate of transition 0->1 (or vice
> versa) in character B is affected by the state of character A.
>
> This is implemented in fitPagel of phytools and I believe that this &
> other more sophisticated flavor are available as part of the package
> corHMM.
>
> All the best, Liam
>
> Liam J. Revell, Associate Professor of Biology
> University of Massachusetts Boston
> web: http://faculty.umb.edu/liam.revell/
> email: liam.rev...@umb.edu
> blog: http://blog.phytools.org
>
> On 5/3/2016 8:00 PM, Michael Foisy wrote:
> > Hello,
> >
> > Quick question.  I'm interested in whether the presence/absence of
> character 'A' affects the rate of evolution in character 'B'.  Is there a
> function that will do this for two or more discrete characters?
> >
> > I've seen ratebystate {phytools} for continuous data; but my data are
> all binary!
> >
> >
> > Thank you!
> > , Michael Foisy
> >
> > MSc. Student
> > University of Toronto
> > michael.fo...@mail.utoronto.ca
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __

Re: [R-sig-phylo] estimating the evolutionary rate of a continous trait

2016-04-15 Thread Brian O'Meara
Hi, Belinda.

One thing to watch is over-intepretation: BM is consistent with drift, OU
is consistent with selection, but various kinds of selection can also lead
to BM. [1]. A lot of the people involved in these methods (including me)
are guilty of sloppiness in language in this area, leading to confusion.

Also, another issue could be branch lengths: are they proportional to time
(or, arguably, something like number of generations)? All these methods are
basically stretching the tree in various ways (and, for OU, adjusting
expected means), so if branch lengths are arbitrary, so are the results. I
ask due to the lambda fit. Other things that can cause this are
unincorporated measurement error (b/c it looks like a lot of evolution
right at the tips, meaning little evolution along the branches). I'd
suggest incorporating this (you can give known estimates of measurement
error; some programs allow this to be estimated as well (I forget whether
we have the estimation bit exposed in OUwie, but known error should be
there at least)).

Surface can estimate different regimes on the tree, but IIRC it does not
estimate different rates, just OU means. I believe auteur (in geiger) and
BAMM can model different BM rates on different branches, which sounds like
your question. We have some code to try different OU models (including
perhaps ones with different rates) on different branches buried somewhere
in OUwie (we call it OUwie-dredge so people know to be cautious) but it
hasn't been tested or published yet.

One other caution: if you have a BM model, and are just modeling different
rates, it can be done well. Trying to estimate different rates with OU
models is harder to do well: there's interaction between alpha and sigma
that can make them difficult to distinguish (not formally nonidentifiable
[unlike OU means sometimes], just difficult). For me, I'd probably lean
towards one of the BM rate heterogeneity methods for your question.

Hope this helps,
Brian


[1] Hansen, T. F. and E. P. Martins. 1996. Translating between
microevolutionary process and macroevolutionary patterns: the correlation
structure of interspecific data. Evolution 50:1404-1417.

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info



On Fri, Apr 15, 2016 at 10:20 AM, Belinda Kahnt <belind...@gmx.de> wrote:

> Dear all,
> I am a newbie to this mailing list and phylogenetic analyses in R and hope
> you can comment on a question I have. I would like to estimate the rate of
> evolution of a continous trait and check if the trait evolves faster along
> some branches of the topology. I already checked with Pagels lambda if the
> trait evolves according to a Brownian motion model,(i.e. pure drift), which
> it did not (lambda ~ 0). Modelling the trait evolution under the
> Ornstein-Uhlenbeck (OU) model provided a significant better fit (p<- 0.01)
> with a very high alpha parameter of 8.7 (indicating a role of selection in
> trait evolution). In order to infer now the rate of trait evolution I am
> searching for a R package that is able to do this inference based on the OU
> model. However, so far I either just found methods based on the Brownian
> motion model (e.g. brownie.lite function in phytools, Motmot) or methods
> that also rely on the reconstruction of ancestral selective regimes
> beforehand (like mvMORPH or OUwie). I want to keep the model as simple as
> possible and therefore don't know if it is necessary to also model
> ancestral selective regimes if I just want to know if the evolutionary rate
> of the trait varies along branches and which branches show an accelerated
> rate?! Which programm would you recommend? I would be very thankful for
> your help and recommendations.
> Best wishes,
> Belinda
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Accounting for phylogeny in binary predictor, binary response data

2016-02-10 Thread Brian O'Meara
There's also corDISC in corHMM (disclosure: I'm a coauthor on the package).
It does Pagel 1994 with two or three binary characters, allows for user set
root states (I believe the canonical Pagel (1994) assumes equal freq at the
root, I don't know if phytools uses this same assumption [a lot of programs
assume equilibrium freq]), and allows additional control over the rate
matrix. However, it requires you to specify what sort of rate matrix you
want (has equal rates, all rate different, etc., but you would have to
specify the dependent model matrix): the phytools code seems to loop over
the named Pagel matrices, which are a subset of the possible matrices. So,
I'd summarize it as phytools being more turnkey, corHMM having more control
but less easy to use.

Oh, and in either case, read this and be afraid before starting analyses:
Maddison & FitzJohn, 2014: http://sysbio.oxfordjournals.org/content/64/1/127
.

Best,
Brian

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Feb 10, 2016 at 12:41 PM, Heath Blackmon <coleo...@gmail.com> wrote:

> Grace,
>
> The package phytools includes a function fitPagel if that is the method
> that you want to use.  He has a post on his blog that discusses his
> implementation so it won't be a black box for you:
>
>
> http://blog.phytools.org/2014/12/r-function-for-pagels-1994-correlation.html
>
> cheers
>
> Heath
>
> On Wed, Feb 10, 2016 at 11:14 AM, Joe Felsenstein <j...@gs.washington.edu>
> wrote:
>
> > I do not know offhand whether there is an R implementation, but how about
> > Mark Pagel's 1994 method for testing whether two 0/1 characters changing
> > along a ohylogeny are changing independently?
> >
> > J.F.
> > -
> > j...@gs.washington.edu
> > Joe Felsenstein, Department of Genome Sciences and Department of Biology
> > Box 355065, University of Washington, Seattle, WA  98195-5065
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> > http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >
>
>
>
> --
> *Heath Blackmon, Ph.D.*
>
> *Postdoctoral Researcher*
> *University of Minnesota*
> *coleoguy.github.io <http://coleoguy.github.io>*
> *@coleoguy* <https://twitter.com/coleoguy>
> *Phone 682-444-0538*
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Fitting Brownian evolution with bounds

2015-11-30 Thread Brian O'Meara
There's alpha code to do this in TreEvo, but we're still readying this for
publication (and it's now supported by an NSF grant, so expect much more
development), and so I wouldn't advise using it yet. I don't know of
anything else at the moment.

Best,
Brian


___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Nov 30, 2015 at 4:18 AM, saurabh <saurab...@ncbs.res.in> wrote:

> Hello,
>
> does anyone know if its possible and how to fit Brownian evolution with
> either reflecting or absorbing boundaries?
>
> And many thanks for creating this excellent resource!
> Saurabh
>
> --
> Saurabh Mahajan
> Research Scholar/Graduate Student
> Deepa Agashe's Lab
> National Center for Biological Sciences
> Bengaluru 560065
> INDIA
>
> Ph: +91 80 2366 6504
> Mobile: +918050001886
> Email: saurabh...@gmail.com
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] simulate Ne changes on a given phylogeny

2015-10-17 Thread Brian O'Meara
Dick Hudson's ms software can simulate gene trees along a species tree or
network with migration, changing population size, etc. The package phyclust
can call ms. You could then just simulate nucleotides on these gene trees.

Best,
Brian

___
Brian O'Meara
Associate Professor
Dept. of Ecology & Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Sat, Oct 17, 2015 at 10:12 PM, Jacob Berv <jakeberv.r.sig.ph...@gmail.com
> wrote:

> Dear R-sig-phylo,
>
> I have a somewhat general simulation question and I was hoping someone on
> here might have some insight.
>
> I’m trying to figure out if it’s possible to simulate nucleotide sequence
> data (an arbitrary number of neutral loci under a multi species coalescent
> model), on an ultrametric input topology (where tips represent species),
> with user defined changes in effective population size at the start and end
> of a particular internal branch. In my searching I’ve come across some
> software by Deren Eaton (https://github.com/dereneaton/simLoci <
> https://github.com/dereneaton/simLoci>) that looks like it might do what
> I want - but I’m not sure. It looks like I can specify migration events
> between taxa, but perhaps not population size changes on internal branches.
> There are many other applications for simulating sequence data but I am not
> familiar with any of them. Any thoughts would be appreciated!
>
> Thanks,
>
> Jacob Berv
>
> Ph.D. Student
> Lovette Lab
> Cornell Laboratory of Ornithology
>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Ancestral Reconstruction with Polymorphic States

2015-07-06 Thread Brian O'Meara
Well, if you want to have polymorphic ancestors, one way would be to treat
it as an ordered character: 0 - 0/1 - 1, so that any movements from 0
to 1 have to go through a polymorphic ancestor. This would entail fixing
the rate matrix to have the 0-1 rates forced to be zero (which I *think*
is allowed by phytools). Then you could use make.simmap() or similar. If
you just want to have ancestors be 0 or 1 only, but treat 01 as a mixture
of two states at the tips, you could use corHMM; note, though, that it'll
give you estimated states at nodes but not a full stochastic mapping.
Another possibility for treating them as polymorphic, and allowing
polymorphic ancestors, is to use a biogeographic model (like in
biogeoBEARS). It'll require some futzing with the matrices, though, to make
all the changes happen along branches rather than at nodes -- unless you
think that's how speciation happens (i.e., a polymorphic species with 01
speciates by separating into a 0 monomorphic species and a 1 monomorphic
species).

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Jul 6, 2015 at 1:19 PM, Jackson, Laura Marie 
laura.jack...@coyotes.usd.edu wrote:

 Hi all,

 Is it possible to generate an ancestral reconstruction (i.e., make.simmap
 in {phytools} using data with polymorphic states (01)? It seems that
 {phytools} thinks that 01 is a third state, which would be incorrect in
 the final mapping. When I try this on an example file and tree, it does not
 accurately color the branches to represent the polymorphic states. Any help
 with this would be greatly appreciated!


  simmap - make.simmap(tree, matrix, nsim=100)
 make.simmap is sampling character histories conditioned on the transition
 matrix
 Q =
 0   0/1 1
 0   -3.715682  2.376705  1.338976
 01  2.376705 -2.376705  0.00
 11.338976  0.00 -1.338976
 (estimated using likelihood);
 and (mean) root node prior probabilities
 pi =
 0   0/1 1
 0.333 0.333 0.333
 Done.


 ---
 Laura M. Jackson
 PhD Student
 Department of Biology
 185 Churchill Haines
 University of South Dakota
 414 E. Clark St.
 Vermillion SD 57069-1746
 Email: laura.jack...@usd.edumailto:laura.jack...@usd.edu



 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] corDISC binary coding problems

2015-07-03 Thread Brian O'Meara
Hi, Franz. Sorry for the delay in answering. Much as I [selfishly] like
people to use corDISC, I'm not sure if it's appropriate to your question.
It sounds like you have some (unmentioned) binary state, let's call it foo,
that you're trying to correlate with a proportion, and so you convert the
proportion into two binary states -- is that correct? And you can't observe
00? Having an unobserved state can be problematic for rate estimates (the
MLE for 00-01 or 00-10 will be INF, the MLE for 10-00 and 01-00 will be
0). Why not use something like Ives and Garland (2010) to allow correlation
between a binary dependent variable and one or more independent variables
(implemented in packages such as phylolm)? If you do want do do a Pagel 94
like test, you could discretize the proportion as a multistate character:
i.e., 0, 1, 2, 3, 4 = 0:100 gymno:angio, 25:75 g:a, 50:50 g:a, 75:25 g:a,
100:0 g:a. You would then have 10 overall states: foo=0 with proportion 0,
foo 0 w prop 1, foo 0 w prop 2, ... foo 1 w prop 5. Convert this into a
rate matrix and constrain it (i.e., if you want to use the traditional
Pagel 94 assumption of not instantaneous changes in two char at once, set
f0p0 - f1p5 to have a rate of zero, etc.; also, do a limited set of
rates). If it were me, though, I'd do Ives and Garland or similar (GEE,
PGLS, etc.: others might weigh in on the most appropriate way to correlate
binary with proportion).

As to whether the different thresholds matter, it would depend (partially)
on how sensitive the coding is. If all the proportions are 0-10% or
90-100%, a range of cutoffs wouldn't matter. If they're more evenly
distributed, you're changing the data a lot, so it should cause different
results. It's great you're checking sensitivity to thresholds. If it were
me, I'd probably focus on the threshold I thought was most biologically
meaningful, but also say (in the pub, not bury in the supplement) that
results were very sensitive to the threshold used, and describe the range
of results that come from changing the threshold. It's possible that the
values you get back vary, but the main result is the same: maybe foo 1 is
always positively correlated with gymnosperm substrates, though the
magnitude of this varies with the threshold, for example. You're right that
you can't model average over different data, and the threshold is
effectively changing your data.

corDISC should be generally fine for uneven state frequencies, other than
the caveat that rate estimates can get very extreme if some states are very
low frequency under some models (if 00 is missing, but all rates are equal,
the rates will probably be ok, but in a model with free states, the rates
will be extreme). It will be less fine for biased sampling: that is, if
state 1 and state 0 are occur in frequency A:B in nature, but you have
happened to sample very different C:D frequency, you'd get biased
estimates. corHMM doesn't have a way to deal with this. You could simulate
under your true frequency, subsample down to the frequency in your data,
and run the analyses to get an idea of how your results might be affected.
If this were a diversification question both diversitree and hisse can deal
with uneven sampling that you specify; diversitree does have trait only
models, but I'm not sure if they incorporate sampling frequency.

Hope this helps.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Jun 26, 2015 at 3:45 AM, f.k...@mailbox.org f.k...@mailbox.org
wrote:

 Hello,

 I have used the Pagels 94’ extended version in corDISC with 3 binary
 traits and 1300 species.
 To be able to use it, I had to code two binary character out of a
 continuous.
 My continuous variable is substrate proportion data. So 100% is gymnosperm
 exclusivity, 0% is gymnosperm exclusivity.
 I made two binary character out of it, using different thresholds 10%,
 20%, 30%. 40%,  50%.
 The resulting two character were XY, where X= angiosperm occurrence, Y=
 gymnosperm occurrence. Therefore a 11 would be occurence on both and e.g. 10
 is occurrence primarily on a given substrate, where “primarily” is defined
 by the threshold. The third binary doesn’t matter in this case.


 So my first question is: Is it valid to have a state that is coded by two
 binaries, like 11 that means “occurrence on both” because the meaning of
 the coding depends on both binaries and is therefore not “independent”? I
 have a bad feeling about coding that has a “meaning together”...


 The other question is about the results. I ran corDISC with all 5
 different thresholds and the results are super different. Almost all of
 them range between 10 and 100 (100 was the upper bound).

So the general question is: Is the method not reliable for different
 codings

Re: [R-sig-phylo] Missing States in ML Reconstruction of Discrete Ancestral States (phytools, ape)

2015-06-17 Thread Brian O'Meara
corHMM should be able to do this as long as the states are defined in the
transition matrix (rate.mat.maker() to start, then modify). BioGeoBEARS
could do this (since it has to for biogeography: not all combos of
ancestral areas are observed at the tips), and one of its models might be
equivalent to what you want. I imagine diversitree can, too.

Of course, this all might work best with a model with somewhat constrained
rates. If rates are all free, and the model is unordered, the MLE for rate
of moving into the unseen state will be zero, and rate for moving out of it
will be Inf.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Jun 17, 2015 at 3:06 PM, David Bapst dwba...@gmail.com wrote:

 Liam,

 Ah, I see, rerootingMethod(), unlike ace, will accept a matrix
 representation of discrete trait values.

 However, it doesn't appear that one can define a model matrix still?
 Outside of molecular data, the missing state issue is often for
 ordered data, hence the need to define a model matrix.

 
 library(ape)
 library(phytools)
 data(bird.orders)

 #make a matrix for a 4 step trait
 model-matrix(0,4,4)
 for(i in 1:4){for(j in 1:4){
 if(abs(i-j)2){
 model[i,j]-0.1
 }
 }}
 rownames(model)-colnames(model)-0:3

 #simulate data where you have only observed endmembers
 trait - factor(c(rep(0, 5), rep(3, 18)))
 names(trait)-bird.orders$tip.label
 trait-to.matrix(trait,seq=0:3)

 rerootingMethod(tree=bird.orders, x=trait)
 #works
 rerootingMethod(tree=bird.orders, x=trait, model=model)
 #fails

 #

 Cheers,
 -Dave

 On Wed, Jun 17, 2015 at 12:06 PM, Liam J. Revell liam.rev...@umb.edu
 wrote:
  Hi David.
 
  Actually, if I understand correctly, this does work in phytools. I
 suspect
  it can be set up in phangorn as well - Klaus can probably explain how.
 
  So, if I understand correctly, you want to get marginal ancestral states
 for
  a character that can assume, say, states A, C, G,  T, but for
 which
  you have only observed states A, C,  G (for instance) in your tip
  taxa?
 
  The way to do this in rerootingMethod is as follows:
 
  library(phytools)
  ## simulate some data to work with:
  tree-pbtree(n=26,tip.label=letters,scale=1)
  Q-matrix(c(-2,1,1,1,-2,1,1,1,-2),3,3)
  rownames(Q)-colnames(Q)-c(A,C,G)
  x-sim.history(tree,Q)$states
  x ## our data
  ## now convert to a matrix representation:
  X-to.matrix(x,seq=c(A,C,G,T))
  ## reconstruct ancestral states
  fitER-rerootingMethod(tree,X,model=ER)
  fitER
 
  Note that as we have not observed any evidence that changes to T occur
 for
  this character, if we use a symmetric model instead, we will find that
 the
  marginal likelihoods at all internal nodes for state T are zero. This
 is
  because the maximum likelihood q[i,T]  q[T,i] for all i will be zero
  (sensibly). For instance:
 
  fitSYM-rerootingMethod(tree,X,model=SYM)
  fitSYM
 
  Let me know if this is what you had in mind.
 
  rerootingMethod just uses modified code from ace internally to do the
  calculations - so if ace does not permit this presently, it could easily
 be
  modified to allow it. phangorn too must allow this because for many
  nucleotide sites we want to reconstruct ancestral states - but we will
 have
  observed only a subset of the four nucleotides at that site.
 
  All the best, Liam
 
  Liam J. Revell, Assistant Professor of Biology
  University of Massachusetts Boston
  web: http://faculty.umb.edu/liam.revell/
  email: liam.rev...@umb.edu
  blog: http://blog.phytools.org
 
 
  On 6/17/2015 1:46 PM, David Bapst wrote:
 
  Hello all,
 
  (I'm a troublemaker today.)
 
  Sometimes, in ordered discrete data, there are states we know might
  exist as intermediary between observed states but aren't observed
  themselves. I suppose this is probably common for meristic data. At
  least to me, it seems like it should be possible to reconstruct node
  states in a scenario with 'missing' states in a ML approach by
  defining the corre
 
  Anyway, in a recent conversation with a friend, he noted that he
  couldn't get the function for ML ancestral reconstruction of discrete
  characters, specifically ace (in ape) or rerootingMethod (in phytools)
  to accept data with missing states. I was a little shocked by this,
  and I didn't believe him in my foolhardiness and investigated myself.
 
  #
 
  library(ape)
  data(bird.orders)
 
  #make an ordered matrix for a 4 state trait
  model-matrix(0,4,4)
  for(i in 1:4){for(j in 1:4){
   if(abs(i-j)2){
   model[i,j]-0.1
   }
   }}
  rownames(model)-colnames(model)-0:3
 
  #simulate data where you have only observed endmembers
  trait - factor(c(rep(0, 5), rep(3, 18

Re: [R-sig-phylo] Bug in reorder.phylo() (related to cleaning phylo objects)

2015-06-15 Thread Brian O'Meara
A kludgy solution to achieve Dave's original use case of fixing
not-quite-right phylo objects is to do:

phy - read.tree(text=write.tree(phy))

It converts the tree to Newick and reads it back in.

It's not as good a solution as using a validator (great addition, Emanuel!)
to find and fix the general problem, but it can help in a pinch.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Jun 15, 2015 at 11:29 AM, Emmanuel Paradis emmanuel.para...@ird.fr
wrote:

 Hi David,

 collapse.singles() seems to work correctly if the phylo object is
 correctly conformed. Some features of this class may seem annoying (and
 Klaus and I alredy discussed about this), but these help for other things.
 As a reminder the class phylo is defined here:

 http://ape-package.ird.fr/misc/FormatTreeR_24Oct2012.pdf

 Recently, I put a function on github to help code writters:

 https://github.com/emmanuelparadis/checkValidPhylo

 This could help you to detect problems that would be tough to find
 otherwise.

 Cheers,

 Emmanuel


 Le 12/06/2015 22:41, David Bapst a écrit :

 Hi Klaus (and others),

 Ah, I see! The real bug then appears to be in collapse.singles, as it
 does not reorder the ID numbers in $edge. Here's a quick function to
 return the root ID:

 getRootID-function(tree){
  uniqueNode-unique(tree$edge[,1])
  whichRoot-sapply(uniqueNode,function(x)
  (sum(x==tree$edge[,2])==0))
  rootID-uniqueNode[whichRoot]
  return(rootID)
  }

 And if we run it...

  getRootID(tree) #original tree

 [1] 151

 getRootID(tree1)   #after collapse.singles

 [1] 151

 getRootID(tree2)   #after reorder.phylo

 [1] 131

 And we can see that although the number of nodes certainly changed
 when collapse.singles was run, it didn't change the root node's ID.

 So I guess this is really a collapse.singles bug report. Thanks for
 the insight, Klaus!

 -Dave

 On Fri, Jun 12, 2015 at 2:27 PM, Klaus Schliep klaus.schl...@gmail.com
 wrote:

 Hi David,

 I found the bug. Somehow ape assumes on the one hand that the root is
 number
 of tips + 1.
 On the other hand the root of an phylo object is the node which is in
 tree$edge[,1], but not in tree$edge[,2],
 this is the case even in an unrooted tree. This annoys me for a long
 time.

 The root of tree1 with the definition above is actually node 151 and not
 131. Changing these solves your problem:
 Here is a small function to do this for you:

 switch.nodes- function(tree, a, b){
  ntips = length(tree$tip.label)
  tree$edge[tree$edge==a] = 0L
  tree$edge[tree$edge==b] = as.integer(a)
  tree$edge[tree$edge==0L] = as.integer(b)
  if(!is.null(tree$node.label)){
  tmp-tree$node.label[a-ntips]
  tree$node.label[a-ntips] = tree$node.label[b-ntips]
  tree$node.label[b-ntips] = tmp
  }
  tree
 }

 library(phangorn)
 attr(tree1, order) = NULL
 getRoot(tree1)

 tree2 = switch.nodes(tree1, 131, 151)
 plot(tree2) # now works for me

 Cheers and have a nice weekend,
 Klaus







 On Fri, Jun 12, 2015 at 2:53 PM, David Bapst dwba...@gmail.com wrote:


 Hello all,

 As those of you who directly manipulate the guts of phylo objects in
 your code (or construct new phylo objects whole cloth from
 un-phylo-like data structures) have probably experienced, it is
 sometimes easy to create $edge matrices that aren't accepted by ape
 functions (I often use plot.phylo as my litmus for this).

 When this occurs, various things can be done; I usually do the
 following sequence:

 collapse.singles()
 reorder.phylo(,cladewise)
 read.tree(text=write.tree(,file=NULL))

 ...or something along those lines.

 However, I've run into an issue where reorder.phylo() returns what is
 essentially a scrap of the original $edge matrix, without warning.
 Very worrisome! I'm using R version 3.2.0 and ape 3.3. Here's, my
 script, including a call to a saved object on my website, so that the
 error can be reproduced:

 (Why do I have an .Rdata object listed with a .txt extension, you
 might wonder? Because my school apparently sanitizes file extensions
 on our hosted websites that it thinks are executables, archives or
 doesn't recognize, but doesn't bother to check file innards when it
 does recognize the extensions. Its easily hacked, at least.)

 #

 library(ape)

 load(url(http://webpages.sdsmt.edu/~dbapst/weirdTree_06-12-15.txt;))

 #can I plot it?
 plot(tree)
 #nope

 tree1-collapse.singles(tree)

 #any single-child nodes left?
 sum(sapply(unique(tree1$edge[,1]),function(x)
 sum(x==tree1$edge[,1])==1))
 #nope

 #can I plot it?
 plot(tree1)
 #nope

 #now reorder
 tree2-reorder.phylo(tree1,cladewise)
 tree2$edge

 #now reorder with postorder
 tree2-reorder.phylo(tree1,postorder)
 tree2$edge

Re: [R-sig-phylo] clade-specific release and radiate model?

2015-05-29 Thread Brian O'Meara
Hi, Daniel. It's a bit arguable whether as alpha - 0, OU - BM: I think it
should, but IIRC in OUCH this doesn't happen, and that's a deliberate
choice. That said, I think that an OU with alpha near zero would be ok for
your question, though you might want to think about how to penalize
parameters (that is, for that clade there'd be an OU mean parameter that is
unidentifiable (alpha of zero, so no pull, so no evidence for it): should
you count this as a parameter when doing model comparison? I'd argue no:
you're doing OU with alpha of zero as a kludgy hack to treat it as BM).

Another approach would be to resuscitate the censored model of O'Meara et
al. 2006. Slice your tree on the edge leading to the released clade (I
guess this truly releases the clade to roam free of its relatives) so you
have the paraphyletic non-released set and the released clade as separate
trees. Then you can try fitting the same or different models to the two
trees. The downside of this is that you must use an additional ancestral
state (at the MRCA of the released clade); the upside is that any weird
changes happening on the edge leading to the released clade aren't in the
model and so don't affect the fit (you could imagine that whatever led the
clade to be ecologically released happened somewhere on the stem edge, but
you don't know where, and it could be associated with a major single shift
in your continuous trait, too). You could try an OU on the non-released
tree (let's call this A) and an OU on the released clade (B), OU on A and
BM on B, etc. The only practical difficulty with this is constraining the
cases where A and B are supposed to have the same model: by default,
optimization will happen separately in different trees, but you can create
a wrapper function that calls OUwie.fixed() separately on A and B but with
the same parameters and adds the likelihood and then optimize the
parameters in this wrapper function.

Hope this helps,
Brian


On Fri, May 29, 2015 at 2:02 AM, Daniel Fulop dfulop@gmail.com wrote:

 Hi All,

 Do you know if there are any methods out there for fitting ecological
 release and release and radiate models that are clade-specific?  That is,
 in which the change in mode (OU to BM) happens at the root of a clade
 instead of at specific time for the whole phylogeny (as in Slater 2013).

 As far as I know the closest out there are models in OUwie, say with a
 clade-specific second OU process with a very low alpha.  However, I don't
 think that biologically OU is a good model for the trait and clade in
 question (though it is for the rest of the tree).  I know that at the limit
 as alpha - 0 OU converts to BM, but I would ideally still like compare
 standard models' fits (including OUwie models) to the fits of
 clade-specific release models.

 Any leads or suggestions would be much appreciated, especially about how
 to implement these clade-specific models with current tools or about how to
 roll my own.

 Thanks!
 Dan.

 --
 Daniel Fulop, Ph.D.
 Postdoctoral Scholar
 Dept. Plant Biology, UC Davis
 Maloof Lab, Rm. 2220
 Life Sciences Addition, One Shields Ave.
 Davis, CA 95616

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] ancestor vs. change plots

2015-04-24 Thread Brian O'Meara
In Paleo, you can (well, arguably) see the ancestor and descendent pair, so
you can assess the amount of change and the ancestral state. Using just
info at the tips, you are inferring from N taxa N-1 ancestral states and
2N-2 rates of change. That seems a lot to ask from data, especially since
the rates and states affect each other and so aren't independent. You're
also using a model that assumes an even rate of change over the tree, so an
ancestor will tend to be between its descendants (exactly between, if you
have a two taxon tree with coeval taxa). You could try more exotic models
that allow multiple Brownian rates or even multiple Ornstein-Uhlenbeck
rates, but there's a limit there, as well. There are Bayesian approaches
that can paint a smear (estimated from post-burnin samples) of rates over
multiple branches, but I'd be careful interpreting them as robust estimates
of rate that are independent of the states.

If you have separate traits, you could ask does trait X lead to higher
rates (or OU mean, or OU alpha) of evolution in trait Y. Basically this
involves painting a tree with discrete trait X (generalist vs specialist
predator, say) and estimating different rates for the other trait (i.e.,
mouth volume) on the branches painted in different states (and seeing if
these rates are biologically and statistically significantly different).

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Apr 24, 2015 at 3:14 PM, Milton Tan mzt0...@tigermail.auburn.edu
wrote:

 Hello all,

 I have a question that is perhaps esoteric, since it's on a method I don't
 see used often. I am looking at the dynamics of body size evolution, and
 have come upon ancestor-vs-change plots described in Alroy 2000
 (Understanding the dynamics of evolutionary trends, Paleobiology). This
 is interesting because it will allow me to see if rate of body size change
 depends on body size. I haven't seen this method widely used, so for anyone
 unaware how this works: for each branch, you plot the ancestral state vs.
 the amount/rate of change along the branch. In theory, by looking at a
 scatterplot of ancestral size vs. change, I can answer questions such as
 Are smaller taxa more likely to evolve to a larger size? I don't see too
 many people using this method, so I thought I'd ask why. Is there a
 particular reason for it not being used? Are there more powerful methods to
 answer this same question that have supplanted it?

 I also have a more specific question, assuming that ancestor-vs-change
 plots are valid. For my dataset, I have reconstructed ancestral states for
 my clade from tip data and pic, and then plotted ancestral states versus
 the rate of change along every branch (attached). While there are a couple
 outliers, you can see the distribution roughly hangs around 0 along the
 entire graph, but I'd like to fit a line so I can present if the slope
 significantly differs from 0 (ie. body size evolutionary rate truly does
 not depend on body size of ancestors). Alroy (2000) describes that the
 shape of the plot can be informative on different dynamics of evolutionary
 change, implying fitting of linear and polynomial lines, but doesn't really
 discuss how to test this statistically. Alroy seems to use this method and
 fits a polynomial line in Alroy 1998 (Cope's Rule and the Dynamics of Body
 Mass Evolution in North American Fossil Mammals, Science), but he doesn't
 really describe his statistics !
  in depth.

 So my question is how can I fit a line to this data? Problematically,
 since each branch has a sister branch that shares the same ancestral node,
 each ancestral state is represented twice. I think this makes the
 observations non-independent. If they are, this makes me think that a
 linear regression is inappropriate, but I'm not sure. I could average the
 amount of change along each branch for every ancestral node, but I'm not
 sure if that's the best way. Does anyone have any insight on an appropriate
 way to determine a best fit line statistically?

 Thanks in advance,

 Milton Tan
 Auburn University
 Department of Biological Sciences
 PhD Candidate

 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Unsual values of alpha in OU models

2015-03-27 Thread Brian O'Meara
You can select between models using AICc, but then you can look at the
reliability, more robustly than using eigenvalues, by using OUwie.boot() to
do parametric bootstrapping. Another very useful thing done by
Hansen, Bartoszek, and colleagues is to do a contour plot around the
maximum likelihood estimate to see how much a parameter like alpha can vary
without changing the likelihood much. As Aaron indicated, this can be a
horrifyingly large region. It's especially tough with OUMA and OUMVA where
there are multiple alphas.

One thing to consider is what kind of model is most likely to give you
results you trust and that are relevant for your question. It could be that
you care about OUM-type models, but don't really care if it's OUMA, OUM, or
OUMVA. You should certainly report that, say, OUMVA is the best fitting
under AICc, but perhaps if your biological question is about the optima
only you focus on OUM due to OUMVA seeming to have issues estimating any
parameters well (though of course also indicate if the results under OUMVA
generally agree or disagree with the conclusions from OUM -- you need to
show that it's not that OUMVA fit better but gave you an answer you didn't
like, so you cherry picked a different model to get the result you want).

Hope this helps,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Fri, Mar 27, 2015 at 12:57 PM, Diego Salazar Tortosa dsala...@ugr.es
wrote:

 Thank you for your answer Aaron.

 Clear to me that this situation is common in the OU models, but I don't
 know what criteria I should use when select between different OU models,
 AICc or reliability of parameters?

 Thanks in advance

 Regards

 Diego Salazar

 2015-03-27 11:08 GMT+00:00 Aaron King kin...@umich.edu:

  In many situations, the OU model parameter alpha is not well identified.
  This has been pointed out before, but it's quite common for the parameter
  values (and not just alpha, though that's typically the worst) to be
 poorly
  identified, even if the model selection is unambiguous.  OU model
  parameters can be well identified under some circumstances (detailed in
 the
  paper) but when they are not, estimates can be *very* unreliable.
 
  On Fri, Mar 27, 2015 at 4:07 AM, Diego Salazar Tortosa dsala...@ugr.es
  wrote:
 
  Hi,
 
  I am trying to analyze whether the evolution of a continuous trait is
  state-dependent of discrete trait with three levels, through OU models.
 My
  phylogeny has 113 species and 112 internal nodes.
 
  Clearly the most parsimonious models according AICc are OUMA and OUMVA
 but
  have alpha values between 2.5 and 4, are these values usual or too high?
 
  Also some eigenvuales obtained with diagn=T  are negative for both
 models,
  althought standard errors are very small. Are these models valid?
 
  Thanks in advance
 
  Diego Salazar
 
  --
  Diego Francisco Salazar Tortosa
  Ph student
  Departamento de Ecología
  Facultad de Ciencias
  Universidad de Granada
  Av. Fuente Nueva s/n
  18071 Granada
  Telefono: +34 958241000 ext 20007
  Movil: +34 634851132
  email: dsala...@ugr.es die...@correo.ugr.es
dftort...@gmail.com
 
  ---
  Este mensaje se dirige exclusivamente a su destinatario y puede
 contener
  información privilegiada o confidencial. Si no es Ud. el destinatario
  indicado, queda notificado de que la utilización, divulgación o copia
 sin
  autorización está prohibida en virtud de la legislación vigente. Si ha
  recibido este mensaje por error, se ruega lo comunique inmediatamente
 por
  esta misma vía y proceda a su destrucción.
 
  This message is intended exclusively for its addressee and may contain
  information that is CONFIDENTIAL and protected by professional
 privilege.
  If you are not the intended recipient you are hereby notified that any
  dissemination, copy or disclosure of this communication is strictly
  prohibited by law. If this message has been received in error, please
  immediately notify us via e-mail and delete it.
  --
 
  [[alternative HTML version deleted]]
 
  ___
  R-sig-phylo mailing list - R-sig-phylo@r-project.org
  https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
  Searchable archive at
  http://www.mail-archive.com/r-sig-phylo@r-project.org/
 
 
 
 
  --
  Aaron A. King, Ph.D.
  Ecology  Evolutionary Biology
  Mathematics
  Center for the Study of Complex Systems
  University of Michigan
  GPG Public Key: 0x15780975
 



 --
 Diego Francisco Salazar Tortosa
 Ph student
 Department of Ecology
 University of Granada
 Av. Fuente Nueva s/n
 18071 Granada
 Telefono: +34 958241000 ext 20007
 Movil

Re: [R-sig-phylo] assign multiple discrete states to single tip

2015-03-26 Thread Brian O'Meara
There's also the rayDISC function in the corHMM package. It can take in
polymorphic data (so you can have some taxa red, some white, some red 
white). Note that along the tree, it assumes that there's one state. If you
wanted ancestral nodes to be validly polymorphic (not just equal chances of
red and white, but a character red+white), you'd need to do as Liam
suggests and create a polymorphic trait and an ordered state matrix. This
is possible in ace() in ape, rayDISC() in corHMM, and probably others.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Mar 25, 2015 at 10:49 AM, Liam J. Revell liam.rev...@umb.edu
wrote:

 Hi Matt.

 What is the purpose of this? To visualize the states on the tree? Or to
 conduct some kind of analysis (modeling evolution, reconstructing ancestral
 states, etc.) on the tree?

 If the former, then you can use tiplabels in the ape package, and supply a
 matrix to the argument pie in which each row is a tip taxon and each column
 is a state. For non-polymorphic species you can just put a 1 in the column
 for the state of that taxon, and zero otherwise; and for polymorphic taxa
 you can put 1/(number of states in that species) - or the state
 frequencies, if you have them - for each state found in that species, and
 zero otherwise.

 If you're interested in, for example, reconstructing ancestral states,
 then it may in fact make the most sense to treat polymorphism as another
 state. In that case, you must decide if you will, for instance, require
 that transitions from A - B first pass through state AB, etc. This kind of
 constraint is fairly straightforward to implement in ape's ace function.

 Let us know  I will see if I can give you any more concrete suggestions.

 All the best, Liam

 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://blog.phytools.org


 On 3/25/2015 7:38 AM, Koski, Matthew H wrote:

 Hi,


 I'd like to assign multiple discrete character states to tips that are
 polymorphic. E.g., color in a given species is polymorphic (pink or white)
 so it is assigned two states for color. Is there a way to do this without
 creating an extra 'polymorphic' category?


 There is a similar question on stackoverflow with no responses yet:

 http://stackoverflow.com/questions/27413070/how-to-
 assign-multiple-states-for-taxa-in-discrete-character-data-set?

 ?

 ?Thanks in advance,

 Matt


 Matthew Koski
 University of Pittsburgh
 Biological Sciences
 Ashman Lab
 koskimatt.wix.com/matthewhkoskihttp://koskimatt.wix.com/matthewhkoski

 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-
 sig-ph...@r-project.org/


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-
 sig-ph...@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] fitDiscrete across multiple datasets

2014-10-27 Thread Brian O'Meara
You can calculate the likelihood of the data under a given transformation:
transform the tree with a delta of 0.3 or whatever, then calculate the
likelihood of the data under a Brownian motion model using this transformed
tree. This is the same likelihood as calculating the likelihood of the data
under a delta model directly (assuming the same delta). However, what you
can do is apply the same transformation to all your datasets and add the
likelihood. This becomes a new likelihood function. You can then optimize
this (optim, nloptr, etc.). I can rig up a working example later tonight --
off to teach now.

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Oct 27, 2014 at 9:34 AM, Luke Matthews 
lmatth...@activatenetworks.net wrote:

 HI Ben,
 I too am curious if anyone has an R answer to this question you pose.  One
 non-R way I can think of is to put the data into MrBayes, which has a
 number of different models available.  You could probably effectively test
 some models not baked in MrBayes by systematically manipulating the
 branchlengths you submit to it.  MrBayes provides various options for how
 tightly coupled the parameter estimates should be across the different
 genes.  It would seem something similar might exist within R but I'm not
 aware of any.
 Best
 Luke

 Luke J. Matthews | Senior Scientific Director | Activate Networks, Inc.

 --

 Message: 2
 Date: Fri, 24 Oct 2014 16:23:58 -0400
 From: Benjamin Furman benjamin.ls.fur...@gmail.com
 To: r-sig-phylo@r-project.org
 Subject: [R-sig-phylo] fitDiscrete across multiple datasets
 Message-ID:
 CACyKK5XATvW36iW_MfAR5x7LZzZjG+AAN+rtY2V7jO5VPBY5=
 g...@mail.gmail.com
 Content-Type: text/plain; charset=UTF-8

 Hello Everyone,

 I have a tree and discrete data (number of gene copies, for many
 genes) and would like to use the fitDiscrete function in geiger, or
 something similar. However, I would like to estimate the parameters given
 all of the datasets, not just with the data for each gene. For instance, if
 I was using the delta model to vary rates across the tree, I would like
 this delta value to reflect some sort of summary value across all datasets.
 Does anyone have an idea as to how this could be accomplished or perhaps
 point me in the right direction?

 Thank you for any guidance,
 Ben


 --
 Benjamin Furman, B.Sc. Specialization
 Ph.D. Candidate, Evans Lab http://benevanslab.wordpress.com
 McMaster University
 Twitter: @Xen_Ben
 Email: benjamin.ls.fur...@gmail.com, furma...@mcmaster.ca
 website: http://benjaminfurman.wordpress.com

 [[alternative HTML version deleted]]



 --

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


 End of R-sig-phylo Digest, Vol 81, Issue 12

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] fitDiscrete across multiple datasets

2014-10-27 Thread Brian O'Meara
Well, a model with different rates (but same tree scaling) for different
characters is nested in a model with same rates (and same tree scaling) for
different characters, so you could do a likelihood ratio test to see which
model is better (though of course AIC et al. are also possible). See the
appendix of this article (http://www.genetics.org/content/177/3/1395.long)
for some discussion of this in the context of comparative methods (though
the method hasn't been used much, due to difficult implementation). Of
course, the field has been been doing partitioned and nonpartitioned models
for tree inference for years, just not for comparative methods as much.

In general, I like Ben's idea to use multiple characters to get at shared
rates; we should see more of this.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Mon, Oct 27, 2014 at 3:16 PM, Liam J. Revell liam.rev...@umb.edu wrote:

 Luke's points are correct. However, if we wanted to estimate a common
 transition matrix across characters we could just substitute optim.pml for
 a phyDat object of type=USER for ace internally; and if we wanted to
 estimate different parameter values across different data columns, we could
 just run fitTransform separately on each column of X and sum the
 log-likelihoods. I don't know whether either of these things (or the main
 exercise) are a good idea, but they are all certainly possible.

 All the best, Liam

 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://blog.phytools.org

 On 10/27/2014 2:57 PM, Benjamin Furman wrote:

 Hello All,

  I would also like to echo a thank you for the responses. They have
 been helpful. As for your summary Luke, I'll have to let the experts
 comment on that.

 Ben

 On Mon, Oct 27, 2014 at 1:49 PM, Luke Matthews
 lmatth...@activatenetworks.net mailto:lmatth...@activatenetworks.net
 wrote:

 Hi all,
 Thanks Brian and Liam for these very cool responses.  It does make
 sense that one should be able to iterate through the characters and
 add the likelihoods, which is elegant and seems appropriate.  It
 seems to me what this process accomplishes is:

 1. Completely couples the estimation of delta across the characters
 - thus we are now picking the value for delta that maximizes the
 likelihood across all characters not allowing any variation in delta
 across them.

 2. Leaves completely uncoupled the transition rate estimates across
 the different characters.  Thus, the different characters are freely
 estimating rates of transition from 2 to 3 gene copies, etc. such
 that the rate for one gene has no influence on the rate for another.

 Do I have that right?  Depending on Ben's hypothesized processes
 those may be reasonable assumptions.  I can also imagine processes
 where somewhat decoupling the delta estimation or somewhat coupling
 the rate estimations would be reasonable.

 Best
 Luke

 -Original Message-
 From: Liam J. Revell [mailto:liam.rev...@umb.edu
 mailto:liam.rev...@umb.edu]
 Sent: Monday, October 27, 2014 1:26 PM
 To: omeara.br...@gmail.com mailto:omeara.br...@gmail.com; Luke
 Matthews
 Cc: r-sig-phylo@r-project.org mailto:r-sig-phylo@r-project.org
 Subject: Re: [R-sig-phylo] fitDiscrete across multiple datasets

 Hi all.

 Here is a link to code that does what Brian suggests:

 http://blog.phytools.org/2014/10/optimizing-tree-
 transformations-for.html

 Note that (as noted in the blog post) I have arbitrarily used ace
 internally to compute the discrete character log-likelihood, because
 it is fast. You could instead change the code in minor ways and use
 fitDiscrete, which is slower but should be more robust.

 The demo is fairly explicit, but let us know if it works or if I
 have made any errors.

 All the best, Liam

 Liam J. Revell, Assistant Professor of Biology University of
 Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu mailto:liam.rev...@umb.edu

 blog: http://blog.phytools.org

 On 10/27/2014 7:54 AM, Brian O'Meara wrote:
   You can calculate the likelihood of the data under a given
 transformation:
   transform the tree with a delta of 0.3 or whatever, then
 calculate the
   likelihood of the data under a Brownian motion model using this
   transformed tree. This is the same likelihood as calculating the
   likelihood of the data under a delta model directly (assuming the
 same
   delta). However, what you can do is apply the same

Re: [R-sig-phylo] tip order in phylogeny - How does it work?

2014-10-08 Thread Brian O'Meara
Look at reorder.phylo() in ape and reorder() in phylo4: your desired tip
order may be there.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: http://nimbios.org/postdocs/
Calendar: http://www.brianomeara.info/calendars/omeara

On Wed, Oct 8, 2014 at 4:45 AM, Juan Antonio Balbuena j.a.balbu...@uv.es
wrote:

  Hello
 I am working on a simulation that requires adding a number of random tips
 to multiple phylogenies:

 treeP - read.nexus(file= my file.tre) # 5K trees used to build a
 consensus tree (treeP is thus a multiPhylo object)
 NT = 30 # number of taxa in phylogeny
 N.dup = 3 # number of tips (duplications) I want to add to the trees
 sites - sort(sample(1:NT, N.dup), decreasing=TRUE) # Random tips to
 duplicate
 dup.labels - treeP[[1]]$tip.label[sites]
 treeP - .uncompressTipLabel(treeP)
 #
 # Add duplicates to branches following a Liam Revell recipe published in
 his blog:
 #
 for (n in 1:length(treeP)) {
   for (i in 1:N.dup ) {
 pos - runif(n=1)*
 treeP[[n]]$edge.length[which(treeP[[n]]$edge[,2]==sites[i])]
 edl - runif(n=1)*
 treeP[[n]]$edge.length[which(treeP[[n]]$edge[,2]==sites[i])]
 treeP[[n]] - bind.tip(treeP[[n]], tip.label=paste(dup.labels[i], 1,
 sep=.),
edge.length=edl, where=sites[i], position=pos)
   }
 }
 #
 treeP- .compressTipLabel(treeP)
 # Check tip labels:
 treeP$tip.label
 $tip.label
  [1] S24   S25   S9S15   S3S21   S12   S17
 S19   S30   S22   S4S16   S27   S10   S23   S1
 S2S14
 [20] S6S8S26   S29   S13   S20   S7S28
 S18   S5S11   S18.1 S20.1 S16.1

 When I plot any tree in treeP, S18.1, S20.1 and S16.1 appear together with
 S18, S20 and S16, as intended. So why are they at the end of the array
 above? This behaviour makes makes my life difficult because it forces me to
 rearrange a number of matrices to make them match this tip label order. I
 would appreciate if a kind soul can explain why it is that way and whether
 is possible to reorder the tip labels, placing each added tip together with
 its respective counterpart?

 Thanks in advance for your help.

 Juan Antonio
 --

 Dr. Juan A. Balbuena
 Cavanilles Institute of Biodiversity and Evolutionary Biology
 University of Valencia
 http://www.uv.es/~balbuena
 P.O. Box 22085
 http://www.uv.es/cophylpaco
 http://www.uv.es/cavanilles/zoomarin/index.htm
 46071 Valencia, Spain
 e-mail: j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543 733
 
 *NOTE!* For shipments by EXPRESS COURIER use the following street address:
 C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.
 

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

[R-sig-phylo] Tree-for-all hackathon

2014-06-10 Thread Brian O'Meara
This might be of interest for various R phylogenetics programmers: there's
currently no package to access this info.

Title
Tree-for-All: A hackathon to access OpenTree’s global phylogeny resources
Content
A global “tree of life” will transform biological research in a broad range
of disciplines from ecology to bioengineering. To help facilitate that
transformation, the OpenTree project [1] now provides online access to
4000 published phylogenies, and a newly generated tree covering more than
2.5 million species.

The next step is to build tools to enable the community to use these
resources.  To meet this aim, OpenTree, Arbor [2] and NESCent’s HIP working
groups [3] are staging a week-long hackathon September 15 to 19 at U.
Michigan, Ann Arbor.  Participants in this “Tree-for-all” will work in
small teams to develop tools that use OpenTree’s web services to extract,
annotate, or add data in ways useful to the community.  Teams also may
focus on testing, expanding and documenting the web services.

How could a global phylogeny be useful in your research or teaching?  What
other data from OpenTree would be valuable?  How could OpenTree web
services be integrated into familiar workflows and analysis tools?   How
could we add to the database of published trees, or enrich it with
annotations?

If you can imagine using these resources, and you have the skills to work
collaboratively to turn those ideas into products (as a coder, or working
side-by-side with coders), we invite you to apply for the hackathon.  The
full call for participation (http://bit.ly/1ioPPMc) provides instructions
for how to apply, and how to share your ideas with potential teammates
(strongly encouraged prior to applying).  Applications are due July 8th.
Travel support is provided.  Women and underrepresented minorities are
especially encouraged to apply.

If you have questions, contact Karen Cranston (karen.crans...@nescent.org,
@kcranstn, OpenTree), Arlin Stoltzfus (ar...@umd.edu, HIP), Julie Allen (
juli...@illinois.edu, HIP), or Luke Harmon (lu...@uidaho.edu, Arbor).


[1] http://www.opentreeoflife.org
[2] http://www.arborworkflows.com/
[3] http://www.evoio.org/wiki/HIP (Hackathons, Interoperability,
Phylogenies)

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Phylogenetic signal, variation in rates and sample size

2014-05-05 Thread Brian O'Meara
For rate estimation between temperate and tropical, I'd suggest using
OUwie, as it's made for exactly this question: map tropical/temperate on
the tree, try the BMS model to estimate the rate of evolution in
temperate and tropical taxa (note that I'm one of the authors, so take this
bias into account). There are other models, too, to allow you to test
things like whether temperate and tropical taxa have different evolutionary
means, and you can compare these models to Brownian motion models,
depending on your question. Note that in theory, these methods should be
robust to random sampling of taxa.

For your first question, you could estimate the amount of phylogenetic
signal, create a much larger tree under some branching process, transform
the branch lengths to match your phylogenetic signal, simulate traits on
this transformed tree, then subsample back and estimate the phylogenetic
signal on the untransformed but pruned simulated tree. Ideally, the
parameter estimate you get from the simulated data matches the true
parameter, suggesting that sampling does not matter for phylogenetic
signal. Frankly, I'd be surprised if random sampling does matter for
signal, though some biased sampling (even from taxonomic bias, of not
separating two populations into different species until they are different
enough) could have an effect.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Mon, May 5, 2014 at 11:33 AM, dga...@huskers.unl.edu 
dga...@huskers.unl.edu wrote:

 Hi Imran,

 I'm by no means an expert on phylogenetic signal with incomplete
 phylogenies but I would check out the GEIGER package, specifically the
 mecca function. I think mecca is set up to do estimates based on
 incomplete tip sampling.

 Cheers,
 -Dan Gates
 PhD candidate
 Smith/Pilson Lab University of Colorado/Nebraska
 http://ec2-54-186-114-109.us-west-2.compute.amazonaws.com:8080/DanielJGates
 
 From: r-sig-phylo-boun...@r-project.org r-sig-phylo-boun...@r-project.org
 on behalf of imran khaliq imrankhal...@hotmail.com
 Sent: Monday, May 5, 2014 5:15 AM
 To: r-sig-phylo@r-project.org
 Subject: [R-sig-phylo] Phylogenetic signal, variation in rates and
 sample size

 Dear All,
 I am trying to understand the effects of incomplete phylogenies on the
 phylogenetic signal. I am using different trees and I have at least 100
 species in each tree but still these are only 2 % of total species. So how
 can I be sure that results are generalize enough to apply on the whole
 group? Secondly, if I want to compare varaition in trait's rates in two
 groups e.g. tropical and temperate, what methods are appropriate to
 estiamte the evolutionary rates. I tried Brownian rate parameter estimation
 by tranforming the branch lengths according to the lambda values of trait
 in each group.  It gives me lower varaince for the group that has higher
 lambda value and that is confusing. In fact with the same trait values, if
 I transform the branch lengths of the tree with different lamda values it
 gives me higher varaince for higher lambda value.
 Can anyone help me?

 Best regards,
 Imran

 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/
 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Fit BM, OU (1 theta), OU (2 theta) with the same function

2014-02-11 Thread Brian O'Meara
There are at least three packages to do this: OUCH, SLOUCH, and OUwie. OUCH
and OUwie are on CRAN; for SLOUCH, go to
http://freshpond.org/software/SLOUCH/ .

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Jan. 1, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Feb 11, 2014 at 3:05 PM, Santiago Sánchez santiago.snc...@gmail.com
 wrote:

 Hello list,

 Does someone know if there is a function out there that can fit (similar to
 fitContinuous() in geiger and compar.ou() in ape) OU models with multiple
 optima (theta) as well as a BM model.

 I'm not sure if fitContinuous() allows OU with multiple theta, and, as far
 as I know there is no compar.bm in ape.

 Any help is greatly appreciated.

 Cheers,
 Santiago

 --
 Santiago Sánchez-Ramírez
 Department of Ecology and Evolutionary Biology, University of Toronto
 Department of Natural History (Mycology), Royal Ontario Museum
 100 Queen's Park
 Toronto, ON
 M5S 2C6
 Canada
 Other email: santiago.sanc...@mail.utoronto.ca
 Tel. 416-586-8025

 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Fit BM, OU (1 theta), OU (2 theta) with the same function

2014-02-11 Thread Brian O'Meara
Make sure you're using version 1.39. Version 1.38 introduced this error; it
was fixed about a week later (just a few days ago), but that version may
not have made it out to all the cran mirrors yet.
http://cran.rstudio.comis the mirror I've been directing people to
most recently; it's updated
frequently, plus for developers it lets you see how often your packages are
downloaded.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Jan. 1, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Feb 11, 2014 at 9:09 PM, Santiago Sánchez santiago.snc...@gmail.com
 wrote:

 Hi Brian,

 Thank you for pointing these out. I've been playing around with OUwie and I
 ran into an error which I couldn't solve. This comes out when fitting any
 OU model implemented in OUwie, and it doesn't happen with BM models (they
 run fine):

 test - OUwie(phy, data, model=OU1)
 Initializing...
 Finished. Begin thorough search...
 Finished. Summarizing results.
 Error in svd(m) : infinite or missing values in 'x'

 I would appreciate if you could give me some suggestions.

 Cheers,
 Santiago





 2014-02-11 15:09 GMT-05:00 Brian O'Meara bome...@utk.edu:

  There are at least three packages to do this: OUCH, SLOUCH, and OUwie.
  OUCH and OUwie are on CRAN; for SLOUCH, go to
  http://freshpond.org/software/SLOUCH/ .
 
  Best,
  Brian
 
  ___
  Brian O'Meara
  Assistant Professor
  Dept. of Ecology  Evolutionary Biology
  U. of Tennessee, Knoxville
  http://www.brianomeara.info
 
  Students wanted: Applications due Jan. 1, annually
  Postdoc collaborators wanted: Check NIMBioS' website
  Calendar: http://www.brianomeara.info/calendars/omeara
 
 
  On Tue, Feb 11, 2014 at 3:05 PM, Santiago Sánchez 
  santiago.snc...@gmail.com wrote:
 
  Hello list,
 
  Does someone know if there is a function out there that can fit (similar
  to
  fitContinuous() in geiger and compar.ou() in ape) OU models with
 multiple
  optima (theta) as well as a BM model.
 
  I'm not sure if fitContinuous() allows OU with multiple theta, and, as
 far
  as I know there is no compar.bm in ape.
 
  Any help is greatly appreciated.
 
  Cheers,
  Santiago
 
  --
  Santiago Sánchez-Ramírez
  Department of Ecology and Evolutionary Biology, University of Toronto
  Department of Natural History (Mycology), Royal Ontario Museum
  100 Queen's Park
  Toronto, ON
  M5S 2C6
  Canada
  Other email: santiago.sanc...@mail.utoronto.ca
  Tel. 416-586-8025
 
  [[alternative HTML version deleted]]
 
 
  ___
  R-sig-phylo mailing list - R-sig-phylo@r-project.org
  https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
  Searchable archive at
  http://www.mail-archive.com/r-sig-phylo@r-project.org/
 
 
 


 --
 Santiago Sánchez-Ramírez
 Department of Ecology and Evolutionary Biology, University of Toronto
 Department of Natural History (Mycology), Royal Ontario Museum
 100 Queen's Park
 Toronto, ON
 M5S 2C6
 Canada
 Other email: santiago.sanc...@mail.utoronto.ca
 Tel. 416-586-8025

 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] OUwie diagn=T Error in dim(edges)

2014-01-24 Thread Brian O'Meara
It's likely that something is wrong with your tree, especially mapping of
regimes. However, another possibility is that there is a problem computing
the Hessian matrix. Without your tree and data, it would be hard to tell.

You might also try doing contour plots. They take more time, but they're a
much better estimate of uncertainty than the Hessian approach. We used to
have code to do this in OUwie until CRAN maintainers got upset (the code
required akima, which has a more restrictive license than OUwie's). You (or
anyone else) can still get the contour plotting code at
https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/OUwie.contour.R?revision=133root=ouwie.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Jan. 1, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Fri, Jan 24, 2014 at 9:00 AM, Maria Ghazali m_ghaz...@ukr.net wrote:

 Hi,
 Please, explain me what I am doing wrong in calculating OUwie (OUwie
 version 1.37).  I want to run some additional diagnostics (with diagn=T) to
 estimate standard errors of alpha and sigma (the element solution.se).
 But an error notification appears.
 Example: data(tworegime)
 model.tworegime=OUwie(tree,trait,model=c(OUMV),root.station=TRUE,diagn=T)

 #Initializing... #Finished. Begin thorough search...
 #Finished. Summarizing results.
  #Error in dim(edges) : 'edges' is missing
 Thank you!
 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] compar.ou

2013-11-12 Thread Brian O'Meara
Often if you have a trait constrained to be positive, it's appropriate to
log-transform it, which has the happy effect of making the theta for the
trait when converted back to a non-log scale constrained to be positive (as
well as probably being more appropriate for how your trait evolves).

You can have theta outside the range of observed values, but this should
only happen in something like OU1 if your tree is not ultrametric
(ultrametric = equal root to tip length, as with a chronogram of extant
species), and even then is unlikely (basically, it would only happen if you
have something like a trend).

You could do a constrained search by wrapping OUwie.fixed() in an optimizer
where you return a really bad, but finite, likelihood if you exceed one of
your bounds (or use an optimizer that allows setting bounds), or, perhaps
easier, modify the OUwie() function to return a bad likelihood for
particular values of theta. But my first suggestion is to think about
whether your data are better log-transformed (i.e, under Brownian motion,
is an increase or decrease of 10% of the present value equally likely
regardless of trait value (in which case, log transformation makes sense)).
You also may want to check that the branch lengths you are using make sense
for your problem (OU, BM, etc. models are largely (not entirely) about
transforming branch lengths on trees, so you want to make sure they are
sensible for your question to start with: do you care about rates with time
units or amount of molecular change?).

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Nov 12, 2013 at 6:39 AM, sandra goutte gou...@mnhn.fr wrote:

 Dear all,

 Following up this conversation; Using OUwie, i get reasonable values for
 theta under several models for all my traits, but one. In this case, this
 is a trait which can only take positive values and i get negative theta
 values for all the different models, including BM and OU1. the standard
 error is not big enough to consider my theta value to be around zero. (for
 example: theta = -2.279, se= 0.209). Any idea of what could be causing this
 and if it is possible to force OUwie to estimate theta within certain
 boundaries?

 Thank you,
 Sandra.


 2013/10/29 Marguerite Butler mbutler...@gmail.com

  Hi Sandra and others,
 
  You can also assess confidence using parametric bootstrap, a procedure
  which we generally recommend for all users. ouch has built-in facilities
 to
  do so  (the bootstrap() and simulate() functions in addition to update()
 ).
   I think there are examples in my tutorial. If not, let me know and I can
  send you another.
 
  Marguerite
 
  On Oct 29, 2013, at 6:08 AM, Cecile Ane a...@stat.wisc.edu wrote:
 
   FYI, we have some theory to explain why alpha has large standard errors
  and in which conditions. As Brian says, it comes with a flat likelihood
  with respect with alpha.
   http://dx.doi.org/10.1214/13-AOS1105 or
  http://www.stat.wisc.edu/~ane/publis/2013HoAne_AoS.pdf
  
   On 10/29/2013 09:46 AM, Brian O'Meara wrote:
   In at least the OUwie paper we spent a lot of time doing simulations
 to
   determine this empirically (this may have been examined in other
 papers,
   too, though none come to mind). Alpha can be estimated, but sometimes
  with
   scarily large standard errors (but not always). This property should
  hold
   for related methods (it's a property of the likelihood surface for
 these
   models). You can just plot the alpha values vs likelihood to get a
  sense of
   this surface for your dataset (ideally, estimating the other
 parameters
  for
   each fixed alpha, which is possible, though the kludge of holding the
  other
   parameters at their MLE and just varying alpha will give you a sense
 of
  the
   surface, though it's a bit non-conservative). We had code in OUwie to
  do a
   contour plot of the likelihood surface but took it out (a small
  licensing
   difference between akima (university license) and OUwie (GPL) led to
  OUwie
   being pulled from CRAN by the CRAN maintainers until we resolved the
   difference).
  
   Brian
  
  
   ___
   Brian O'Meara
   Assistant Professor
   Dept. of Ecology  Evolutionary Biology
   U. of Tennessee, Knoxville
   http://www.brianomeara.info
  
   Students wanted: Applications due Dec. 15, annually
   Postdoc collaborators wanted: Check NIMBioS' website
   Calendar: http://www.brianomeara.info/calendars/omeara
  
  
   On Tue, Oct 29, 2013 at 10:21 AM, sandra goutte gou...@mnhn.fr
 wrote:
  
   Thank you Marguerite. Looking at OUwie and OUCH/SLOUCH, i see that
  alpha is
   estimated along the other parameters, whereas in Hansen 1997 and
 other
   papers

Re: [R-sig-phylo] compar.ou

2013-10-29 Thread Brian O'Meara
In at least the OUwie paper we spent a lot of time doing simulations to
determine this empirically (this may have been examined in other papers,
too, though none come to mind). Alpha can be estimated, but sometimes with
scarily large standard errors (but not always). This property should hold
for related methods (it's a property of the likelihood surface for these
models). You can just plot the alpha values vs likelihood to get a sense of
this surface for your dataset (ideally, estimating the other parameters for
each fixed alpha, which is possible, though the kludge of holding the other
parameters at their MLE and just varying alpha will give you a sense of the
surface, though it's a bit non-conservative). We had code in OUwie to do a
contour plot of the likelihood surface but took it out (a small licensing
difference between akima (university license) and OUwie (GPL) led to OUwie
being pulled from CRAN by the CRAN maintainers until we resolved the
difference).

Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Oct 29, 2013 at 10:21 AM, sandra goutte gou...@mnhn.fr wrote:

 Thank you Marguerite. Looking at OUwie and OUCH/SLOUCH, i see that alpha is
 estimated along the other parameters, whereas in Hansen 1997 and other
 papers it is suggested that this would lead to very large standard errors.
 Is that problem resolved in these functions?

 Best,
 Sandra.


 2013/10/26 Marguerite Butler mbutler...@gmail.com

  Dear Sandra,
 
  You might also want to look at the papers that go along with slouch,
 ouch,
  and ouwie. Here are some pdfs, along with some tutorials.
 
  On my Rcompstart, the ouch stuff starts around page 165
 
  Let me know if you need help with ouch.
 
  Marguerite
 
 
 
 
 
 
 
 
 
 
  On Oct 24, 2013, at 7:03 AM, sandra goutte gou...@mnhn.fr wrote:
 
  Dear list,
 
  My aim is to compare the fit of models for which *theta* is allowed to
  change at different nodes (different combinations of 1 ,2 or 3 nodes). I
  don't really understand the calculation of the deviance, but if i'm not
  mistaken the difference between the deviances of 2 models follows a
  *chi*square distribution with the difference of parameters in the
  models as
  degree of freedom. Now, how to compare two models with the same number of
  optimum changes but at different nodes?
 
  I am having a hard time understanding the details of the function with
  regards to Hansen's (1997) paper. What is the relationship between the
  deviance and the RSS from Hansen (1997)? Is it possible to calculate the
  RSS from the output of compar.ou ?
 
  I would appreciate any advice or insight on the subject!
  Sandra.
 
  --
  PhD Student
  Muséum National d'Histoire Naturelle
  Département Systématique et Évolution
  USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité
  Reptiles  Amphibiens - Case Postale 30
  25 rue Cuvier
  F-75005 Paris
 
  Tel :  +33 (0) 1 40 79 34 90
  Mobile: +33 (0) 6 79 20 29 99
 
  [[alternative HTML version deleted]]
 
  ___
  R-sig-phylo mailing list - R-sig-phylo@r-project.org
  https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
  Searchable archive at
  http://www.mail-archive.com/r-sig-phylo@r-project.org/
 
 
   
   Marguerite A. Butler
   Associate Professor
 
   Department of Biology
  University of Hawaii
  2538 McCarthy Mall, Edmondson Hall 216
  Honolulu HI 96822
 
  Office: 808-956-4713
  Dept: 808-956-8617
   Lab:  808-956-5867
  FAX:   808-956-9812
   http://www.hawaii.edu/zoology/faculty/butler.html
  http://www2.hawaii.edu/~mbutler
   http://www.hawaii.edu/zoology/
 
 
 
 
 
 
 
 
 
 


 --
 PhD Student
 Muséum National d'Histoire Naturelle
 Département Systématique et Évolution
 USM 601 / UMR 7205 Origine, Structure et Évolution de la Biodiversité
 Reptiles  Amphibiens - Case Postale 30
 25 rue Cuvier
 F-75005 Paris

 Tel :  +33 (0) 1 40 79 34 90
 Mobile: +33 (0) 6 79 20 29 99

 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Error in OUwie

2013-10-29 Thread Brian O'Meara
The error happens after it does the calculation for the point estimates, so
it's likely happening when it starts doing the calculations to look at the
curvature at the solution (to see if it's a maximum rather than a saddle
point and to get an estimate of standard errors).  Doing

OU1 - OUwie(tree, data, model = OU1, diagn=FALSE)

will turn off these extra diagnostics for now.

Note there's a dedicated OUwie support list at
https://groups.google.com/forum/#!forum/ouwie-discuss , too. People often
post to dedicated lists first before R-sig-phylo, but such lists can be
hard to find (I think that's true in this case) or are inactive. One other
advantage of R-sig-phylo posts for help is that you're more likely to hear
about other package alternatives (i.e., Have you tried mvSLOUCH or OUCH
instead?).

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Tue, Oct 29, 2013 at 4:20 PM, Christopher D. Muir 
cdm...@biodiversity.ubc.ca wrote:

 All,

 This my first time posting to R-sig-anything, so apologies for any breach
 in etiquette. I searched through the mailing list and couldn’t find
 anything about this problem. A brief description of my data: I am running
 OUwie on a 528-taxon megatree generated using phylomatic. I made the tree
 dichotomously branching using ‘multi2di’, then assigned short branch
 lengths to these so that I could estimate ancestral character states
 (‘regimes’ in the OU model) to each node. OUwie works fine with my dataset
 for the BM1 and BMS models, but for all of the OU models, I get the same
 error with the singular-value decomposition function:

 OU1 - OUwie(tree, data, model = OU1)
 Initializing...
 Finished. Begin thorough search...
 Finished. Summarizing results.
 Error in svd(m) : infinite or missing values in 'x'

 I’m using R v. 2.15.1 and the latest OUwie v 1.33.

 I think this has something to do with the structure of my tree and/or the
 frequency of regime shifts. When I replace my data with Brownian Motion
 data simulated with my tree, I generally get the same error. The most
 salient feature I can think of is that ‘regime’ shifts are quite frequent
 within the tree and there are many short branches because of resolving
 polytomies. There are no NA’s or anything unusual in the trait data.

 — Chris

 ---
 Christopher D. Muir, Biodiversity PDF
 Office: BioDiv 237
 E-mail: cdm...@biodiversity.ubc.ca

 Biodiversity Research Centre
 University of British Columbia
 6270 University Blvd.
 Vancouver, BC, Canada
 V6T 1Z4


 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] compar.ou (sandra goutte)

2013-10-25 Thread Brian O'Meara
You can also use OUwie for a variety of models (OU with different means
and/or different variances and/or different attraction values), but it
returns AIC and lnL scores but not RSS. It sounds, though, that mvSLOUCH
might be the best option in your case.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Fri, Oct 25, 2013 at 8:47 AM, Krzysztof Bartoszek krz...@chalmers.sewrote:

 Hi Sandra,
 I don't know about compar.ou but my mvSLOUCH package (returns RSS and
 allows you to have measurement error and missing data) and Butler  King's
 ouch package return you a lot of stastics that you can use for model
 comparison. Maybe in your case the AIC.C would be more appropriate?

 Best wishes
 Krzysztof Bartoszek

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] trouble estimating lambda with fitDiscrete

2013-09-30 Thread Brian O'Meara
I don't necessarily see it as a bad idea. The discussion of lambda mostly
concerns continuous traits, but it's just a way to stretch the tree to see
if transformed branchlengths provide a better fit. You could see if
discrete character change is based on observed branch lengths (lambda = 1)
or if there's no phylogenetic signal (lambda = 0). Phylogenetic signal
itself may be of dubious interest, but I don't see how discrete traits are
worse for this than continuous traits.

For a discussion of phylogenetic signal and why it might not be a good
measure for some processes, see Revell et al. 2008:
http://sysbio.oxfordjournals.org/content/57/4/591.full

For a discussion of tree stretching as a general class of methods that
can be used for various kinds of data, see O'Meara 2012
http://www.annualreviews.org/doi/abs/10.1146/annurev-ecolsys-110411-160331[yes,
self-promotion].

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Mon, Sep 30, 2013 at 4:18 PM, Liam J. Revell liam.rev...@umb.edu wrote:

 I'm not sure of the wisdom (or interpretation) of fitting a lambda model
 to discrete traits; however it looks like your main problem is that you are
 using geiger 1.99-x. To get the old function output you will need to use
 the CRAN archive  install from source. You should be able to fit the
 lambda model in geiger =1.99 if you think this is a good idea - but the
 argument/value has changed from treeTransform=lambda to
 transform=lambda. Good luck. - Liam

 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: 
 http://faculty.umb.edu/liam.**revell/http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://blog.phytools.org


 On 9/30/2013 12:15 PM, Katherine Brooks wrote:

 Hello,

 I am trying to estimate lambda using the fitDiscrete function so that I
 can
 obtain a measure of phylogenetic signal in a trait.  The trait has 5
 character states and is coded as letters A, B, C, D, E.

 When I run fitDiscrete, I do not get the appropriate output, as suggested
 by many phylogenetics wikis I've read.  I should get output that looks
 like
 this:

 Finding the maximum likelihood solution
 [050  100]
 []
 $Trait1
 $Trait1$lnl
 [1] -86.76405

 $Trait1$q
  [,1]
 [1,] -0.03717188

 $Trait1$treeParam
 [1] 0.9943398

 $Trait1$message
 [1] R thinks that this is the right answer.

 Perhaps I am making a mistake with the data input?  Here is what I
 enter and the output I get:


  newphy=read.nexus(S1078.wo.**perotensis.fig.tre)


  newphy


 Phylogenetic tree with 55 tips and 54 internal nodes.

 Tip labels:
 Ammospermophilus_harrisii, Ammospermophilus_interpres,
 Ammospermophilus_leucurus, Callospermophilus_lateralis,
 Callospermophilus_madrensis, Callospermophilus_saturatus, ...

 Rooted; includes branch lengths.


  SocGrCat=read.csv(file.choose(**),row.names=1)


  SocGrCat

SocGrade
 Ammospermophilus_harrisiiA
 Ammospermophilus_interpres   A
 Ammospermophilus_leucurusA
 Callospermophilus_lateralis  A
 Callospermophilus_madrensis  A
 Callospermophilus_saturatus  A
 Cynomys_gunnisoniD
 Cynomys_leucurus C
 Cynomys_ludovicianus E
 Cynomys_mexicanusD
 Cynomys_parvidensD
 Ictidomys_mexicanus  A
 Ictidomys_tridecemlineatus   B
 Ictidomys_parvidens  A
 Marmota_baibacinaE
 Marmota_bobakE
 Marmota_broweri  E
 Marmota_caligata E
 Marmota_camtschatica E
 Marmota_caudata  E
 Marmota_flaviventris D
 Marmota_himalayana   E
 Marmota_marmota  E
 Marmota_menzbieriE
 Marmota_monaxA
 Marmota_olympus  E
 Marmota_sibirica E
 Marmota_vancouverensis   E
 Neotamias_townsendii A
 Otospermophilus_beecheyi B
 Otospermophilus_variegatus   C
 Poliocitellus_franklinii A
 Spermophilus_citellusB
 Spermophilus_dauricusB
 Spermophilus_erythrogenysB
 Spermophilus_fulvus  C
 Spermophilus_major   B
 Spermophilus_pygmaeusB
 Spermophilus_suslicusB
 Spermophilus_xanthoprymnus   B
 Urocitellus_armatus  B
 Urocitellus_beldingi B
 Urocitellus_brunneus A
 Urocitellus_canus

Re: [R-sig-phylo] Pairwise tree distance matrix

2013-09-19 Thread Brian O'Meara
You may look into treedist in phangorn (but just does pairs of trees, I
believe) or dist.multiphylo in distory (polynomial time, but I don't think
it's RF distance), or dist.topo in ape (but again, pairwise and not RF).

There's also RMesquite (http://rmesquite.r-forge.r-project.org/) to have R
just call Mesquite functions.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Thu, Sep 19, 2013 at 6:41 PM, Rob Lanfear rob.lanf...@gmail.com wrote:

 Hi All,

 I'm looking for a method to calculate a pairwise distance matrix of RF
 distances between a set of trees.

 Specifically, I know it's possible to do this in linear time (relative to
 the number of taxa), using an algorithm proposed in 1985[1]. This algorithm
 is implemented in various places (e.g. TreeSetVis in Mesquite), but I
 couldn't find an implementation in R.

 If anyone knows of an implementation, or has ideas on where best to start
 building one, please let me know.

 Cheers,

 Rob


 [1] Day,W. H. E. 1985. Optimal algorithms for comparing trees with labeled
 leaves. J. Classification 2:7–28.

 --
 Rob Lanfear
 Research Fellow,
 Ecology, Evolution, and Genetics,
 Research School of Biology,
 Australian National University

 phone: +61 (0)2 6125 3611

 www.robertlanfear.com

 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Count Deep Coal in R?

2013-07-08 Thread Brian O'Meara
I think the package phybase has functions for looking at gene trees in
species trees that could be modified for this. The package has been purged
from CRAN (well, archived), but you can still install from source. I'm
CCing the package author, Liang Liu, to see if he has any ideas.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Mon, Jul 8, 2013 at 7:27 AM, Melisa Olave melizz...@hotmail.com wrote:

 just wondering if you found the way to count deep coal in R??
 I'm trying to do the same... with no success!

 thank you!
 Melisa

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] obtaining reasonable values from OUCH

2013-06-29 Thread Brian O'Meara
You're at best on the border of how many parameters you can estimate given
the size of your dataset. One thing you might try is running OUwie, which
also implements that model but uses a different optimizer and starting
values, though I wouldn't be surprised if you get similar answers. Trying a
grid of values for a contour plot can tell you something about the
likelihood surface you're trying to optimize on; it could be very flat. We
had this function in OUwie until a license conflict with akima made us have
to cut this functionality, but it's easy enough to do.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Sat, Jun 29, 2013 at 9:03 AM, Pascal Title pti...@umich.edu wrote:

 Hello,

 Given a phylogeny with 29 tips, and 5 traits that each fit a OU model
 better than a BM model of trait evolution (according to fitContinuous in
 GEIGER), I would like to fit a multivariate OU model to these 5 traits
 using the hansen() function in the OUCH package in R. The goal is to get
 multivariate rate values. I am using R v3.0.1 and version 2.8-2 of the OUCH
 package.

 Under default settings, I get an unsuccessful convergence error.
 By changing various settings, I stop getting a convergence error message.
 But the alpha and sigma2 values seem to be unrealistically large:

 *hansen(data = otd[colnames(data)], tree = ot, regimes = otd[regimes], *
 *sqrt.alpha = priors, sigma = priors, method = L-BFGS-B, *
 *maxit = 5000)*
 *
 *
 * ‘optim’  diagnostic message:  CONVERGENCE: REL_REDUCTION_OF_F =
 FACTR*EPSMCH*
 *alpha:*
 *   [,1]   [,2][,3]   [,4]   [,5]*
 *[1,]  61.385945 17.5563426 -33.7110759   3.697183  50.622310*
 *[2,]  17.556343 45.9034684  -0.3705461  27.995283  -6.633346*
 *[3,] -33.711076 -0.3705461  42.8040546   8.072352 -40.675454*
 *[4,]   3.697183 27.9952829   8.0723524  25.435097 -12.115062*
 *[5,]  50.622310 -6.6333460 -40.6754540 -12.115062  59.536768*
 *
 *
 *sigma squared:*
 *  [,1]  [,2]  [,3]  [,4]  [,5]*
 *[1,] 89.262795 32.572924  9.099638 28.769264 24.711740*
 *[2,] 32.572924 40.506308 13.271956 19.279625 -9.633395*
 *[3,]  9.099638 13.271956  4.387822  5.986198 -3.965904*
 *[4,] 28.769264 19.279625  5.986198 11.966763  2.241925*
 *[5,] 24.711740 -9.633395 -3.965904  2.241925 18.995603*

 I've been told that sigma2/(2*alpha) should be approximately equal to the
 variance of the trait. If I try this for the first trait, sigma2/(2*alpha)
 = 0.727, and the actual variance of the first trait is 1.304. So that
 doesn't match, and values for alpha and sigma2 are extremely large.

 My question is, where do I go from here? What can I do to get more
 realistic values? I've heard that you can give different starting values to
 the sqrt.alpha and sigma arguments of the hansen function, but here these
 are lower triangular matrices, so I don't know what to adjust, or what
 other starting values to give, or how to know when I've found the most
 realistic values I'm going to find.

 From the following link, you can download the tree, data and script to see
 for yourself:
 https://dl.dropboxusercontent.com/u/34644229/Pascal_OUCH.zip

 Any advice or suggestions would be greatly appreciated!

 Thank you,

 -Pascal




 --
 Pascal Title, MSc.
 PhD student, Rabosky Lab 
 http://www-personal.umich.edu/~drabosky/Home.html
 Dept of Ecology and Evolutionary Biology
 University of Michigan
 pti...@umich.edu

 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Difficulty getting posterior probabilities into ape?

2013-06-19 Thread Brian O'Meara
Phyloch has never been on CRAN as far as I know, but you can get it from
http://www.christophheibl.de/Rpackages.html .

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Wed, Jun 19, 2013 at 11:14 PM, Tom Diggs jtdi...@mail.ad.uab.edu wrote:

 Thank you very much for the answer.  Unfortunately, that command is not
 working for me.  It apparently reads the file just like the read.nexus
 command does in my case.  The tree comes in fine, with the branch lengths,
 but no posterior values.

 So I guess that the read.nexus command ignores anything inside square
 brackets?  I imagine that's what the read.annotated.nexus was designed to
 deal with, but unless I'm doing something incredibly wrong, it's not
 working.

 I also read that phyloch was a way to get around this problem, but it's
 not in CRAN anymore either.

 Otherwise, should I go through and just eliminate the square brackets
 manually?  Replace them with regular parentheses?  Leave them out
 altogether?

 The phylogeny I'm currently dealing with isn't THAT big, and while it'd be
 a pain to do it manually, it wouldn't be prohibitive.

 Has anyone done that before?

 Thanks,
 Tom
 - Original Message - From: Jombart, Thibaut 
 t.jomb...@imperial.ac.uk
 To: Tom Diggs jtdi...@mail.ad.uab.edu; r-sig-phylo@r-project.org
 Sent: Wednesday, June 19, 2013 12:39 PM
 Subject: RE: [R-sig-phylo] Difficulty getting posterior probabilities into
 ape?



 Hello,

 Marc Suchard has recently implemented this in the package epibase - check
 the function read.annotated.nexus.

 All the best

 Thibaut.
 __**__
 From: r-sig-phylo-bounces@r-project.**orgr-sig-phylo-boun...@r-project.org[
 r-sig-phylo-bounces@r-**project.org r-sig-phylo-boun...@r-project.org]
 on behalf of Tom Diggs [jtdi...@mail.ad.uab.edu]
 Sent: 19 June 2013 18:33
 To: r-sig-phylo@r-project.org
 Subject: [R-sig-phylo] Difficulty getting posterior probabilities into ape?

 Hi all,

 I was wondering if anyone else has had this problem, and how they
 addressed it.

 I've been using BEAST to generate trees, and TreeAnnotator to summarize
 them.  In the output nexus file, the tree plainly has posterior
 probabilities for each node listed, but when I try to import the trees into
 R, my tree has no node labels (mytree$node.labels = NULL).

 I searched the archives for this listserve, and found one instance of this
 problem cropping up with MrBayes output. http://www.mail-archive.com/r-**
 sig-ph...@r-project.org/**msg02227.htmlhttp://www.mail-archive.com/r-sig-phylo@r-project.org/msg02227.html

 Unfortunately, the package phybase, listed as a potential solution to
 the problem, is no longer in CRAN.

 Am I missing something obvious?  Is ape having trouble reading the node
 labels for some reason?

 Any help would be much appreciated.  Thanks so much.

 Tom
[[alternative HTML version deleted]]

 __**_
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/**listinfo/r-sig-phylohttps://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-**
 sig-ph...@r-project.org/http://www.mail-archive.com/r-sig-phylo@r-project.org/

 __**_
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/**listinfo/r-sig-phylohttps://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-**
 sig-ph...@r-project.org/http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] R postdoc with O'Meara and Carstens

2013-04-22 Thread Brian O'Meara
This should be coming in via EvolDir soon, but may be of interest to many
of the readers of this list. You can find out more about the project,
including looking at the R code, at http://www.brianomeara.info/phrapl .
Please forward to any good candidates, and let me know if you have any
questions (or suggestions).

Thanks,
Brian

Phylogeography often compares a select handful of models to answer an
empirical question. We have just been funded to continue development of
Phrapl, a way to examine thousands of potential phylogeographic models for
fit to data. It is a bit like ModelTest, but for models of population
subdivision and migration rather than substitution rates. The aim is for
empiricists to be able to let their data tell them which processes are
important in the phylogeographic history of a group. A postdoctoral
position is available to work on improving, testing, and applying this
approach. The current code is mostly in R but using other languages to
speed up the computation internally is quite possible. The position is in
the lab of PI Brian O’Meara at U. of Tennessee, Knoxville, but coordination
with Co-PI Bryan Carstens and occasional travel to Ohio State is expected.

The O’Meara lab currently hosts five postdocs (including four co-mentored
through NIMBioS) as well as a co-advised grad student. The EEB department
features ten research groups whose work includes
phylogenetics/phylogeography, and the presence of the National Institute
for Mathematical and Biological Synthesis further creates a robust
ecosystem of colleagues.

Knoxville, Tennessee, is just a half hour away from the US’ most visited
national park, Great Smoky Mountains NP. Cost of living is low, and the
Knoxville area can cater to interests ranging from farmers’ markets, local
music, and handmade crafts to SEC football and NASCAR. I strive to create a
lab group that is welcoming and open. We have good diversity in terms of
gender (57% female), life stage (i.e., 43% with young children) but (so
far) poor diversity on various other metrics. Mentoring to allow postdocs
to reach their career goals is a focus. Research in the lab ranges from
empirical work on plant taxonomy to quantitative trait evolution models to
models of protein evolution.

Applications should include
1) A cover letter (include expected completion date of PhD, if appropriate,
as well as relevant skills)
2) A CV
3) A short research statement
4) Contact information for two references
5) Link(s) to repositories with examples of code you have written or
attachments including such code.

Experience in programming is strongly preferred; experience in R is
desired, but not required. If in doubt, apply: please do not self-select
yourself out from what might be a mutually beneficial position.

Review of applications will begin on May 1 and continue until filled (start
dates are flexible). Presubmission inquiries are encouraged (I can give you
a copy of the proposal, point you to the code, and answer any questions you
may have).

All qualified applicants will receive equal consideration for employment
and admissions without regard to race, color, national origin, religion,
sex, pregnancy, marital status, sexual orientation, gender identity, age,
physical or mental disability, or covered veteran status.

Eligibility and other terms and conditions of employment benefits at The
University of Tennessee are governed by laws and regulations of the State
of Tennessee, and this non-discrimination statement is intended to be
consistent with those laws and regulations.

In accordance with the requirements of Title VI of the Civil Rights Act of
1964, Title IX of the Education Amendments of 1972, Section 504 of the
Rehabilitation Act of 1973, and the Americans with Disabilities Act of
1990, The University of Tennessee affirmatively states that it does not
discriminate on the basis of race, sex, or disability in its education
programs and activities, and this policy extends to employment by the
University.

Inquiries and charges of violation of Title VI (race, color, national
origin), Title IX (sex), Section 504 (disability), ADA (disability), Age
Discrimination in Employment Act (age), sexualorientation, or veteran
status should be directed to the Office of Equity and Diversity (OED), 1840
Melrose Avenue, Knoxville, TN 37996-3560, telephone (865) 974-2498 (V/TTY
available) or 974-2440. Requests for accommodation of a disability should
be directed to the ADA Coordinator at the Office of Equity and Diversity.

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https

Re: [R-sig-phylo] question about analysing trait differences

2013-04-21 Thread Brian O'Meara
The problem is reconstructing overlap down the tree (it's possible to do,
but whether it's possible to do well is another question). One thing you
could do to avoid this is to use the *other* method from Felsenstein's 1985
independent contrasts paper, taking pairs of species independent of other
pairs. A way to do this:

1) Get a clade of size two (aka a cherry sensu Semple and Steel). Every
tree has at least one.
2) Difference in overlap and diference in mass for the pair of taxa making
up this clade becomes one point (perhaps standardize for time, and
positivize it such that you subtract them in the order that leaves the mass
difference positive).
3) Prune these two taxa off the tree.
4) Go back to step 1. When you're done, assuming no polytomies, you'll have
zero or one species left. [it's possible to do with polytomies, too: if
assumed to be hard, then just do a pair of taxa from a terminal polytomy
and leave the rest for the next step. If assumed to be soft, then just take
two from a terminal polytomy and discard the remaining taxa]

Contrasts gives you Ntax - 1 contrasts. This gives you floor(Ntax/2) but
avoids having to do reconstructions. I have code around to do this but
can't find it at the moment, but it's not hard to write -- post again if
you need help with it and no better ideas have been proposed.

Hope this helps,
Brian



___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Sun, Apr 21, 2013 at 10:26 AM, Langevelde, Frank van 
frank.vanlangeve...@wur.nl wrote:

 Dear all
 I would like to test whether there is a relationship between two pairwise
 trait differences, or to be more specific: overlap in geographical ranges
 between pairs of species (as dependent variable in regression) related to
 the body mass differences of the pairs (as independent variable). As I
 expect that these traits are not phylogeneticaly independent, I would like
 to add phylogeny in the analysis. I thought about using independent
 contrasts (as these are also calculated pairwise), but I do not know how to
 analyse. One suggestion would be to use the distance between the pairs of
 species (e.g. distance to the closest shared node in the tree) and include
 these as the second independent variable. Is this possible or does anyone
 has a better suggestion?

 Thanks for your help!

 Kind regards

 Frank van Langevelde

 Resource Ecology group
 Wageningen University
 The Netherlands
 http://www.resecol.wur.nl/staff/frank/index_FrankvL.htm

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Estimating ancestral states of a trait under selection

2013-04-19 Thread Brian O'Meara
We have a method in R, TreEvo, that can allow you to do things like have a
model where the optimum shifts through time according to some factor (like
external data on climate) or where there are constraints that change
through time: it's up on R-forge, and you can install the package, but it
hasn't been published yet and so could have problems that won't appear
until peers have looked at it more thoroughly. If you're fine waiting to
publish with it until it's been vetted and is in press, feel free to play
with it.

Of course, another thing you could do is throw a prior on the root state
based on your knowledge of the climate then and then do a Bayesian
reconstruction. I don't know of any packages that do this at the moment,
but it wouldn't be hard to write.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Fri, Apr 19, 2013 at 4:00 PM, Eliot Miller eliotmil...@umsl.edu wrote:

 Hello all,

 I've read some past posts and tried to figure this one out myself. It
 appears that the short answer is if there are not some extinct lineages in
 the tree with known trait values, it's difficult to do.

 I have a data set of the current climatic niches of a continental radiation
 of taxa. If I reconstruct the ancestral state of the root using ace() with
 REML, I get a semi-believable answer, though obviously with wide CIs. My
 gut instinct is that the absolute value that is returned is slightly less
 than it should be.

 The issue is that I know the climate of the continent has changed radically
 (but at least consistently in one direction!) during the course of the
 radiation of the clade. An educated guess would put the magnitude of the
 change at 1200 to 550 mm rain per year over 60 million years. I'd guess
 that most lineages have felt this aridification as a slow and steady
 selective pressure to figure it out in increasingly dry conditions
 (there's probably also been a lot of extinction but let's skip that point
 for now).

 So, while I don't have extinct lineages with known trait values, I do have
 what I'd be comfortable modeling as a continuous selective pressure
 throughout the course of the radiation. I'm interested to see how doing
 this would change the actual absolute value returned by the ancestral state
 reconstruction. Any tips on how this might be best effected?

 Final note, my tree is from uncorrected molecular distances, so it's not
 ultrametric. The analysis I'm trying to run here is tangential to my main
 focus, so I'm ok with making the tree ultrametric with something like
 chronopl() just out of curiosity to see what comes out. Someday I'll have a
 time calibrated tree and then I can re-run this analysis.

 Best wishes,
 Eliot

 [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at
 http://www.mail-archive.com/r-sig-phylo@r-project.org/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] R Dendrograms for Subclones in a Somatic Cell Population: Time Interval Sampling.

2013-02-28 Thread Brian O'Meara
An old parsimony-based approach to this is known as stratocladistics. There
are no R implementations, as far as I know, but you could wrap phangorn to
do this, I imagine, though it does require writing a new function.
Pseudocode:

#my.taxa.vector is character vector of tips
best.phy - rtree(length(my.taxa.vector), tip.label=my.taxa.vector)
best.trees - c(best.phy)
best.score - parsimony(best.phy, data) + strato(best.phy, times)
for (i in sequence(nsteps)) {
  new.phy - rSPR(best.trees[[1]])
  new.score - pars(new.phy, data) + strato(new.phy, times)
  if (new.score == best.score) {
best.trees - c((new.phy), best.trees)
  }
  if (new.score  best.score) {
best.trees - c(new.phy)
best.score - new.score
  }
}

this would do a greedy spr search: you'd want to restart from different
trees and such. The only tricky thing is figuring out the new strato
function: try a set of branch lengths and take as the best score the one
that implies the least amount of stratographic debt (ape's node.height
function could be useful for this; one thing that makes these easy is that
times and node heights must be integers (number of time points from root),
so even searching exhaustively is feasible, though almost
surely unnecessary). Some of these branch lengths could be zero,
indicating, in the case of a terminal branch length, a sample that is a
direct ancestor of another sample. The tree isn't quite rooted but it is
polarized, with nodes sampled further back in time pushed down the tree.
Writing the strato function really wouldn't be that bad to do.

Another approach is to do a likelihood search assuming a clock, but with
tips constrained to occur at time of sampling rather than being coeval.
Heibl and Cusimano's Lagopus package (not on CRAN, go to
http://www.christophheibl.de/mdt/mdtinr.html) calls PAML and multidivtime
to estimate a tree with age constraints. Multidivtime can use constraints
such that the tips are not coeval, but I'm not certain that Lagopus can
pass this information (it can certainly do node constraints, just not sure
about tip constraints). If it can, or could be modified to do so, this
would give you a tree with samples constrained to be at the right times and
with possibly zero length branches for truly ancestral samples. You could
then collapse these using di2multi in ape.

Someone else may know of other ways to attack this problem.

Hope this helps,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Thu, Feb 28, 2013 at 3:21 PM, sa...@math.berkeley.edu wrote:

 Estimating Ongoing Evolution by Repeated Sampling with Long Time
 Intervals.

 Is there a way to construct dendrograms similar to those used in
 phylogenetics but
 with 2 main differences:
  (1) Instead of observing at one time, small samples from a very large
 population are taken at regular intervals, so that some observed cells
 could
 easily correspond to an internal node rather than a leaf.
  (2) There is no obvious outgroup; root should if possible be
 estimated by
 presuming that observations from later time points are on average farther
 from
 root.

 More specifically, consider a large, heterogeneous, unstably evolving in
 vitro cell
 culture apparently not subject to a Hayflick limit. In our feasibility
 study, a
 sample of 20 cells were tested at t=0 for about 100 different numerical
 aspects of
 their karyotype (for each cell an ordered vector of 100 numbers is
 measured from
 the genome; the individual numbers all have the same order of magnitude).

 About 15 cell generations later the observation is repeated and similarly
 four more
 times for a total of 120 cells over a time span of about 60 cell
 generations. I
 would like to estimate the behavior of the major subclones – Are some
 spinning off
 new karyotypes? Which ones, if any, are in the process of taking over? Are
 some
 being outcompeted? And so on.

 Various difference matrices and binary dendrograms with the cells as
 leaves are
 easily constructed and are suggestive. For example at timepoint 5 one
 karyotype
 which was prominent, with lots of duplicates, for timepoints 1-4
 disappears from
 the samples. But the dendrograms themselves don’t really use the fact that
 observations were made at six consective times rather than simultaneously;
 and they
 require me to make a guess about where root is. There must be a better way
 to use
 the data. I assume people who work, say, on development of drug-resistant
 bacterial
 lineages have thought this through in some detail and developed R software
 for it
 but I wasn’t able to locate anything.

 Thanks in Advance, Ray Sachs, Dept. Math, UCB

 ___
 R-sig-phylo mailing list - R-sig

Re: [R-sig-phylo] Analysis with Multiple cores on Mac Workstation

2012-12-08 Thread Brian O'Meara
I agree with Daniel that going in parallel is probably overkill in this
case. However, if you do want to get into parallelization with R, a good
place to start is the CRAN task view on high performance and parallel
computing: http://cran.r-project.org/web/views/HighPerformanceComputing.html.
Like all task views, it provides an overview of the packages in that
domain and an easy way to install all of them at once (see
http://cran.r-project.org/web/views/ ). The built-in parallel package
(since R 2.14) makes it easy to use all your cores using mclapply(). If you
tend to think in for loops (as most of us do) rather than apply functions,
the foreach package combined with doMC is another easy way: there's a good
vignette for this at
http://cran.r-project.org/web/packages/doMC/vignettes/gettingstartedMC.pdf.

Best,
Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Sat, Dec 8, 2012 at 7:01 AM, Daniel Barker d...@st-andrews.ac.uk wrote:

 Dear Jose,

 Is this a problem in practice? The calculation of branch lengths Brian
 describes do not sound very time-consuming to run. If you're dealing with
 an enormous tree or enormous sample, they would take some time - but
 presumably, the greater bottleneck would be obtaining the sample of trees
 in the first place.

 Anything can become time-consuming if repeated, e.g. if you're dealing
 with thousands of separate samples of trees. If things are taking too long
 in that situation, the best solution would be to run serial processes, not
 multithreaded - but many of them, submitted to a batch queuing system
 running on the computer. Examples include SLURM, GridEngine, LSF, Unix
 'batch' (comes as part of OS X but I've never managed to set up the load
 threshold appropriately on OS X), or perhaps Xgrid (I don't know it).

 Except in very unusual situations, this kind of 'task farming', if
 properly set up, will always be at least as efficient as 'true'
 parallelisation like multithreading - often faster. It also doesn't
 require any special programming, and parallel programming can be fairly
 painful.

 An even simpler approach is to divide the task into n approximately
 equally time-consuming scripts (if you have n CPU cores) and launch them
 all at the same time - which uses all your cores, and doesn't even require
 a batch queuing system.


 Best wishes,

 Daniel

 On 08/12/2012 02:43, José Hidasi hidasin...@gmail.com wrote:

 Dear all,
 
 I want to create a consensus tree with branch lengths (Brian O'Meara's
 function on post *[R-sig-phylo] Why no branch lengths on consensus
 trees?*) using
 a Mac Workstation. However, if I only type the function on it, R will not
 use all cores for running the analysis. I would like to know if there is a
 function or any way to divide the analysis within the cores, or to use all
 cores for running the program.
 
 I know this is not the best forum to ask something like that (using
 multiple cores), but I imagined that someone might have the solution for
 that as some of you work with large databases.
 
 Best,
 José Hidasi
 
 
 
 *
 This is Brian O'Meara's function, that i am using to create the consensus
 tree, on the post *[R-sig-phylo] Why no branch lengths on consensus
 trees?:*
 
 I have a function to create a consensus tree with branch lengths. You
 feed it a given topology (often a consensus topology, made with ape), then
 a list of trees, and tell it what you want the branch lengths to
 represent. It could be the proportion of input trees with that edge (good
 for summarizing bootstrap or Bayes proportions) or the mean, median, or sd
 of branch lengths for those trees that have that edge. Consensus branch
 lengths in units of proportion of matching trees has obvious utility.
 As Daniel says, the average branch lengths across a set of trees is
 more difficult to see a use case for, but you could imagine doing
 something
 like taking the ratogram output from r8s on a set of trees and summarizing
 the rate average and rate sd on a given, best, tree as two sets of
 branch
 lengths on that tree.
 
 I've put the function source at
 
 https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/consensusBrl
 en.R?revision=110root=omearalab
 .
  You can source the file for the function (consensusBrlen() ) and
 other
 functions it needs. It also uses phylobase. Note that this is
 alpha-quality
 code -- it's been checked a bit, but verify it's doing what you want.
 
 Here's an example of how to use it
 
  library(ape)
 
 library(phylobase)
 
 phy.a-rcoal(15)
 
 phy.b-phy.a
 
 phy.b$edge.length-phy.b$edge.length+runif(length(phy.b$edge.length), 0,
 0.1)
 
 phy.c-rcoal(15)
 
 phy.list-list(phy.a, phy.b, phy.c)
 
 phy.consensus-consensusBrlen(phy.a, list

Re: [R-sig-phylo] Why no branch lengths on consensus trees?

2012-11-21 Thread Brian O'Meara
I have a function to create a consensus tree with branch lengths. You feed
it a given topology (often a consensus topology, made with ape), then a
list of trees, and tell it what you want the branch lengths to represent.
It could be the proportion of input trees with that edge (good for
summarizing bootstrap or Bayes proportions) or the mean, median, or sd of
branch lengths for those trees that have that edge. Consensus branch
lengths in units of proportion of matching trees has obvious utility. As
Daniel says, the average branch lengths across a set of trees is more
difficult to see a use case for, but you could imagine doing something like
taking the ratogram output from r8s on a set of trees and summarizing the
rate average and rate sd on a given, best, tree as two sets of branch
lengths on that tree.

I've put the function source at
https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/consensusBrlen.R?revision=110root=omearalab.
 You can source the file for the function (consensusBrlen() ) and
other
functions it needs. It also uses phylobase. Note that this is alpha-quality
code -- it's been checked a bit, but verify it's doing what you want.

Here's an example of how to use it

 library(ape)

library(phylobase)

phy.a-rcoal(15)

phy.b-phy.a

phy.b$edge.length-phy.b$edge.length+runif(length(phy.b$edge.length), 0,
0.1)

phy.c-rcoal(15)

phy.list-list(phy.a, phy.b, phy.c)

phy.consensus-consensusBrlen(phy.a, list(phy.a, phy.b, phy.c),
type=mean_brlen)


Best,
Brian


PS: Note that I am actively looking for grad students: info at
http://www.brianomeara.info/lab . Guaranteed five years support, subject to
decent performance.

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Wed, Nov 21, 2012 at 11:09 AM, Daniel Barker d...@st-andrews.ac.ukwrote:

 Dear Scott,

 What should branch lengths on a consensus tree represent?

 They cannot be expected substitutions per residue. This would imply no
 evolution at points where uncertain branching patterns have been reduced
 to a multi-furcation - which is not what the multi-furcation is meant to
 imply. (Rather: there was evolution, but we aren't very certain about the
 branching pattern.)

 But, MrBayes does provide average lengths of some kind.

 Best wishes,

 Daniel

 On 21/11/2012 15:13, Scott Chamberlain myrmecocys...@gmail.com wrote:

 When making a consensus tree using ape::consensus the branch lengths are
 lost. Is there a way to not lose the branch lengths? Or to add them
 somehow
 to the consensus tree after making it.
 
 library(ape)
 
 cat(owls(((Strix_aluco:4.2,Asio_otus:4.1):4.1,Athene_noctua:7.3):6.3,Tyt
 o_alba:13.5);, file = ex1.tre, sep = \n)
 cat(owls(((Strix_aluco:1.2,Asio_otus:4.5):3.1,Athene_noctua:7.3):6.3,Tyt
 o_alba:13.5);, file = ex2.tre, sep = \n)
 cat(owls(((Strix_aluco:3.2,Asio_otus:4.7):8.1,Athene_noctua:7.3):6.3,Tyt
 o_alba:13.5);, file = ex3.tre, sep = \n) tree1 -
 read.tree(ex1.tre) tree2 - read.tree(ex2.tre) tree3 -
 read.tree(ex3.tre) trees - c(tree1, tree2, tree3) trees_con -
 consensus(trees) trees_con
 Phylogenetic tree with 4 tips and 3 internal nodes.
 Tip labels:[1] Strix_aluco   Asio_otus Athene_noctua Tyto_alba
 Rooted; no branch lengths.
 
 
 Thanks, Scott Chamberlain
 
[[alternative HTML version deleted]]
 
 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 


 --
 Daniel Barker
 http://bio.st-andrews.ac.uk/staff/db60.htm
 The University of St Andrews is a charity registered in Scotland : No
 SC013532

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] estimating mutation rates

2012-11-08 Thread Brian O'Meara
If you want the branch-specific substitution rate, you could estimate the
branch lengths on a topology constrained to match the chronogram, then
divide each of the molecular tree's branch lengths by the corresponding
chronogram branch lengths to get these rates. If you want a global rate,
you might constrain the molecular tree to have the same proportional branch
lengths as the chronogram. Look at pml() in phangorn for calculating the
likelihoods.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Thu, Nov 8, 2012 at 12:49 PM, john d dobzhan...@gmail.com wrote:

 Hi there,

 I have a well-supported phylogeny (with branch lengths proportional to
 absolute time) and I'm interested in estimating the mutation rate of
 an independently-obtained molecular dataset. Any suggestions?

 thanks!

 John

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] variation in rates over time

2012-09-17 Thread Brian O'Meara
I agree with the suggestions so far. I just wanted to point out a few more
alternatives:

You could use the geiger package to estimate the best scaling for the
tworatetree transformation to do this (should be equivalent to the earlier
solutions, though it would require running optimization).

You could also use the OUwie package with a tree that has been given simmap
mappings using phytools. The advantage of this is that you could evaluate
Brownian models but you could also look at various Ornstein-Uhlenbeck
models (though note that while there is information about different
Brownian rates before and after a time slice, info about different alphas
(strength of pull parameter) and thetas (the attractor, aka optimal
value) is rapidly (but not immediately) lost under an OU process).

For completeness, especially for citations, note that O'Meara et al. (2006)
and Thomas et al. (2006) independently arrived at essentially the same
method, so it is worth reading both papers.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Calendar: http://www.brianomeara.info/calendars/omeara


On Mon, Sep 17, 2012 at 9:11 PM, Jason S jas2...@yahoo.com wrote:



 Thanks, guys. That's exactly what I needed.


 
  From: Liam J. Revell liam.rev...@umb.edu
 To: Matt Pennell mwpenn...@gmail.com

 -project.org
 Sent: Monday, September 17, 2012 9:22 PM
 Subject: Re: [R-sig-phylo] variation in rates over time

 Hi Jason. Matt is absolutely correct. You can do this with phytools.
 Say, for instance, you have an ultrametric phylogeny with branches in
 millions of years (tree) and data vector containing the trait values for
 species (x) and you want to test the hypothesis that the last 3.4 my has
 a different rate of evolution than the rest of the tree, you could do
 this as follows:

 library(phytools) # load phytools
 tree-make.era.map(tree,c(0,max(nodeHeights(tree))-3.4))
 plotSimmap(tree,pts=F,lwd=3) # visualize
 fit-brownie.lite(tree,x) # fit model

 That's it. Good luck. Liam

 Liam J. Revell, Assistant Professor of Biology
 University of Massachusetts Boston
 web: http://faculty.umb.edu/liam.revell/
 email: liam.rev...@umb.edu
 blog: http://phytools.blogspot.com

 On 9/17/2012 7:46 PM, Matt Pennell wrote:
 
  Jason,
 
  I think the best way to do this is with the approach of O'Meara et al.
 2006
  Evolution Brownie.
 
  Liam Revell has implemented this in R in his package phytools. You can
  modify the steps taken in this tutorial here
 
 http://phytools.blogspot.com/2011/07/running-brownielite-for-arbitrarily.html
  perhaps
  in conjunction with the function make.era.map()
 
 http://phytools.blogspot.com/2011/10/new-function-to-plot-eras-on-tree.html
  though
  I admittedly have not tried this myself.
 
  Perhaps Liam or someone else has a better explanation but I hope this is
 at
  least somewhat helpful.
 
  cheers,
  matt
 

 
 
 
 
  Hello,
 
  I see that there are several interesting alternatives to test for
  different rates among clades. However, I was wondering if there is a
 method
  to test for varying rates over time. I'm aware of Pagel's delta and the
 EB
  model, but I was thinking more in terms of testing if there is a
 different
  rate for the entire tree after a specified point in time. For instance,
 if
  a snail predator colonizes an island 3.4 Mya, is there evidence for an
  increased rate of evolution in the prey after that point in time?
 Something
  like two lambdas, one for before and one for after that point in time.
 
  Thanks!
 
  Jason
   [[alternative HTML version deleted]]
 
  ___
  R-sig-phylo mailing list
  R-sig-phylo@r-project.org
  https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 
 
  [[alternative HTML version deleted]]
 
  ___
  R-sig-phylo mailing list
  R-sig-phylo@r-project.org
  https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 
 [[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Estimating MRCA dates

2012-02-02 Thread Brian O'Meara
Ape can do this, function chronopl. In general, one good way to find out
which package can do something is by looking at the task view for
phylogenetics: http://cran.r-project.org/web/views/Phylogenetics.html .
This will also allow you to install all the phylogenetics packages (at
least those on CRAN) with just a couple of commands. An updated version
should be coming out later today (and if someone notices other things
missing, please let me know).

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?


On Thu, Feb 2, 2012 at 8:45 AM, Sean Downey s...@codexdata.com wrote:

 Greetings!

 I'm new to the list, thanks to Emanuel for bringing it to my attention. I
 have a general question for the group. Several years ago I used r8s for
 estimating mrca dates for a tree using an internal node date as a prior. I
 believe BEAST also does this. But is there an R package for this kind of
 relaxed-clock rate smoothing? I'm beginning a new project and want to make
 sure I'm using up-to-date methods.

 Thanks,
 Sean

 --
 Sean S. Downey
 Institute of Archaeology
 University College London
 31-34 Gordon Square
 London
 WC1H 0PY
 s.dow...@ucl.ac.uk mailto:s.dow...@ucl.ac.uk
 http://www.homepages.ucl.ac.**uk/~tcrnssd/http://www.homepages.ucl.ac.uk/~tcrnssd/
 http://www.homepages.ucl.ac.**uk/%7Etcrnssd/http://www.homepages.ucl.ac.uk/%7Etcrnssd/
 
 UK Cell: +44 (0)7980 284 574

 __**_
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/**listinfo/r-sig-phylohttps://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Convert instantaneous transition rate to time.

2011-11-22 Thread Brian O'Meara
Another way (influenced by some of Elchanan Mossel's work, though it also
relates to Cecile's idea of looking for saturation) is to look at
information about state at the root. If the data are very informative about
it, one state will have most of the relative likelihood. If the rate * time
is high, there is less information about the state at the root, and the
proportion of relative likelihoods for different states will converge to
the equilibrium state frequencies (assuming a time-symmetric model). For
example, for a short tree for DNA data, the ratio of relative likelihoods
for A, T, G, and C at the root might be c(0.99, 0.01, 0, 0) [most weight on
A] but on a longer tree with less information the reconstruction might be
c(0.27, 0.25, 0.24, 0.24). If the equilibrium frequency is c(0.25, 0.25,
0.25, 0.25), this suggests the second tree has less signal about the root
state than the first.

What I'm not sure about is how to actually measure this distance. Let the
relative likelihoods (normalized to sum to one) of the reconstructed root
state be the vector root.empirical, and let the equilibrium frequencies be
root.equilibrium. You could do:

1) Shannon_entropy( root.empirical ): the flatter the distribution, the
more entropy, the less we know about the root state. The thing I worry
about here is what if the equilibrium frequency isn't flat? Then a
root.empirical that is far from root.equilibrium (indicating some
information) but is flat is judged as less informative than a
root.empirical that exactly equals the root.equilibrium vector, which
doesn't seem right. [note that Shannon_entropy() is pseudocode, though
I'm sure somewhere on CRAN there's a function of this sort]

2) Euclidean distance between root.empirical and root.equilibrium: but does
this put too much weight on what is happening at suboptimal states?

3) Distance between max(root.empirical) and
root.equilibrium[which.max(root.empirical)]: but then something that should
be maximally informative (max(root.empirical)==1) might not be depending on
root.equilibrium. I.e., if root.equilibrium=c(0.4, 0.1, 0.25, 0.25), the
information about the root from root.empirical.1=c(1,0,0,0) is 1 - 0.4 =
0.6, whereas the information from root.empirical.2=c(0, 1, 0, 0) is 1 -
0.1 = 0.9, even though for each the weight for the root state is as strong
as can be.

Perhaps someone else has an idea about a better way to measure this. In
practice, I bet all these measures are pretty highly correlated and so
which one is used might not have much practical importance, though this
would be worth investigating if actual conclusions depend on the results.

Best,
Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?
Calendar: http://www.brianomeara.info/calendars/omeara



On Tue, Nov 22, 2011 at 1:09 PM, Cecile Ane a...@stat.wisc.edu wrote:

 Hi Simon,

 One option could be to look at the expected number of substitutions over
 time t, and find the smallest time t for which you expect at least 1
 substitution. The idea here is that the first substitution is the one that
 most disrupts the signal.
 Technically, if Q is your rate matrix and p is the stationary distribution
 of states associated with Q, then the average number of substitutions over
 time t is t * r where r is the average jumping rate:
 r = sum over all states p(i) Q(i,i) in absolute value.
 So the time to the first substitution is 1/r.

 Cecile.


 On 11/22/11 09:01, Simon Greenhill wrote:

 Hi all,

 What is the best way to convert an instantaneous transition rate (such as
 that given by geiger's `fitDiscrete` method) into a measure of stability
 over time?

 So, I have a set of traits with a small number of states. I want to fit
 these onto a set of trees with branches proportional to substitutions. The
 instantaneous transition rate for these traits give me the relative
 stability, but I want to be able to quantify this over time e.g. what
 proportion of signal is remaining after time t? how could I represent this
 as a decay curve of signal over time?

 Any ideas/hints would be much appreciated!

 Simon


 __**_
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/**listinfo/r-sig-phylohttps://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Unexpected behavior in birth death simulations

2011-05-18 Thread Brian O'Meara
As a general approach, when this sort of thing happens, I'll modify and load
a copy of the function that has been changed to produce a lot of debugging
info (lots of dput and print statements, for example). Then what happens at
each simulation step to identify where the problem occurs. [Better
developers would use R's built-in debugging functions like browser(),
debug(), and trace()].

Brian


___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?


On Wed, May 18, 2011 at 3:59 PM, Carl Boettiger cboet...@gmail.com wrote:

 Hi Tanja,list

 Yes, everything is going extinct, but as you say, that shouldn't happen
 when
 mu = 0.
 
 sim.bd.age(age=2,numbsim=3,lambda=2,mu=0,frac,mrca=FALSE,complete=FALSE)
 [[1]]
 [1] 0

 [[2]]
 [1] 0

 [[3]]
 [1] 0

 
 It's a rather malicious error in that, as you point out, it's a possible
 outcome when mu=0, so at first it appears nothing is wrong, but it happens
 every time that everything is extinct even for mu  lambda, or mu=0.

 It's curious that this same problem effects both TreeSim and Geiger, and
 that this problem appears isolated to certain architectures of computers.
 My guess is that it may be related to 32 bit architecture and the way ape
 is
 handling the tree, but really not sure how to debug this one.

 -Carl


 On Wed, May 18, 2011 at 12:26 PM, Stadler Tanja
 tanja.stad...@env.ethz.chwrote:

 
  Hi Carl,
 
  the TreeSim return (arrays of 0s) could be because TreeSim returns 0 when
  the tree goes extinct:
 
  treearray   Array of numbsim trees with the time since origin / most
  recent common ancestor being age. If tree goes extinct (only possible
 when
  mrca = FALSE), returns 0. If only one extant and no extinct tips, returns
 1.
 
  Maybe set  seed and compare runs with complete=FALSE and complete=TRUE?
 
  Cheers Tanja
 
 
 
 
  I get a similar problem when running the equivalent command in TreeSim:
 
 
   age-2
 lambda - 2.0
 mu - 0.5
 frac -0.6
 numbsim-3
 
 sim.bd.age(age,numbsim,lambda,mu,frac,mrca=FALSE,complete=FALSE)
  [[1]]
  [1] 0
 
  [[2]]
  [1] 0
 
  [[3]]
  [1] 0
 
 
 
  --
  Tanja Stadler (nee Gernhard)
 
  ETH Zürich
  Institute of Integrative Biology
  Universitätsstrasse 16
  8092 Zürich
 
  eMail: tanja.stad...@env.ethz.chmailto:tanja.stad...@env.ethz.ch
  Phone: +41 44 632 45 48
  Office: CHN H76.1
 
  http://www.tb.ethz.ch/people/tstadler
 
 


 --
 Carl Boettiger
 UC Davis
 http://www.carlboettiger.info/

[[alternative HTML version deleted]]


 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Model-Selection vs. Finding Models that Fit Well

2011-01-20 Thread Brian O'Meara
I think considering model adequacy is something that would be useful to do
and is not done much now. One general way to do this is to simulate under
your chosen model and see if the real data look very different from the
simulated data. For example, I might try a one rate vs. a two rate Brownian
motion model and find the latter fits better. If the actual true model is an
OU model with two very distinct peaks and strong selection, which is not in
my model set, I'll find that my simulated data under the two rate Brownian
model may look very different from my actual data, which will be fairly
bimodal. Aha, my model is inadequate. [but then what -- keep adding new
models, just report that your model is inadequate, ...?]

Of course, you need a method for evaluating how similar the data look.
There's been some work on this in models for tree inference using posterior
predictive performance (work by Jonathan Bollback and Jeremy Brown come to
mind) or using other approaches (some of Peter Waddell's work), but it
hasn't really taken off yet. It'd be easy to implement such approaches in R
for comparative methods given capabilities in ape, Geiger, and other
packages.

Brian

___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website
Funding wanted: Want to collaborate on a grant?


On Thu, Jan 20, 2011 at 12:27 PM, David Bapst dwba...@uchicago.edu wrote:

 Hello all,
 I'd like to pose a question to this group, as a bit of topical
 discussion. I apologize in advance if I should mangle a concept.

 In many model-based PCMs and some other analyses (such as paleoTS), we
 fit models to data by finding the ML estimates of the parameters
 associated, calculate the maximum support of each model and than
 compare between models with differing parameters using an information
 criterion (AICc being probably the most used). Akaike weights can be
 calculated if we want to consider the relative fit between our models.
 This is contrary to traditional statistics, where alternative
 hypotheses are tested against some null hypothesis. Obviously, the
 later approach has proven to be thorny because rejecting some null
 hypotheses are very difficult (such as a random walk) and some
 situations truly lack a clear null model.

 Recently, I have heard the opinion expressed from workers of disparate
 fields (philosophy, ecology, etc.) that model-choosing methods may
 choose the best model, but with no idea of whether any of the models
 considered fit well to the data or not. In other words, we may have
 fit models A-D, and the best model may have been model C, but none of
 the models compared could describe the 'true' process underlying the
 data at all.

 This view gives me mixed feelings. Certainly, if we are using a
 model-selection approach, we should attempt the range of models that
 make sense for our data, and should particularly include that set of
 simplistic models that we may accept as the most observed process
 (Brownian motion and Ornstein-Uhlenbeck with one optima, perhaps, in
 analyses of trait evolution). Of course, we cannot include models that
 we haven't even considered or are analytically intractable. That's a
 fundamental limitation of science, however, not model-selection based
 analyses.

 This counter-argument did not seem to satisfy the others, who still
 wanted a measure of absolute fit, like an R-squared. Now, perhaps
 I'm confused, but isn't R-squared technically a relative measure of
 fit between a linear model and a random scatter? I suppose the maximum
 support for a model is a measure of absolute fit, but it's not useful
 or interpretable unless I'm comparing it to the support for some other
 model.

 So, it seems like the desire for a measure of absolute fit is not
 well-founded, but maybe I'm wrong. Is there something more we can do
 to show how that the models we've picked aren't arbitrary? Opinions?
 -Dave Bapst, UChicago Geosci

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Fst - DNA sequence data

2010-08-24 Thread Brian O'Meara
You should look at the CRAN task view for genetics:
http://cran.r-project.org/web/views/Genetics.html . I haven't used R
much in that domain, but there seems to be a lot of functionality.
Also look into Bioconductor.

Best,
Brian

On Mon, Aug 23, 2010 at 8:18 AM, blue jay bluejay...@gmail.com wrote:
 Hi, (sorry for cross posting!)

 I want to analyse DNA sequence data (mtDNA) in R as in calculate Fst,
 Heterozygosity and such summary statistics. Package Adagenet converts the
 DNA sequence into retaining only retaining the polymorphic sites and then
 calcuates Fst.. but is there any other way to do this? I mean analyse the
 DNA sequence as it is.. and calculate the statistics?


 Thanks!

        [[alternative HTML version deleted]]

 ___
 R-sig-phylo mailing list
 R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo




-- 
___
Brian O'Meara
Assistant Professor
Dept. of Ecology  Evolutionary Biology
U. of Tennessee, Knoxville
http://www.brianomeara.info

Students wanted: Applications due Dec. 15, annually
Postdoc collaborators wanted: Check NIMBioS' website

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] estimate likelihood of data for given model and parameters

2010-03-19 Thread Brian O'Meara
I have code for this in my TreEvo package (on R-forge -- see  
brownieneglnL in http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/R/brownie.R?rev=2root=treevoview=markup 
)  though note that the package itself is not all cleaned up into  
install.packages()-ready form. You could also dig into geiger's  
fitContinuous function and use that code (look for foo in  
fitContinuous.R) -- OUCH must have similar code. Odd if there's  
nothing more direct.


Best,
Brian

On Mar 19, 2010, at 9:47 PM, Carl Boettiger wrote:


Dear r-sig-phylo,

I'd like to be able to estimate the likelihood of a model with a given
Brownian rate parameter under a given data set.  For OU models, this  
can be
accomplished through the ouch package with the flag fit=FALSE.   
However
there does not seem to be a corresponding option for Brownian motion  
in
ouch/geiger/ape algorithms for fitting a Brownian motion model?   
Perhaps

I've overlooked something?

-Carl

--
Carl Boettiger
Population Biology, UC Davis
http://two.ucdavis.edu/~cboettig

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



Brian C. O'Meara
Asst. Prof., Dept. of Ecology and Evolutionary Biology
University of Tennessee, Knoxville
http://www.brianomeara.info
865-408-TREE (8733)

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[R-sig-phylo] plot.phylo leftwards in ape

2010-03-18 Thread Brian O'Meara
I'm trying to plot two trees opposite each other using ape  
(essentially as in Fig. 4.18 in Paradis' Analysis of Phylogenetics and  
Evolution with R, with some modifications). I was having trouble  
seeing the rootmost line on the plot on the right, so decided to try  
using x.lim as an argument for plot.phylo. Apparently, this doesn't  
work properly with direction=leftwards (it seems to override the  
direction, so it plots it rightwards with bad label placement). A  
quick glance through the code didn't suggest to me how to fix it. This  
will reproduce the behavior:

library(ape)
data(bird.orders)
plot.phylo(bird.orders,direction=r) #looks okay
plot.phylo(bird.orders,direction=r, x.lim=38) #still okay
plot.phylo(bird.orders,direction=l) #still okay, points left
plot.phylo(bird.orders,direction=l, x.lim=38) #problem

It's possible the solution to my original potting issue lies elsewhere  
(I'll keep poking around at it), but I wanted to point out the  
potential x.lim+direction=l bug in any case. I'm using ape 2.5.

Thanks,
Brian


--
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology  Evolutionary Biology
U. of Tennessee, Knoxville


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Bayesian dating in R vs. Multidivtime

2010-01-26 Thread Brian O'Meara
Emmanuel, you should check Lagopus (http://www.christophheibl.de/mdt/mdtinr.html 
), which seems to connect R with multidivtime (based on  
documentation -- I haven't used it).


Best,
Brian


On Jan 26, 2010, at 11:19 AM, Emmanuel Paradis wrote:


Dear all,

I'm planning to do some molecular dating using Bayesian methods, ie,  
what people usually do with Multidivtime. After reading the doc of  
Multidivtime, I'm thinking to do my own implementation in R since it  
is straightforward to specify priors, define and run the Markov  
chain, and calculate posteriors. I have two questions:


1. Has someone attempted a similar implementation?

2. Has someone written an interface between R and Multidivtime?

Many thanks in advance for any reply.

Best,

Emmanuel
--
Emmanuel Paradis
IRD, Montpellier, France
 ph: +33 (0)4 67 16 64 47
fax: +33 (0)4 67 16 64 40
http://ape.mpl.ird.fr/

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


--
Brian O'Meara
http://www.brianomeara.info
Assistant Prof.
Dept. Ecology  Evolutionary Biology
U. of Tennessee, Knoxville

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Phylogenetic reference trees

2009-08-20 Thread Brian O'Meara
There's not yet a good single source. TreeBase.org has some  
topologies, but not all studies go in there yet. The largest one I  
know of there has 1569 taxa . The PhyLoTA browser (http://loco.biosci.arizona.edu/pb/ 
) has trees for many groups and also allows you to download aligned  
data to make your own. You can download NCBI's taxonomy, which is a  
sort of tree, though not highly-resolved or with branch lengths, but  
at least pretty comprehensive. You can also download the tree used by  
the Tree of Life project (http://tolweb.org/tree/home.pages/downloadtree.html 
). There may be a tree available from http://www.timetree.org/,  
but I haven't been able to find it in a form other than a pdf image.


Having a place to download the single best current estimate of the  
full tree of life would be great, and there are a few research groups  
working towards this goal, but it's not ready yet, as far as I know.


Hope this helps,
Brian


Brian C. O'Meara
Asst. Prof., Dept. of Ecology and Evolutionary Biology
University of Tennessee, Knoxville
http://www.brianomeara.info
865-408-TREE (8733)


On Aug 20, 2009, at 7:31 AM, saikari keitele wrote:


Hi,



I'm trying to use the picante r package.

My aim is to use it to construct phylogenetic trees and calculate  
diversity
statistics from species occurrence data in different geographic  
regions.


So, I have a list of species names with abundance information  
(number of

occurrences) and would like to use Phylomatic to construct a tree.

The Phylomatic documentation mentions that a reference tree is  
needed and

different examples of megatrees (
http://svn.phylodiversity.net/tot/megatrees/) are given for this.  
However,
they apply only (I think) to trees and plants. My data includes  
other kinds
of organisms. Do you know of any other downloadable reference trees  
that

include the whole taxonomy?



Thank you very much.



Saikari

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[R-sig-phylo] Fwd: Other: Software.GLMM_R_package

2009-01-29 Thread Brian O'Meara
This may be of interest to some.

Brian

Begin forwarded message:

 From: evol...@evol.biology.mcmaster.ca
 Date: January 29, 2009 4:59:54 AM EST
 To: omeara.br...@gmail.com
 Subject: Other: Software.GLMM_R_package
 Reply-To: br...@helix.biology.mcmaster.ca



 Dear Evoldir members,

 I've written a GLMM package for R that is now downloadable
 (http://cran.r-project.org/web/packages/MCMCglmm/index.html ).

 It fits standard mixed models to several types of distribution  
 (Gaussian,
 Poisson, Exponential, Binary, Binomial, Multinomial, Zero- inflated
 Poisson) More than one response variable can be modeled at the same
 time and these can come from different distributions.  Pedigrees and
 Phylogenies can be fitted as random effects as in animal models and
 the comparative method.  Meta-analysis models can be fitted (even in
 conjunction with other terms, e.g. a phylogeny)

 It uses MCMC, as REML/ML procedures are not very robust  for fitting
 GLMM's.  However, I've kept it simple to use and made it fast.

 There are some other functions in the package for matrix comparisons,
 matrix visualisation, tensor analysis, random number generators on
 pedigrees/phylogenies  + more.

 Thanks,

 Jarrod

 Jarrod Hadfield j.hadfi...@ed.ac.uk




Brian O'Meara
NESCent
Durham, NC
http://www.brianomeara.info





[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] tree thievery

2008-11-06 Thread Brian O'Meara
 as a PNG or GIF

  OR adjust the PDF window so the figure fills it and take a snapshot
of the Window (on Ubuntu: alt-printscreen), save as PNG or GIF

  open g3data

  click on two points on the X and Y axis, fill in values

  click on points

  if you need to compress the display so that you can see the output
actions,
use the View menu or function keys to toggle display of zoom area
(F5),
axis settings (F6), or output properties (F7)

  for multiple series, either click on points in order (e.g. work  
left-

to-right
for each series), then edit your output to put tags on increasing
series,
or output each series to a separate data file

  note that by default g3data will save your data to a file named
after
your graphics file, e.g. mydata.png.dat -- which means that it will
show up in Windows as a file called mydata.png,  with a DAT file
type -- which may be confusing.

  reading into excel: use Data menu to separate into columns

  Wish list for g3data:

csv format output?
series tagging?
keyboard shortcuts for Save (Ctrl-S), Save As (Ctrl-A)?
built-in documentation?


plot(newtree2)

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo




Brian O'Meara
NESCent
Durham, NC
http://www.brianomeara.info

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo