Re: [R-sig-phylo] Model-averaging corHMM models?

Liam J. Revell Thu, 14 Nov 2024 18:14:03 -0800

Dear list.

I'm going to make a general pitch for model averaging ancestral state 
reconstruction -- without solving Rafa's problem that this is not yet 
implemented in /corHMM/.


Setting aside, for a moment, hidden-rate models -- imagine a set of 
standard M/k/ model for a binary trait. There are (reasonably) four such 
models: an ER model, an ARD model, a 0 → 1 model, and a 1 → 0 model.

Now imagine further an unlikely, but possible, scenario in which all of 
our weight of evidence fell on the 0 → 1 and a 1 → 0 models, but 51% 
landed on the former and 49% on the latter. In the former model our 
marginal reconstruction at the root will be unambiguously 0 (assuming 
both 0 & 1 are observed at the tips). Clearly, however, the chances that 
the root node is 0 or 1 are closer to 50:50 than 100:0 (since under our 
nearly equally well-supported 1 → 0 model the same node is unambiguously 
1). The same dilemma holds for more realistic scenarios -- indeed, I've 
seen it in empirical data!

Lastly, here's a suggestion on how to do model averaging with hidden 
states: first "merge" hidden levels, and then model-average using Akaike 
weights (or whatever your preferred approach is to model averaging). 
That is, if a node has a marginal scaled likelihood of "0A" of 0.4 and 
"0B" of 0.3 (for observed states 0/1 and hidden levels A/B), then the 
total probability that that node was in observed state "0" is just 0.4 + 
0.3 = 0.7. This is implemented natively in /phytools::ancr/; however, I 
don't recall if I built it to support hidden states. (For ancestral 
state reconstruction of a/"fitHRM"/ object in /phytools/, however, it is 
possible to "hide" hidden states after reconstruction, then 
model-averaging, including across hidden & "non-hidden" state models 
becomes quite easy.)

Hope this is helpful.

Best wishes. Liam

Liam J. Revell
Professor of Biology, University of Massachusetts Boston
Web: http://faculty.umb.edu/liam.revell/
Book: Phylogenetic Comparative Methods in R 
<https://press.princeton.edu/books/phylogenetic-comparative-methods-in-r> 
(/Princeton University Press/, 2022)


On 11/14/2024 1:19 PM, O'Meara, Brian C via R-sig-phylo wrote:
> CAUTION: EXTERNAL SENDER
>
> Hi, Rafa.
>
> One thing to look at is whether the models are effectively identical; if the 
> 3 state is 2 AICc worse than the 2 state it could be that it’s just adding a 
> single extra parameter with no improvement (i.e, the likelihoods might be the 
> same). In some of our other software we now have code to detect functionally 
> redundant models and remove this redundancy, but I don’t remember whether 
> it’s in corHMM yet.
>
> Assuming the models are somewhat different, much as I like model averaging 
> I’d use the one with lower AICc for ancestral state reconstruction (and yep, 
> glad you brought up all the uncertainties with anc recon). With hisse, with 
> model averaging one can basically model the turnover, net diversification, 
> and other diversification-flavored rates: this edge is 0A in this model, and 
> 0C in that model, but one can still just take the turnover rate from each. 
> It’s partly a benefit from hisse’s relative inattention to the variation in 
> transition rates. But corHMM is all about the transition rates: in one model, 
> we have rates 0A->1A, 0B->1B, while in the other we have 0A->1A, 0B->1B, and 
> 0C->1C: it’s not that the overall rate of 0->1 is the unweighted mean of the 
> 0*->1* rates in each model, as state B might be really rare in one model and 
> common in another. One can think about what happens at equilibrium with the 
> estimated rate matrix, or use the simmap functionality in corHMM to look at 
> empirical frequencies of going from 0->1, but it’s an area where I think we 
> need to think a bit more about how to summarize and present results (for one 
> thing, I think summarizing by wait times (reciprocal of rates) rather than 
> rates can be a better way to summarize, so an ephemeral state with a high 
> outgoing rate that a species is expected to occupy for only a short time does 
> not have a huge impact on the average parameter).
>
> I hope this helps; I’m CCing Jeremy Beaulieu and James Boyko in case they’re 
> not seeing R-sig-phylo posts.
>
> Best,
> Brian
>
> _________________________________________
>
> Brian O’Meara
> He/Him
> Professor, Dept. of Ecology & Evolutionary Biology
> University of Tennessee, Knoxville
>
>
> From: R-sig-phylo<r-sig-phylo-boun...@r-project.org> on behalf of Rafael S 
> Marcondes<raf.marcon...@gmail.com>
> Date: Thursday, November 14, 2024 at 12:11 PM
> To: r-sig-phylo<r-sig-phylo@r-project.org>
> Subject: [R-sig-phylo] Model-averaging corHMM models?
> Hi all,
>
> I have two corHMM models within 2 AICc units of each other. They differ in
> the number of hidden states (2 versus 3). I'm trying to decide how to
> proceed from here and wondering if model-averaging would be the way to go.
> I'm interested in drawing biological conclusions based on
> the parameter estimates, and in doing ancestral character estimation
> (strictly for visualization purposes only--I'm well aware of the caveats of
> over-interpreting ACE states).
>
> corHMM doesn't seem to have an in-built model averaging utility. But by
> analogy with the one in hisse (modelAveRates), it seems as if it's just a
> matter of averaging the parameter estimates based on model AIC weights. Is
> it really that simple? How would I deal with the param present in one model
> and not the other (related to the additional hidden state)? And then would
> it be appropriate to run ACE using the averaged rate parameters? Or would
> it be more appropriate to perform ACE separately for each model and then
> average the node likelihoods instead?
>
> Thanks,
>
> -Rafa
>
> *--*
> *Rafael S. Marcondes, Ph.D. (he/him)*
> (The R in Rafael is pronounced like the h in "hat")
> *https://www.rafaelmarcondes.com/ <https://www.rafaelmarcondes.com/>*
> Faculty Fellow in EEB
> Department of BioSciences
> Rice University
> Houston TX 77005
>
>
> *"Eu quase que nada não sei. Mas desconfio de muita coisa"*
> *"I almost don't know nothing. But I suspect many things"*
>    -João Guimarães Rosa, Brazilian novelist
> (Portuguese original and free English translation by me)
>
>          [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-phylo mailing list -R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive athttp://www.mail-archive.com/r-sig-phylo@r-project.org/
>
>          [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-phylo mailing list -R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive athttp://www.mail-archive.com/r-sig-phylo@r-project.org/

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Model-averaging corHMM models?

Reply via email to