Re: [R-sig-phylo] Error in phytools rateshift

2023-05-17 Thread Lior Glick
Thanks again, that's very helpful and I'm going to try this out.
I have already tried a log10 transformation, and got a different error
message (using the latest version from github):
> bm_single = rateshift(tree, data, nrates=1)
Error in solve.default(sig * C + E) :
  Lapack routine dgesv: system is exactly singular: U[1,1] = 0
Calls: rateshift ... optim ->  -> fn -> solve -> solve.default
Execution halted

On Wed, May 17, 2023 at 8:27 PM Liam J. Revell  wrote:

> Dear Lior.
>
> I don't know what property of my data creates this phenomenon. Could this
> be related to the very broad range of trait values (spanning two orders of
> magnitude)?
>
> Based on the trait distribution, you might consider transforming your data
> to a log-scale.
>
> 1. Is there a way to let the software determine the optimal number of rate
> shifts? Something like what the SURFACE package does with OU models?
>
> *rateshift* is a likelihood method, so you can fit models with 0, 1, 2,
> 3, shifts (and so on), and then compare models using AIC or likelihood
> ratio tests. As we show in Chapter 5 of our book (
> https://press.princeton.edu/books/phylogenetic-comparative-methods-in-r),
> it's even possible to compare the *rateshift* model to other models, such
> as the "EB" model of *geiger::fitContinuous*, in which the rate of
> evolution changes as a continuous (rather than step) function of time.
>
> 2. Is it possible to test specific hypotheses, i.e. choose the positions
> of rate shifts a-priori and calculate the likelihood?
>
> Yes, this corresponds to the method of O'Meara et al. (2006; *Evolution*)
> and is implemented in the *phytools* function *brownie.lite*. In the case
> of a specific set of hypothesized rate shifts, one simply needs to first
> "paint" these on the tree using the *phytools* function* make.era.map*,
> and then fit your model of interest with *brownie.lite*. This model can
> *also* be compared to *rateshift* using standard methods. We assume that
> each *fitted* (rather than set) rate shift would consume one degree of
> freedom.
>
> 3. Can I forbid the software from placing shifts on branches leading to
> clades smaller than X species?
>
> One *could* constrain *rateshift* to not place a rate shift (say) any
> higher (above the root) than a certain total depth below the tips. (
> *rateshift* doesn't give this control to the user, but it would be an
> easy feature to add.) On the other hand, I'm not sure I would advise doing
> this simply because a rate shift very close to the present is probably
> indicative of unexplained variance in the tip taxa, rather than a genuine
> temporal rate shift. This could be due, for example, to measurement error /
> uncertainty in the species means for the trait we're studying.
>
> I hope this is helpful.
>
> All the best, Liam
> Liam J. Revell
> Professor of Biology, University of Massachusetts Boston
> Web: http://faculty.umb.edu/liam.revell/
> Book: Phylogenetic Comparative Methods in R
> 
> (*Princeton University Press*, 2022)
>
>
> On 5/17/2023 10:37 AM, Lior Glick wrote:
>
> CAUTION: EXTERNAL SENDER
> Hi Liam, and thanks a lot for the quick fix! I highly appreciate it.
> I was able to install from github and the commands now completes
> successfully. Unfortunately, convergence was not reached with 100
> iterations. This is probably not surprising if the likelihood surface is
> flat. I don't know what property of my data creates this phenomenon. Could
> this be related to the very broad range of trait values (spanning two
> orders of magnitude)?
>
> Some other questions related to this type of analysis:
> 1. Is there a way to let the software determine the optimal number of rate
> shifts? Something like what the SURFACE package does with OU models?
> 2. Is it possible to test specific hypotheses, i.e. choose the positions
> of rate shifts a-priori and calculate the likelihood?
> 3. Can I forbid the software from placing shifts on branches leading to
> clades smaller than X species?
>
> As far as I can tell, these features were not implemented in rateshift(),
> so my questions are, whether such analyses have been implemented somewhere
> else, and do you see any reason why this shouldn't work (if I make the
> relevant adjustments to your code).
>
> Thanks again.
> Lior
>
> On Tue, May 16, 2023 at 9:48 PM Liam J. Revell 
> wrote:
>
>> Hi Lior.
>>
>> I just updated *rateshift* in *phytools* on GitHub with these
>> adjustments. It seems to work well and will run your dataset without
>> failing.
>>
>> I did observe an odd thing with the two-rate model for your data which is
>> that the likelihood surface for the position of the rate-shift seems to be
>> totally flat. I don't know what would give your data this property. (In
>> fact, I don't know if I could simulate data with this attribute if I
>> tried.) I found this result to be so worrying that I took your original
>> tree & simulated data containing a 

Re: [R-sig-phylo] Error in phytools rateshift

2023-05-17 Thread Liam J. Revell
Dear Lior.

> I don't know what property of my data creates this phenomenon. Could 
> this be related to the very broad range of trait values (spanning two 
> orders of magnitude)?
Based on the trait distribution, you might consider transforming your 
data to a log-scale.

> 1. Is there a way to let the software determine the optimal number of 
> rate shifts? Something like what the SURFACE package does with OU models?
/rateshift/ is a likelihood method, so you can fit models with 0, 1, 2, 
3, shifts (and so on), and then compare models using AIC or likelihood 
ratio tests. As we show in Chapter 5 of our book 
(https://press.princeton.edu/books/phylogenetic-comparative-methods-in-r), 
it's even possible to compare the /rateshift/ model to other models, 
such as the "EB" model of /geiger::fitContinuous/, in which the rate of 
evolution changes as a continuous (rather than step) function of time.

> 2. Is it possible to test specific hypotheses, i.e. choose the 
> positions of rate shifts a-priori and calculate the likelihood?
Yes, this corresponds to the method of O'Meara et al. (2006; 
/Evolution/) and is implemented in the /phytools/ function 
/brownie.lite/. In the case of a specific set of hypothesized rate 
shifts, one simply needs to first "paint" these on the tree using the 
/phytools/ function/make.era.map/, and then fit your model of interest 
with /brownie.lite/. This model can *also* be compared to /rateshift/ 
using standard methods. We assume that each /fitted/ (rather than set) 
rate shift would consume one degree of freedom.

> 3. Can I forbid the software from placing shifts on branches leading 
> to clades smaller than X species?
One *could* constrain /rateshift/ to not place a rate shift (say) any 
higher (above the root) than a certain total depth below the tips. 
(/rateshift/ doesn't give this control to the user, but it would be an 
easy feature to add.) On the other hand, I'm not sure I would advise 
doing this simply because a rate shift very close to the present is 
probably indicative of unexplained variance in the tip taxa, rather than 
a genuine temporal rate shift. This could be due, for example, to 
measurement error / uncertainty in the species means for the trait we're 
studying.

I hope this is helpful.

All the best, Liam

Liam J. Revell
Professor of Biology, University of Massachusetts Boston
Web: http://faculty.umb.edu/liam.revell/
Book: Phylogenetic Comparative Methods in R 
 
(/Princeton University Press/, 2022)


On 5/17/2023 10:37 AM, Lior Glick wrote:
> CAUTION: EXTERNAL SENDER
> Hi Liam, and thanks a lot for the quick fix! I highly appreciate it.
> I was able to install from github and the commands now completes 
> successfully. Unfortunately, convergence was not reached with 100 
> iterations. This is probably not surprising if the likelihood surface 
> is flat. I don't know what property of my data creates this 
> phenomenon. Could this be related to the very broad range of trait 
> values (spanning two orders of magnitude)?
>
> Some other questions related to this type of analysis:
> 1. Is there a way to let the software determine the optimal number of 
> rate shifts? Something like what the SURFACE package does with OU models?
> 2. Is it possible to test specific hypotheses, i.e. choose the 
> positions of rate shifts a-priori and calculate the likelihood?
> 3. Can I forbid the software from placing shifts on branches leading 
> to clades smaller than X species?
>
> As far as I can tell, these features were not implemented in 
> rateshift(), so my questions are, whether such analyses have been 
> implemented somewhere else, and do you see any reason why this 
> shouldn't work (if I make the relevant adjustments to your code).
>
> Thanks again.
> Lior
>
> On Tue, May 16, 2023 at 9:48 PM Liam J. Revell  
> wrote:
>
> Hi Lior.
>
> I just updated /rateshift/ in /phytools/ on GitHub with these
> adjustments. It seems to work well and will run your dataset
> without failing.
>
> I did observe an odd thing with the two-rate model for your data
> which is that the likelihood surface for the position of the
> rate-shift seems to be totally flat. I don't know what would give
> your data this property. (In fact, I don't know if I could
> simulate data with this attribute if I tried.) I found this result
> to be so worrying that I took your original tree & simulated data
> containing a single rate shift -- but I found that I was able to
> recover the temporal position of this shift without an enormous
> amount of trouble (though optimization /is/ very slow). This
> suggests that I am not seeing the result of a software bug or an
> idiosyncratic problem with the tree.
>
> To update /phytools/ from GitHub you just need to have the
> /devtools/ CRAN package installed and then run
> /devtools:::install_github("liamrevell/phytools")/ 

Re: [R-sig-phylo] Error in phytools rateshift

2023-05-17 Thread Lior Glick
Hi Liam, and thanks a lot for the quick fix! I highly appreciate it.
I was able to install from github and the commands now completes
successfully. Unfortunately, convergence was not reached with 100
iterations. This is probably not surprising if the likelihood surface is
flat. I don't know what property of my data creates this phenomenon. Could
this be related to the very broad range of trait values (spanning two
orders of magnitude)?

Some other questions related to this type of analysis:
1. Is there a way to let the software determine the optimal number of rate
shifts? Something like what the SURFACE package does with OU models?
2. Is it possible to test specific hypotheses, i.e. choose the positions of
rate shifts a-priori and calculate the likelihood?
3. Can I forbid the software from placing shifts on branches leading to
clades smaller than X species?

As far as I can tell, these features were not implemented in rateshift(),
so my questions are, whether such analyses have been implemented somewhere
else, and do you see any reason why this shouldn't work (if I make the
relevant adjustments to your code).

Thanks again.
Lior

On Tue, May 16, 2023 at 9:48 PM Liam J. Revell  wrote:

> Hi Lior.
>
> I just updated *rateshift* in *phytools* on GitHub with these
> adjustments. It seems to work well and will run your dataset without
> failing.
>
> I did observe an odd thing with the two-rate model for your data which is
> that the likelihood surface for the position of the rate-shift seems to be
> totally flat. I don't know what would give your data this property. (In
> fact, I don't know if I could simulate data with this attribute if I
> tried.) I found this result to be so worrying that I took your original
> tree & simulated data containing a single rate shift -- but I found that I
> was able to recover the temporal position of this shift without an enormous
> amount of trouble (though optimization *is* very slow). This suggests
> that I am not seeing the result of a software bug or an idiosyncratic
> problem with the tree.
>
> To update *phytools* from GitHub you just need to have the *devtools*
> CRAN package installed and then run
> *devtools:::install_github("liamrevell/phytools")* from the command line
> of an interactive session in R.
>
> If you want to try to visualize the likelihood surface for your rate
> shift, you make want to try this *phytools* function
> *likeSurface.rateshift*. See more information here:
> http://blog.phytools.org/2016/08/function-to-plot-likelihood-surface-for.html.
> Unfortunately, it is also bound to be pretty slow for larger phylogenetic
> trees.
>
> Let us know if this is of help.
>
> Sincerely, Liam
> Liam J. Revell
> Professor of Biology, University of Massachusetts Boston
> Web: http://faculty.umb.edu/liam.revell/
> Book: Phylogenetic Comparative Methods in R
> 
> (*Princeton University Press*, 2022)
>
>
> On 5/16/2023 11:09 AM, Liam J. Revell wrote:
>
> CAUTION: EXTERNAL SENDER
>
> Hi Lior.
>
> I'm working on this for you now.
>
> Basically a lot of optimization iterations with /rateshift/ tend to
> fail, so I'm basically doing two things: (1) preventing a failed
> optimization from causing the whole run to fail using /try/; and (2)
> allowing the user to parallelize the multiple optimization iterations
> using /foreach/.
>
> When I get both of these fixes working I'll push the update to the
> /phytools/ GitHub and you can install the package from there.
>
> All the best, Liam
>
> Liam J. Revell
> Professor of Biology, University of Massachusetts Boston
> Web: 
> https://nam10.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffaculty.umb.edu%2Fliam.revell%2F=05%7C01%7Cliam.revell%40umb.edu%7Ce62e1382110f40edf43908db561fa1e3%7Cb97188711ee94425953c1ace1373eb38%7C0%7C0%7C638198466088187811%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=7MurBoMRR1PkMdQi3CFEwceft03vicc%2Br1Eyou7MOi8%3D=0
>  
> Book: Phylogenetic Comparative Methods in 
> R
>  
> (/Princeton University Press/, 2022)
>
>
> On 5/16/2023 9:57 AM, Lior Glick wrote:
>
> CAUTION: EXTERNAL SENDER
> Hello,
> I am trying to run the rateshift function from phytools on a rather
> large phylogeny (500 species).
> With nrates=1 (null model) everything works fine. Using nrates=2 also
> completed successfully, but did not reach convergence:
>
> bm_multi =