Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-27 Thread dave fournier

>
>> On Oct 25, 2016, at 9:29 AM, dave fournier  
wrote:

>>
>>
>>
>> Unfortunately this problem does not appear to be well posed.
>>
>>Retention = (b0*Area^(th+1))^b
>>
>> If b0, th, and b are the parameter only the product (th+1)*b is 
determined.

>>
>> This comes from noting that powers satisfy
>>
>>
>>
>>(a^b)^c  =  a^(b*c)
>>
>>
>>
>> So your model can be written as
>>
>>
>>
>>  (b0^b) * Area^((th+1)*b)
>>
>
>... which nicely completes the thread since one model had:
>
>th1   =  9.1180e-01
> b01=5.2104e+00
> b11 =  -4.6725e-04
>(th1+1)*b11
>[1] -0.0008932885
>
>
> b0  = 5.2104466   ;b1 =   -0.0004672   ;  th =  0.9118029
>((th+1)*b1)
>[1] -0.0008931943
>
>So both the R's nls2 and AD_MOdel_Builder results yield that same 
predictions for any given data point at least up to four decimal places.

>
>So perhaps there was some other physical interpretation of those 
parameters and there exists an underlying theory that would support 
adding some extra constraints?

>
>--
>>David Winsemius
>Alameda, CA, USA

I'm not really sure what your point is. The OP has two models. One is well
posed and the other is not. I was discussing solution of the former model
which is well posed.  The form of the model, using a, b, and t and x,y to
simplify the expression is

y  = exp( a * exp(b * x^t) )

My point is that there are many models like this where the obvious
parameterization (something like the parameters the user is interested
in) leads to parameter estimates with highly correlated errors.
This does not necessarily mean that
the model is badly determined. The model exists independent of the
particular parameterization. To fix  the ideas assume there are n 
observations

(x_i,y_i)  and simplify by assuming that x_1<=x_2<=...<=x_n
(but not all equal lets hope)

A stable parameterization of the model can often be obtained by
picking as new parameters y1 and yn where y1 and yn are the predicted 
values for y_1 and y_n, that is


   y1  = exp( a * exp(b * x_1^t) )
   yn  = exp( a * exp(b * x_n^t) )

Then one solves for some of the original model parameters in terms of y1 
and yn.

The particular way this is done depends on the model. Often some has some
linear parameters and the procedure is easy. For this model as I showed
one can solve for a and b in terms of y1 and yn.
Then one can fit the model easily with AD Model Builder or
nls2 using this parameterization. nls2 provides the standard errors for
the parameter estimates.  The AD Model Builder solution provides the
estimated variance covariance matrix of the parameter estimates via the
standard maximum likelihood Hessian calculations. It also provides the
covariance for any desired dependent variable.  So one can fit the model
using y1,yn, and t and get the covariance matrix for a,b, and t
in one step. In nls2 one needs to fit the model using y1,yn and then
solve for a and b and run the model again from that point.  That is no big
deal, and I showed how to do it, but it is one more step for the user.

It is interesting to see the correlation matrix for the parameter estimates
and the dependent variables.
std dev  correlation
1  logt -9.233e-02 3.4026e-01  1.
2  c9.7164e-01 3.7006e-02 -0.2690  1.
3  d1.1010e+00 1.7852e-01 -0.7495  0.0325  1.000
4  t9.1180e-01 3.1025e-01  1. -0.2690 -0.749  1.
5  a5.2104e+00 4.3404e-01 -0.9781  0.4191  0.621 -0.9781  1.000
6  b   -4.6725e-04 1.1714e-03  0.9994 -0.2836 -0.726  0.9994 -0.984 1.00

Here the independent variable are c, d, and logt
where y1=c*y_1  y2=d*y2
That is just an additional thing so that one can start with c=1 and d=1
Also logt is used where t=exp(logt) which makes sure t>0.
Notice that the correlation between c and d is 0.0325 which is small.


If  a and b were the parameters of the model

4   t9.1180e-01 3.1025e-01   1.
5   a5.2104e+00 4.3404e-01   -0.9781  1.000
6   b   -4.6725e-04 1.1714e-030.9994 -0.984  1.00

One can see how close to 1 and -1 the correlations are.

Notice that the parameter b is very badly determined.
So rather than saying the model is no good one sees that
the model is no good if one want to get information about
the parameter b. In contrast the parameter a is fairly well
determined and the parameter t is "kind of" determined.

  Because of this parameter confounding nls2 is extremely sensitive
  to the starting parameter values using this parameterization.
  If I change the parameters by about 15% or less as below

  #b0start=5.210445
  #b1_start=-0.0004672373
  #th_start=0.911804
  b0_start=5.7
  b1_start=-0.00055
  th_start=1.05

I get the dreaded

Error in (function (formula, data = parent.frame(), start, control = 
nls

Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-25 Thread dave fournier



 Unfortunately this problem does not appear to be well posed.

Retention = (b0*Area^(th+1))^b

If b0, th, and b are the parameter only the product (th+1)*b is determined.

This comes from noting that powers satisfy



(a^b)^c  =  a^(b*c)



So your model can be written as



  (b0^b) * Area^((th+1)*b)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-24 Thread dave fournier
I've spent quite a bit of time trying to convince people on various 
lists that the solution to these kinds of
problems lies in the stable parameterization of the model.  I write the 
solutions in AD Model Builder because it
is easy.  But R people are generally stuck in R (or mired) so the 
message somehow always gets lost.


 I thought I would bite the bullet and figure out how to do this in R. 
It turns out that one can fit this model
 with only 6 function evaluations using nls2.   I apologize in advance 
for my execrable R code. But it does do

the job.

The data were fit using 3 calls to nls2. For the first call only the 
parameter th was estimated. This converged
in 4 function evaluations. For the second call all three of the 
parameters were estimated
(but the other 2 parameters are different from b0, b1) .This converged 
to the solution in four function evaluations.


The final call to nls2 uses the converged values to calculate b0,b1 and 
theta and starts from there.
As you can see the model is already converged. This final call to nls2 
is used to get the standard error

estimates for the parameters.

>  nls.m1
Nonlinear regression model
  model: Retention ~ expFct1(Area, y1, yn, th)
   data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class 
= "data.frame")

th
0.9862
 residual sum-of-squares: 730

Number of iterations to convergence: 2
Achieved convergence tolerance: 1.616e-06

>  nls.m2
Nonlinear regression model
  model: Retention ~ expFct2(Area, y1, yn, c, d, th)
   data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class 
= "data.frame")

 c  d th
0.9716 1.1010 0.9118
 residual sum-of-squares: 686.8

Number of iterations to convergence: 4
Achieved convergence tolerance: 2.398e-06

>  nls.m
Nonlinear regression model
  model: Retention ~ expFct(Area, b0, b1, th)
   data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class 
= "data.frame")

b0 b1 th
 5.2104452 -0.0004672  0.9118040
 residual sum-of-squares: 686.8

Number of iterations to convergence: 0
Achieved convergence tolerance: 1.384e-06

Parameters:
 Estimate Std. Error t value Pr(>|t|)
b0  5.2104452  0.4999594  10.422 2.29e-07 ***
b1 -0.0004672  0.0013527  -0.345   0.7358
th  0.9118040  0.3583575   2.544   0.0257 *

So what is the stable parameterization for this model.  To simplify let 
x be the independent variable and y be the

 dependent variable and write t instead of th  So the model is

y = exp(b0*exp(b1*x^t) (1)

Now it would be nice to reorder the x's so that they are monotone 
increasing or decreasing, but in this case the
first x is almost the largest and the last x is almost the smallest 
(slight reach) so they will do. Call them x1 and xn.
The new parameters of the model are the predicted y's for x1 and xn.  
Call them y1 and yn.  Note that y1 and yn

are parameters, not observations.


The data were fit using 3 calls to nls2. For the first call only the 
parameter th was estimated. this converged
in 4 function evaluestions. for the second call all three of the 
parameters were estimated
(but the other 2 parameters are different from b0, b1) .This converged 
to the solution in four function evaluations.


The final call to nls2 uses the converged values to calculate b0,b1 and 
theta and starts from there.
as you can see the model is already converged. this final call to nls2 
is used to get the standard error

estimates for the parameters.

>  nls.m1
Nonlinear regression model
  model: Retention ~ expFct1(Area, y1, yn, th)
   data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class 
= "data.frame")

th
0.9862
 residual sum-of-squares: 730

Number of iterations to convergence: 2
Achieved convergence tolerance: 1.616e-06

>  nls.m2
Nonlinear regression model
  model: Retention ~ expFct2(Area, y1, yn, c, d, th)
   data: structure(list(Area = c(5

Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-19 Thread dave fournier
Actually this converges very nicely if you use these starting values 
that I obtained with

AD Model Builder

   th 9.1180e-01
   b05.2104e+00
   b1   -4.6725e-04

The R result looks like

nls.m2
Nonlinear regression model
  model: Retention ~ expFct(Area, b0, b1, th)
   data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 
2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 
1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 
18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 
15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class 
= "data.frame")

b0 b1 th
 5.2104466 -0.0004672  0.9118029
 residual sum-of-squares: 686.8

Number of iterations to convergence: 1
Achieved convergence tolerance: 1.75e-06

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls in r

2015-08-12 Thread dave fournier


I believe that if your try these starting values the sum of squares is 
considerably smaller



a=1.0851e-06
b=1.4596e-01
delta=9.1375e-01

something like SS= 0.005236471 vs  SS= 0.01597071

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hessian Matrix Issue

2011-09-03 Thread dave fournier


I wonder if your code is correct?

I ran your script until an error was reported. the data set
of 30 obs was


[1]  0  0  1  3  3  3  4  4  4  4  5  5  5  5  5  7  7  7  7  7  7  8  9 
10 11

[26] 12 12 12 15 16

I created a tiny AD Model Builder program to do MLE on it.

DATA_SECTION
  init_int nobs
  init_vector y(1,nobs)
PARAMETER_SECTION
  init_number log_mu
  init_number log_alpha
  sdreport_number mu
  sdreport_number tau
  objective_function_value f
PROCEDURE_SECTION
  mu=exp(log_mu);
  tau=1.0+exp(log_alpha);
  for (int i=1;i<=nobs;i++)
  {
f-=log_negbinomial_density(y(i),mu,tau);
  }
It converged quickly and

The eigenvalues of the Hessian were

4.71108977478.27632341

and the estimates and std devs of the parameters mu and tau were

index   name   value  std dev

 3   mu 6.6000e+00 7.7318e-01
 4   tau2.7173e+00 7.8944e-01

where tau is the variance divided by the mean.

This was all so simple that I suspect your (rather difficult to read)
R code is wrong, otherwise R must really suck at this kind of problem.

  Dave

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Complicated nls formula giving singular gradient message

2010-12-22 Thread dave fournier

I don't Think that viewing lack of convergence by some R routine
as a uuseful tool for diagnosing model or data inadequacy is a very 
useful approach. It is far better to fit the model. Then standard

techniques can be employed to investigate these matters. For the
model considered here there are 5 parameters and 96 observations.
So a priori no reason to suspect that the data are insufficient.
So where lies the problem?  Fitting the model and using the very
accurate Hessian approximation provided by AD Model Builder
provides some immediate clues. The eigenvalues of the Hessian are

3.943982727e-08104.6301825150.7527476203.044988959736.68735

so the condition number is about 1.e+13.  With such a badly scaled
problem it is difficult to fit with finite difference approximations
to the derivatives.  The approximate std devs of the parameter
estimates are

  index   name   value  std dev
   1   NS 1.1254e-02 7.1128e-03
   2   LogKi -8.8933e+00 8.2411e-02
   3   LogKi -5.2005e+00 9.2179e-02
   4   LogKi -7.2677e+00 7.7047e-02
   5   BMax   2.1226e+05 5.1699e+03

so there is no initial indication  that the parameter estimates
are badly determined.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solution to differential equation

2010-12-17 Thread dave fournier

Ravi Varadhan wrote:

Because the numerical solution is more flexible. In the example I linked 
to the
population is being fished. This add an extra term which breaks your 
solution.
I don't know where the OP is going with this question, but flexibility 
might be
useful. Also I just like the idea of fitting models defined by DE's to 
data.



When you can obtain `exact' (but not closed-form) solution, why would you
want to use a numerical ODE solver, which has an approximation error of the
order O(dt) or O(dt^2), where `dt' is the time step? Furthermore, a
significant advantage of an exact solution is that you can compute the
solution at any given `t' in one shot, rather than having to march through
time from t=t0 to t=t.  Numerical time-marching schemes make more sense for
systems of nonlinear ODEs.

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology School of Medicine Johns
Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of dave fournier
Sent: Friday, December 17, 2010 11:23 AM
To: r-help@r-project.org
Subject: Re: [R] Solution to differential equation


It is not very difficult to integrate this DE numerically.
For parameter estimation it is a good idea for
stability to use a semi-implicit formulation. The idea is
described here.

 http://otter-rsch.com/admodel/cc4.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Solution to differential equation

2010-12-17 Thread dave fournier


It is not very difficult to integrate this DE numerically.
For parameter estimation it is a good idea for
stability to use a semi-implicit formulation. The idea is
described here.

http://otter-rsch.com/admodel/cc4.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Complicated nls formula giving singular gradient message

2010-12-13 Thread dave fournier

I always enjoy these direct comparisons between different software packages.
I coded this up in AD Model Builder which is freely available at
http://admb-project.org   ADMB calculates exact derivatives via automatic
differentiation so it tends to be more stable for these difficult problems.

The parameter estimates are
# Number of parameters = 3
Objective function value = 307873.  Maximum gradient component = 1.45914e-06
# NS:
0.00865232633386
# LogKi:
-8.98700621813
# BMax:
237135.365156
The objective function is just least squares.
So it looks like SAS did pretty well before dying.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] autocorrelation in count data

2010-11-23 Thread dave fournier

You can fit this model with AD Model Builder's random effects module.
there is an example fitting a Poisson and negative binomial to the 
venerable

polio data set with ar(1) random effects at

   
http://admb-project.org/examples/count-data/negative-binomial-serially-correlated-counts


A big advantage of ADMB is flexible modeling of both the mean and 
overdispersion.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed-effects model for overdispersed count data?

2010-10-25 Thread dave fournier

According to the documentation for glmmADMB if you fit
your model with a statment like


fit =glmm.admb(y~Base*trt+Age+Visit, ...  data=epil2,family="nbinom")

and that the parameter estimates are in

   fit$b  while their estimated standard deviations are
in

fit$stdbeta

so presumably  p values can be constructed from the
quotient

 fit$b/fit$stdbeta

by assuming a t distribution with (somehow) the correct
degrees of freedom.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help: Maximum likelihood estimation

2010-10-22 Thread dave fournier
Actually it is not that difficult to parameterize the covariance matrix 
so that the
optimization is unconstrained. first parameterize the correlation matrix 
and the

standard deviations separately. the std devs can be parameterized as

  sigma_i=exp(x_i) 1<=i<=n

For the correlation matrix parameterize it in terms of its choleski 
decomposition.


this is a lower triangular matrix

   1  0   0
   a_1  a_20
   b _1 b_2 b_3
 


such that  the norm  of each row is 1

to ensure this  form start with

   1
  y_1  1
  y_2  y_3  1
...

 and normalize each row to have norm 1. There are n*(n-1)/2  of these y's.
 together with the n x_i's you have n*(n+1)/2 parameters as you should.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using NLS with a Kappa function

2010-10-12 Thread dave fournier

Actually it just the parameterization that is causing trouble near k=0


let u = (x-z)/a

then the problematic part of your function is

  (1- k*u)^(1/k)

take the log to get

   log(1-k*u)/k

= -(k*u +k^2*u^2/2 + ...)/k

= -(u +k u^2/2 + ..)

so your function is  exp(-u - ku^2/2 - ...)

and use that for khttps://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optim() not finding optimal values

2010-06-28 Thread dave fournier
 If you are going to make this program available for general
  use you want to take every precaution to make it bulletproof.

  This is a fairly informative data set. The model will undoubtedly
  be used on far less informative data.  While the model looks
  pretty simple it is  very challenging from a numerical point of view.
  I took a moment to code it up in AD Model Builder. The true minimum is
  1619.480495 So I think Ravi has finally arrived pretty close to the
answer.
  One way of judging the difficulty of a model is to look at the
  eigenvalues of the Hessian at the minimum. They are

   3.122884668e-09 1.410866202e-08  1866282.520 1.330233652e+13

  so the condition number is around 1.e+21. One begins to see why these
  models are challenging.  The model as formulated represents the state
  of the art in fisheries models circa 1985.
  A lot of progress has been made since that time.
  Using B_t for the biomass and C_t for the catch the equation
  in the code is

B_{t+1} = B_t + r *B_t*(1-B_t/K) -C_t  (1)
  First  notice that
  for (1) to make sense the following conditions must be satisfied

   B_t > 0 for all t
   r > 0
   K>0
  Strictly speaking it is not necessary that B_t<=K but if B_t>K and r
  is large then B_{t+1} could be <0.  So formulation (1) gives
  Murphys law a good chance.  How to fix it. Notice that (1) is really
  a rough approximation to the solution of a differential equation

  B'(t) =  r *B(t)*(1-B(t)/K) -C  (2)

  where in (2) C is a constant catch rate.  To fix (1) we use
  a semi-implicit differencing scheme. Because it is useful to
  allow smaller step sizes than one we denote them by d.

   B_{t+d} = B_t + d* r *B_t*(1-B_{t+d}/K) -d*C_t*B_{t+d}/B_t  (1)

  The idea is that the quantity  1-x with x>0 will be replaced by
  1/(1+x).  Expanding 2 and solving for B_{t+d} yields
  B_{t+d} = (1+d*r) B_t / (1+d*r*B_t/K +d*C_t/B_t)  (3)

   So long as r>0, K>0 C_t>0 then starting from an initial value
   B_0 > 0 ensures that B_t> 0 for all t>0.  We can let
   d=1/nsteps where nsteps is the number of steps in the
   approximate integration for each year
   which can be increased until the solution is judged to be close
   enough to the exact solution from (2)

   Notice that in (3) as C_t --> infinity  B_{t+d} --> 0
   So that you can never catch more fish than you have.

   I coded up this version of the model in AD Model Builder and
   fit it to the data. It is now much more resistant to bad
   starting values for the parameters etc.

   If anyone wants the tpl file for the model in ADMB they can
   contact me off list.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Robust estimation of variance components for a nested design

2010-03-13 Thread dave fournier

If you mean using random effects which have a fat-tailed distribution
this has been available in AD Model Builder's random effects package for
some time now. The general idea is to start with a random effect assumed
to be standard normal and then to transform it by the cumulative dist
function for the normal and then by the inverse of the
cumulative distribution function of the desired distribution function of
the desired distribution. See http://admb-project.org  There is
a list there where you should be able to get advice.



-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Starting estimates for nls Exponential Fit

2009-12-07 Thread dave fournier
Thanks to Dennis Murphy who pointed out that ExponCycles
is undefined. It is an R gotcha. I had shortened the name but R still
remembered it so the script worked but only on my computer.
This should fix that.

ExponValues=c(2018.34,2012.54,2018.85,2023.52,2054.58,2132.61,2247.17,2468.32,27
78.47)
Expon=c(17,18,19,20,21,22,23,24,25)
# Example starting estimate calculation
E=1000.0
y1=2018
yn=2778.47
nobs=9


#keep y1 and yn fixed and get initial value for E
Esp1 <- optim(c(E=E),method ="BFGS",
  function(x)
  {
E=x[1]
a=(yn-y1)/(E^Expon[nobs]-E^Expon[1])
Y0=y1-a*E^Expon[1];
diff=ExponValues-(Y0+a*E^Expon)
return(1000*sum(diff*diff))
  })$par

E=Esp1[1]

Esp <- optim(c(y1=y1,yn=yn,E=E),method ="BFGS",
  function(x)
  {
E=x[3]
a=(x[2]-x[1])/(E^Expon[nobs]-E^Expon[1])
Y0=x[1]-a*E^Expon[1];
diff=ExponValues-(Y0+a*E^Expon)
return(1000*sum(diff*diff))
  })$par

y1=Esp[1]
y2=Esp[2]
E=Esp[3]

a=(y2-y1)/(E^Expon[nobs]-E^Expon[1])
Y0=y1-a*E^Expon[1];




-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Starting estimates for nls Exponential Fit

2009-12-07 Thread dave fournier

I thought maybe my suggestion for reparameterizing this simple problem
was ignored because I didn't supply R code for the problem.
Here it is using optim for the optimization. It converges trivially with
an initial value for E of 1000.
As I stated before, there is nothing at all difficult about this
problem. You simply need to parameterize it properly. Of course that is
not to say that rescaling is not useful as well, but the important thing
is to parameterize the model properly.

ExponValues=c(2018.34,2012.54,2018.85,2023.52,2054.58,2132.61,2247.17,2468.32,27
78.47)
Expon=c(17,18,19,20,21,22,23,24,25)
# Example starting estimate calculation
E=1000.0
y1=2018
yn=2778.47
nobs=9


#keep y1 and yn fixed and get initial value for E
Esp1 <- optim(c(E=E),method ="BFGS",
  function(x)
  {
E=x[1]
a=(yn-y1)/(E^Expon[nobs]-E^Expon[1])
Y0=y1-a*E^Expon[1];
diff=ExponValues-(Y0+a*E^ExponCycles)
return(1000*sum(diff*diff))
  })$par

E=Esp1[1]

Esp <- optim(c(y1=y1,yn=yn,E=E),method ="BFGS",
  function(x)
  {
E=x[3]
a=(x[2]-x[1])/(E^Expon[nobs]-E^Expon[1])
Y0=x[1]-a*E^Expon[1];
diff=ExponValues-(Y0+a*E^ExponCycles)
return(1000*sum(diff*diff))
  })$par

y1=Esp[1]
y2=Esp[2]
E=Esp[3]

a=(y2-y1)/(E^Expon[nobs]-E^Expon[1])
Y0=y1-a*E^Expon[1];





-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Starting estimates for nls Exponential Fit

2009-12-02 Thread dave fournier
Figuring out the best parameterization for this kind of model
is a bit tricky until you get the hang of it.
Let the function be

y_t = y_0 + alpha * E^t

where uppercase Y_t denotes an observed value
and lower case y_t is a predicted value.
Index the times by t_1  t_n
WLOG assume that t_1 is the smallest time and t_n is the largest time.

Now we already have a good idea what the predicted values  y_1 and y_n
should be as we have observations for them. We have the two equations

   y_1 = y_0 + alpha * E^t_1

   y_n = y_0 + alpha * E^t_n

we can solve these for alpha in terms of y_1,y_n,and E giving

 alpha = (y_n-y_1)/(E^t_n -E^t_1)   (1)

and solve for y_0  as

 y_0 = y_1 - alpha * E^t_1  using (1) for alpha

Now use the good estimates Y_1 and Y_n  as the starting values
for y_1 and y_n
and try some "reasonable value for E (say 0.1 < E < 100)

Do the minimization in two stages first holding y_1 and y_n fixed and
then estimate y_1,y_n,and E together. This converges in less than a second
using AD Model Builder and the starting values (large value for E.

2018.34 2778.47 exp(10.0)  where I deliberately
used exp(10) as a large initial value for E.
The parameter estimates together with the est std devs
are

  1   y_1 1.9994e+03 3.9177e-01
  2   y_n 2.7881e+03 6.7557e-01
  3   log_E  5.6391e-01 1.2876e-03
  4   alpha  6.0130e-04 1.9398e-05
  5   y_0 1.9906e+03 4.5907e-01
  6   E  1.7575e+00 4.3935e-02

There are problems for E near 1 which need to be dealt with if
you have to go there, but that is just a technicality.

This idea also works well for a logistic say 3 4 or 5 parameter.














-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maximum likelihood estimation of parameters make no biological sense

2009-09-24 Thread dave fournier

Hi,

Your results are do to using an unstable parameterization
of the Von Bertalanffy growth curve, combined with the unreliable
optimization methods supplied with R.  I coded up your model in
AD Model Builder which supplies exact derivatives through
AD.

I used your starting values and ran the model with no optimization steps
just to se that we had the same value for the -log-likelihood
Results are
# Number of parameters = 5  Objective function value = -11.6954  Maximum
gradien
t component = 0.0
# winf:
24.2720681300
# k:
0.046798440
# t0:
0.001000
# vhat:
0.01000
# b:
1.61760492000

However the R routine is stuck. When I let the ADMB code run it produced

# Number of parameters = 5  Objective function value = -13.8515  Maximum
gradient component = 9.41643e-05
# winf:
15.7188821203
# k:
0.118198731245
# t0:
-32.9089295327
# vhat:
0.00471832483493
# b:
184.999879271

Note that b--> infinity. I have it bounded at 185.
t0--> -infinity  so that the model is only using a small part of the
growth curve which happens to fit the data better.

The estimated correlation matrix for the parameter estimates tells the story

index name  valuestd dev   1   2   3   4   5
 1   winf 1.5719e+01 5.1252e+00   1.
 2   k1.1820e-01 2.7849e-02  -0.9832  1.
 3   t0  -3.2909e+01 7.6867e+00  -0.9748  0.9990  1.
 4   vhat 4.7183e-03 2.0119e-03   0.  0.  0.  1.
 5   b1.8500e+02 1.6374e+00  -0.0002  0.0003 -0.0094  0.  1.

You can see that several of the parameters are highly confounded.

Also the eigenvalues of the Hessian are

  0.01691149331  0.02045399106 963.2994413  2255.900979  4225373.963

So you have a condition number of about 10^8. Very difficult to work
with such a function with only approximate derivatives.

I think the moral of the story is that you should use a more stable
parameterization or an industrial strength estimation system or maybe
both.

  Cheers,

   Dave

   Cheers,

Dave







-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Entire Organization Switching from SAS to R - Any experience?

2009-07-25 Thread dave fournier
It is true that R does not offer support for custom likelihood functions
in nonlinear mixed models. However you can switch to R and use
AD Model Builder's random effects module http://admb-project.org
This is freely available software and it is more flexible than
proc nlmixed. I'm sure there are people involved in the project who
would be happy to help.

   Dave




>We use SAS and R here (a biostat department and consulting unit), in
>part because there are some things SAS does that R doesn't.  In
>particular, we use SAS proc nlmixed with custom likelihood functions.  >R
>has similar capability but does not allow custom likelihood; the >authors
>say adding it would be non-trivial.


-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nlme: problem with fitting logistic function

2009-03-11 Thread dave fournier
I think you can do this very efficiently with AD Model Builder's
random effects module. The software is now freely available at

   http://admb-project.org

If you want, you can contact me directly to discuss the model.

Dave

-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] generalized linear mixed models with a beta distribution

2009-02-26 Thread dave fournier
You can fit this kind of model (and negative binomial) and more
difficult mixed models   with AD Model Builder's random effects module
which is now freely available at

 http://admb-project.org/


-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] AD Model Builder now freely available

2008-11-25 Thread dave fournier
Hi All,

Following Mike Praeger's posting on this list,
I'm happy to pass on that AD Model Builder is now freely available from
the ADMB Foundation.

 http://admb-foundation.org/

Two areas where AD Model builder would be especially useful to R users
are multi-parmater smooth optimization as in larg scale maximum
likelihood estimation and the analysis of general nonlinear random
effects models.

Cheers,

 Dave





-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmer function :method="AGQ" and glmmADMB

2008-04-02 Thread dave fournier
The freely available R package glmmADMB can do Adaptive
Gaussian Quadrature for this type of model,
since it is built using AD Model Builder's random effects
module which incorporates this feature.

There is now a beta version of the software for
people using R on the Mac intel platform.

http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html

Cheers,

Dave

-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmer function :method="AGQ" glmmADMB

2008-04-02 Thread dave fournier
The freely available R package glmmADMB can do Adaptive
Gaussian Quadrature for this type of model,
since it is built using AD Model Builder's random effects
module which incorporates this feature.

There is now a beta version of the software for
people using R on the Mac intel platform.

http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html

Cheers,

Dave

-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] coxme frailty model standard errors?

2007-12-07 Thread dave fournier
While it may be true that for coxme models the "standard errors"
are not very good approximations, it is always useful to have them
to compare with other diagnostics such as likelihood ratios and
profile likelihoods.  It is interesting to hear that with the currently
used methodology

   "Computation of the se of the random effect turns out to be very hard"

because if you simply use AD Model Builders Random Effects module to
formulate the model you will get the standard errors calculated for free
without any more effort.

Cheers,

  Dave

-- 
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.