Re: [R] Finding starting values for the parameters using nls() or nls2()
> >> On Oct 25, 2016, at 9:29 AM, dave fournier wrote: >> >> >> >> Unfortunately this problem does not appear to be well posed. >> >>Retention = (b0*Area^(th+1))^b >> >> If b0, th, and b are the parameter only the product (th+1)*b is determined. >> >> This comes from noting that powers satisfy >> >> >> >>(a^b)^c = a^(b*c) >> >> >> >> So your model can be written as >> >> >> >> (b0^b) * Area^((th+1)*b) >> > >... which nicely completes the thread since one model had: > >th1 = 9.1180e-01 > b01=5.2104e+00 > b11 = -4.6725e-04 >(th1+1)*b11 >[1] -0.0008932885 > > > b0 = 5.2104466 ;b1 = -0.0004672 ; th = 0.9118029 >((th+1)*b1) >[1] -0.0008931943 > >So both the R's nls2 and AD_MOdel_Builder results yield that same predictions for any given data point at least up to four decimal places. > >So perhaps there was some other physical interpretation of those parameters and there exists an underlying theory that would support adding some extra constraints? > >-- >>David Winsemius >Alameda, CA, USA I'm not really sure what your point is. The OP has two models. One is well posed and the other is not. I was discussing solution of the former model which is well posed. The form of the model, using a, b, and t and x,y to simplify the expression is y = exp( a * exp(b * x^t) ) My point is that there are many models like this where the obvious parameterization (something like the parameters the user is interested in) leads to parameter estimates with highly correlated errors. This does not necessarily mean that the model is badly determined. The model exists independent of the particular parameterization. To fix the ideas assume there are n observations (x_i,y_i) and simplify by assuming that x_1<=x_2<=...<=x_n (but not all equal lets hope) A stable parameterization of the model can often be obtained by picking as new parameters y1 and yn where y1 and yn are the predicted values for y_1 and y_n, that is y1 = exp( a * exp(b * x_1^t) ) yn = exp( a * exp(b * x_n^t) ) Then one solves for some of the original model parameters in terms of y1 and yn. The particular way this is done depends on the model. Often some has some linear parameters and the procedure is easy. For this model as I showed one can solve for a and b in terms of y1 and yn. Then one can fit the model easily with AD Model Builder or nls2 using this parameterization. nls2 provides the standard errors for the parameter estimates. The AD Model Builder solution provides the estimated variance covariance matrix of the parameter estimates via the standard maximum likelihood Hessian calculations. It also provides the covariance for any desired dependent variable. So one can fit the model using y1,yn, and t and get the covariance matrix for a,b, and t in one step. In nls2 one needs to fit the model using y1,yn and then solve for a and b and run the model again from that point. That is no big deal, and I showed how to do it, but it is one more step for the user. It is interesting to see the correlation matrix for the parameter estimates and the dependent variables. std dev correlation 1 logt -9.233e-02 3.4026e-01 1. 2 c9.7164e-01 3.7006e-02 -0.2690 1. 3 d1.1010e+00 1.7852e-01 -0.7495 0.0325 1.000 4 t9.1180e-01 3.1025e-01 1. -0.2690 -0.749 1. 5 a5.2104e+00 4.3404e-01 -0.9781 0.4191 0.621 -0.9781 1.000 6 b -4.6725e-04 1.1714e-03 0.9994 -0.2836 -0.726 0.9994 -0.984 1.00 Here the independent variable are c, d, and logt where y1=c*y_1 y2=d*y2 That is just an additional thing so that one can start with c=1 and d=1 Also logt is used where t=exp(logt) which makes sure t>0. Notice that the correlation between c and d is 0.0325 which is small. If a and b were the parameters of the model 4 t9.1180e-01 3.1025e-01 1. 5 a5.2104e+00 4.3404e-01 -0.9781 1.000 6 b -4.6725e-04 1.1714e-030.9994 -0.984 1.00 One can see how close to 1 and -1 the correlations are. Notice that the parameter b is very badly determined. So rather than saying the model is no good one sees that the model is no good if one want to get information about the parameter b. In contrast the parameter a is fairly well determined and the parameter t is "kind of" determined. Because of this parameter confounding nls2 is extremely sensitive to the starting parameter values using this parameterization. If I change the parameters by about 15% or less as below #b0start=5.210445 #b1_start=-0.0004672373 #th_start=0.911804 b0_start=5.7 b1_start=-0.00055 th_start=1.05 I get the dreaded Error in (function (formula, data = parent.frame(), start, control = nls
Re: [R] Finding starting values for the parameters using nls() or nls2()
Unfortunately this problem does not appear to be well posed. Retention = (b0*Area^(th+1))^b If b0, th, and b are the parameter only the product (th+1)*b is determined. This comes from noting that powers satisfy (a^b)^c = a^(b*c) So your model can be written as (b0^b) * Area^((th+1)*b) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding starting values for the parameters using nls() or nls2()
I've spent quite a bit of time trying to convince people on various lists that the solution to these kinds of problems lies in the stable parameterization of the model. I write the solutions in AD Model Builder because it is easy. But R people are generally stuck in R (or mired) so the message somehow always gets lost. I thought I would bite the bullet and figure out how to do this in R. It turns out that one can fit this model with only 6 function evaluations using nls2. I apologize in advance for my execrable R code. But it does do the job. The data were fit using 3 calls to nls2. For the first call only the parameter th was estimated. This converged in 4 function evaluations. For the second call all three of the parameters were estimated (but the other 2 parameters are different from b0, b1) .This converged to the solution in four function evaluations. The final call to nls2 uses the converged values to calculate b0,b1 and theta and starts from there. As you can see the model is already converged. This final call to nls2 is used to get the standard error estimates for the parameters. > nls.m1 Nonlinear regression model model: Retention ~ expFct1(Area, y1, yn, th) data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class = "data.frame") th 0.9862 residual sum-of-squares: 730 Number of iterations to convergence: 2 Achieved convergence tolerance: 1.616e-06 > nls.m2 Nonlinear regression model model: Retention ~ expFct2(Area, y1, yn, c, d, th) data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class = "data.frame") c d th 0.9716 1.1010 0.9118 residual sum-of-squares: 686.8 Number of iterations to convergence: 4 Achieved convergence tolerance: 2.398e-06 > nls.m Nonlinear regression model model: Retention ~ expFct(Area, b0, b1, th) data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class = "data.frame") b0 b1 th 5.2104452 -0.0004672 0.9118040 residual sum-of-squares: 686.8 Number of iterations to convergence: 0 Achieved convergence tolerance: 1.384e-06 Parameters: Estimate Std. Error t value Pr(>|t|) b0 5.2104452 0.4999594 10.422 2.29e-07 *** b1 -0.0004672 0.0013527 -0.345 0.7358 th 0.9118040 0.3583575 2.544 0.0257 * So what is the stable parameterization for this model. To simplify let x be the independent variable and y be the dependent variable and write t instead of th So the model is y = exp(b0*exp(b1*x^t) (1) Now it would be nice to reorder the x's so that they are monotone increasing or decreasing, but in this case the first x is almost the largest and the last x is almost the smallest (slight reach) so they will do. Call them x1 and xn. The new parameters of the model are the predicted y's for x1 and xn. Call them y1 and yn. Note that y1 and yn are parameters, not observations. The data were fit using 3 calls to nls2. For the first call only the parameter th was estimated. this converged in 4 function evaluestions. for the second call all three of the parameters were estimated (but the other 2 parameters are different from b0, b1) .This converged to the solution in four function evaluations. The final call to nls2 uses the converged values to calculate b0,b1 and theta and starts from there. as you can see the model is already converged. this final call to nls2 is used to get the standard error estimates for the parameters. > nls.m1 Nonlinear regression model model: Retention ~ expFct1(Area, y1, yn, th) data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class = "data.frame") th 0.9862 residual sum-of-squares: 730 Number of iterations to convergence: 2 Achieved convergence tolerance: 1.616e-06 > nls.m2 Nonlinear regression model model: Retention ~ expFct2(Area, y1, yn, c, d, th) data: structure(list(Area = c(5
Re: [R] Finding starting values for the parameters using nls() or nls2()
Actually this converges very nicely if you use these starting values that I obtained with AD Model Builder th 9.1180e-01 b05.2104e+00 b1 -4.6725e-04 The R result looks like nls.m2 Nonlinear regression model model: Retention ~ expFct(Area, b0, b1, th) data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class = "data.frame") b0 b1 th 5.2104466 -0.0004672 0.9118029 residual sum-of-squares: 686.8 Number of iterations to convergence: 1 Achieved convergence tolerance: 1.75e-06 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nls in r
I believe that if your try these starting values the sum of squares is considerably smaller a=1.0851e-06 b=1.4596e-01 delta=9.1375e-01 something like SS= 0.005236471 vs SS= 0.01597071 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hessian Matrix Issue
I wonder if your code is correct? I ran your script until an error was reported. the data set of 30 obs was [1] 0 0 1 3 3 3 4 4 4 4 5 5 5 5 5 7 7 7 7 7 7 8 9 10 11 [26] 12 12 12 15 16 I created a tiny AD Model Builder program to do MLE on it. DATA_SECTION init_int nobs init_vector y(1,nobs) PARAMETER_SECTION init_number log_mu init_number log_alpha sdreport_number mu sdreport_number tau objective_function_value f PROCEDURE_SECTION mu=exp(log_mu); tau=1.0+exp(log_alpha); for (int i=1;i<=nobs;i++) { f-=log_negbinomial_density(y(i),mu,tau); } It converged quickly and The eigenvalues of the Hessian were 4.71108977478.27632341 and the estimates and std devs of the parameters mu and tau were index name value std dev 3 mu 6.6000e+00 7.7318e-01 4 tau2.7173e+00 7.8944e-01 where tau is the variance divided by the mean. This was all so simple that I suspect your (rather difficult to read) R code is wrong, otherwise R must really suck at this kind of problem. Dave __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Complicated nls formula giving singular gradient message
I don't Think that viewing lack of convergence by some R routine as a uuseful tool for diagnosing model or data inadequacy is a very useful approach. It is far better to fit the model. Then standard techniques can be employed to investigate these matters. For the model considered here there are 5 parameters and 96 observations. So a priori no reason to suspect that the data are insufficient. So where lies the problem? Fitting the model and using the very accurate Hessian approximation provided by AD Model Builder provides some immediate clues. The eigenvalues of the Hessian are 3.943982727e-08104.6301825150.7527476203.044988959736.68735 so the condition number is about 1.e+13. With such a badly scaled problem it is difficult to fit with finite difference approximations to the derivatives. The approximate std devs of the parameter estimates are index name value std dev 1 NS 1.1254e-02 7.1128e-03 2 LogKi -8.8933e+00 8.2411e-02 3 LogKi -5.2005e+00 9.2179e-02 4 LogKi -7.2677e+00 7.7047e-02 5 BMax 2.1226e+05 5.1699e+03 so there is no initial indication that the parameter estimates are badly determined. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solution to differential equation
Ravi Varadhan wrote: Because the numerical solution is more flexible. In the example I linked to the population is being fished. This add an extra term which breaks your solution. I don't know where the OP is going with this question, but flexibility might be useful. Also I just like the idea of fitting models defined by DE's to data. When you can obtain `exact' (but not closed-form) solution, why would you want to use a numerical ODE solver, which has an approximation error of the order O(dt) or O(dt^2), where `dt' is the time step? Furthermore, a significant advantage of an exact solution is that you can compute the solution at any given `t' in one shot, rather than having to march through time from t=t0 to t=t. Numerical time-marching schemes make more sense for systems of nonlinear ODEs. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of dave fournier Sent: Friday, December 17, 2010 11:23 AM To: r-help@r-project.org Subject: Re: [R] Solution to differential equation It is not very difficult to integrate this DE numerically. For parameter estimation it is a good idea for stability to use a semi-implicit formulation. The idea is described here. http://otter-rsch.com/admodel/cc4.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solution to differential equation
It is not very difficult to integrate this DE numerically. For parameter estimation it is a good idea for stability to use a semi-implicit formulation. The idea is described here. http://otter-rsch.com/admodel/cc4.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Complicated nls formula giving singular gradient message
I always enjoy these direct comparisons between different software packages. I coded this up in AD Model Builder which is freely available at http://admb-project.org ADMB calculates exact derivatives via automatic differentiation so it tends to be more stable for these difficult problems. The parameter estimates are # Number of parameters = 3 Objective function value = 307873. Maximum gradient component = 1.45914e-06 # NS: 0.00865232633386 # LogKi: -8.98700621813 # BMax: 237135.365156 The objective function is just least squares. So it looks like SAS did pretty well before dying. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] autocorrelation in count data
You can fit this model with AD Model Builder's random effects module. there is an example fitting a Poisson and negative binomial to the venerable polio data set with ar(1) random effects at http://admb-project.org/examples/count-data/negative-binomial-serially-correlated-counts A big advantage of ADMB is flexible modeling of both the mean and overdispersion. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mixed-effects model for overdispersed count data?
According to the documentation for glmmADMB if you fit your model with a statment like fit =glmm.admb(y~Base*trt+Age+Visit, ... data=epil2,family="nbinom") and that the parameter estimates are in fit$b while their estimated standard deviations are in fit$stdbeta so presumably p values can be constructed from the quotient fit$b/fit$stdbeta by assuming a t distribution with (somehow) the correct degrees of freedom. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help: Maximum likelihood estimation
Actually it is not that difficult to parameterize the covariance matrix so that the optimization is unconstrained. first parameterize the correlation matrix and the standard deviations separately. the std devs can be parameterized as sigma_i=exp(x_i) 1<=i<=n For the correlation matrix parameterize it in terms of its choleski decomposition. this is a lower triangular matrix 1 0 0 a_1 a_20 b _1 b_2 b_3 such that the norm of each row is 1 to ensure this form start with 1 y_1 1 y_2 y_3 1 ... and normalize each row to have norm 1. There are n*(n-1)/2 of these y's. together with the n x_i's you have n*(n+1)/2 parameters as you should. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using NLS with a Kappa function
Actually it just the parameterization that is causing trouble near k=0 let u = (x-z)/a then the problematic part of your function is (1- k*u)^(1/k) take the log to get log(1-k*u)/k = -(k*u +k^2*u^2/2 + ...)/k = -(u +k u^2/2 + ..) so your function is exp(-u - ku^2/2 - ...) and use that for khttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] optim() not finding optimal values
If you are going to make this program available for general use you want to take every precaution to make it bulletproof. This is a fairly informative data set. The model will undoubtedly be used on far less informative data. While the model looks pretty simple it is very challenging from a numerical point of view. I took a moment to code it up in AD Model Builder. The true minimum is 1619.480495 So I think Ravi has finally arrived pretty close to the answer. One way of judging the difficulty of a model is to look at the eigenvalues of the Hessian at the minimum. They are 3.122884668e-09 1.410866202e-08 1866282.520 1.330233652e+13 so the condition number is around 1.e+21. One begins to see why these models are challenging. The model as formulated represents the state of the art in fisheries models circa 1985. A lot of progress has been made since that time. Using B_t for the biomass and C_t for the catch the equation in the code is B_{t+1} = B_t + r *B_t*(1-B_t/K) -C_t (1) First notice that for (1) to make sense the following conditions must be satisfied B_t > 0 for all t r > 0 K>0 Strictly speaking it is not necessary that B_t<=K but if B_t>K and r is large then B_{t+1} could be <0. So formulation (1) gives Murphys law a good chance. How to fix it. Notice that (1) is really a rough approximation to the solution of a differential equation B'(t) = r *B(t)*(1-B(t)/K) -C (2) where in (2) C is a constant catch rate. To fix (1) we use a semi-implicit differencing scheme. Because it is useful to allow smaller step sizes than one we denote them by d. B_{t+d} = B_t + d* r *B_t*(1-B_{t+d}/K) -d*C_t*B_{t+d}/B_t (1) The idea is that the quantity 1-x with x>0 will be replaced by 1/(1+x). Expanding 2 and solving for B_{t+d} yields B_{t+d} = (1+d*r) B_t / (1+d*r*B_t/K +d*C_t/B_t) (3) So long as r>0, K>0 C_t>0 then starting from an initial value B_0 > 0 ensures that B_t> 0 for all t>0. We can let d=1/nsteps where nsteps is the number of steps in the approximate integration for each year which can be increased until the solution is judged to be close enough to the exact solution from (2) Notice that in (3) as C_t --> infinity B_{t+d} --> 0 So that you can never catch more fish than you have. I coded up this version of the model in AD Model Builder and fit it to the data. It is now much more resistant to bad starting values for the parameters etc. If anyone wants the tpl file for the model in ADMB they can contact me off list. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Robust estimation of variance components for a nested design
If you mean using random effects which have a fat-tailed distribution this has been available in AD Model Builder's random effects package for some time now. The general idea is to start with a random effect assumed to be standard normal and then to transform it by the cumulative dist function for the normal and then by the inverse of the cumulative distribution function of the desired distribution function of the desired distribution. See http://admb-project.org There is a list there where you should be able to get advice. -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Starting estimates for nls Exponential Fit
Thanks to Dennis Murphy who pointed out that ExponCycles is undefined. It is an R gotcha. I had shortened the name but R still remembered it so the script worked but only on my computer. This should fix that. ExponValues=c(2018.34,2012.54,2018.85,2023.52,2054.58,2132.61,2247.17,2468.32,27 78.47) Expon=c(17,18,19,20,21,22,23,24,25) # Example starting estimate calculation E=1000.0 y1=2018 yn=2778.47 nobs=9 #keep y1 and yn fixed and get initial value for E Esp1 <- optim(c(E=E),method ="BFGS", function(x) { E=x[1] a=(yn-y1)/(E^Expon[nobs]-E^Expon[1]) Y0=y1-a*E^Expon[1]; diff=ExponValues-(Y0+a*E^Expon) return(1000*sum(diff*diff)) })$par E=Esp1[1] Esp <- optim(c(y1=y1,yn=yn,E=E),method ="BFGS", function(x) { E=x[3] a=(x[2]-x[1])/(E^Expon[nobs]-E^Expon[1]) Y0=x[1]-a*E^Expon[1]; diff=ExponValues-(Y0+a*E^Expon) return(1000*sum(diff*diff)) })$par y1=Esp[1] y2=Esp[2] E=Esp[3] a=(y2-y1)/(E^Expon[nobs]-E^Expon[1]) Y0=y1-a*E^Expon[1]; -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Starting estimates for nls Exponential Fit
I thought maybe my suggestion for reparameterizing this simple problem was ignored because I didn't supply R code for the problem. Here it is using optim for the optimization. It converges trivially with an initial value for E of 1000. As I stated before, there is nothing at all difficult about this problem. You simply need to parameterize it properly. Of course that is not to say that rescaling is not useful as well, but the important thing is to parameterize the model properly. ExponValues=c(2018.34,2012.54,2018.85,2023.52,2054.58,2132.61,2247.17,2468.32,27 78.47) Expon=c(17,18,19,20,21,22,23,24,25) # Example starting estimate calculation E=1000.0 y1=2018 yn=2778.47 nobs=9 #keep y1 and yn fixed and get initial value for E Esp1 <- optim(c(E=E),method ="BFGS", function(x) { E=x[1] a=(yn-y1)/(E^Expon[nobs]-E^Expon[1]) Y0=y1-a*E^Expon[1]; diff=ExponValues-(Y0+a*E^ExponCycles) return(1000*sum(diff*diff)) })$par E=Esp1[1] Esp <- optim(c(y1=y1,yn=yn,E=E),method ="BFGS", function(x) { E=x[3] a=(x[2]-x[1])/(E^Expon[nobs]-E^Expon[1]) Y0=x[1]-a*E^Expon[1]; diff=ExponValues-(Y0+a*E^ExponCycles) return(1000*sum(diff*diff)) })$par y1=Esp[1] y2=Esp[2] E=Esp[3] a=(y2-y1)/(E^Expon[nobs]-E^Expon[1]) Y0=y1-a*E^Expon[1]; -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Starting estimates for nls Exponential Fit
Figuring out the best parameterization for this kind of model is a bit tricky until you get the hang of it. Let the function be y_t = y_0 + alpha * E^t where uppercase Y_t denotes an observed value and lower case y_t is a predicted value. Index the times by t_1 t_n WLOG assume that t_1 is the smallest time and t_n is the largest time. Now we already have a good idea what the predicted values y_1 and y_n should be as we have observations for them. We have the two equations y_1 = y_0 + alpha * E^t_1 y_n = y_0 + alpha * E^t_n we can solve these for alpha in terms of y_1,y_n,and E giving alpha = (y_n-y_1)/(E^t_n -E^t_1) (1) and solve for y_0 as y_0 = y_1 - alpha * E^t_1 using (1) for alpha Now use the good estimates Y_1 and Y_n as the starting values for y_1 and y_n and try some "reasonable value for E (say 0.1 < E < 100) Do the minimization in two stages first holding y_1 and y_n fixed and then estimate y_1,y_n,and E together. This converges in less than a second using AD Model Builder and the starting values (large value for E. 2018.34 2778.47 exp(10.0) where I deliberately used exp(10) as a large initial value for E. The parameter estimates together with the est std devs are 1 y_1 1.9994e+03 3.9177e-01 2 y_n 2.7881e+03 6.7557e-01 3 log_E 5.6391e-01 1.2876e-03 4 alpha 6.0130e-04 1.9398e-05 5 y_0 1.9906e+03 4.5907e-01 6 E 1.7575e+00 4.3935e-02 There are problems for E near 1 which need to be dealt with if you have to go there, but that is just a technicality. This idea also works well for a logistic say 3 4 or 5 parameter. -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum likelihood estimation of parameters make no biological sense
Hi, Your results are do to using an unstable parameterization of the Von Bertalanffy growth curve, combined with the unreliable optimization methods supplied with R. I coded up your model in AD Model Builder which supplies exact derivatives through AD. I used your starting values and ran the model with no optimization steps just to se that we had the same value for the -log-likelihood Results are # Number of parameters = 5 Objective function value = -11.6954 Maximum gradien t component = 0.0 # winf: 24.2720681300 # k: 0.046798440 # t0: 0.001000 # vhat: 0.01000 # b: 1.61760492000 However the R routine is stuck. When I let the ADMB code run it produced # Number of parameters = 5 Objective function value = -13.8515 Maximum gradient component = 9.41643e-05 # winf: 15.7188821203 # k: 0.118198731245 # t0: -32.9089295327 # vhat: 0.00471832483493 # b: 184.999879271 Note that b--> infinity. I have it bounded at 185. t0--> -infinity so that the model is only using a small part of the growth curve which happens to fit the data better. The estimated correlation matrix for the parameter estimates tells the story index name valuestd dev 1 2 3 4 5 1 winf 1.5719e+01 5.1252e+00 1. 2 k1.1820e-01 2.7849e-02 -0.9832 1. 3 t0 -3.2909e+01 7.6867e+00 -0.9748 0.9990 1. 4 vhat 4.7183e-03 2.0119e-03 0. 0. 0. 1. 5 b1.8500e+02 1.6374e+00 -0.0002 0.0003 -0.0094 0. 1. You can see that several of the parameters are highly confounded. Also the eigenvalues of the Hessian are 0.01691149331 0.02045399106 963.2994413 2255.900979 4225373.963 So you have a condition number of about 10^8. Very difficult to work with such a function with only approximate derivatives. I think the moral of the story is that you should use a more stable parameterization or an industrial strength estimation system or maybe both. Cheers, Dave Cheers, Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Entire Organization Switching from SAS to R - Any experience?
It is true that R does not offer support for custom likelihood functions in nonlinear mixed models. However you can switch to R and use AD Model Builder's random effects module http://admb-project.org This is freely available software and it is more flexible than proc nlmixed. I'm sure there are people involved in the project who would be happy to help. Dave >We use SAS and R here (a biostat department and consulting unit), in >part because there are some things SAS does that R doesn't. In >particular, we use SAS proc nlmixed with custom likelihood functions. >R >has similar capability but does not allow custom likelihood; the >authors >say adding it would be non-trivial. -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nlme: problem with fitting logistic function
I think you can do this very efficiently with AD Model Builder's random effects module. The software is now freely available at http://admb-project.org If you want, you can contact me directly to discuss the model. Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generalized linear mixed models with a beta distribution
You can fit this kind of model (and negative binomial) and more difficult mixed models with AD Model Builder's random effects module which is now freely available at http://admb-project.org/ -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] AD Model Builder now freely available
Hi All, Following Mike Praeger's posting on this list, I'm happy to pass on that AD Model Builder is now freely available from the ADMB Foundation. http://admb-foundation.org/ Two areas where AD Model builder would be especially useful to R users are multi-parmater smooth optimization as in larg scale maximum likelihood estimation and the analysis of general nonlinear random effects models. Cheers, Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lmer function :method="AGQ" and glmmADMB
The freely available R package glmmADMB can do Adaptive Gaussian Quadrature for this type of model, since it is built using AD Model Builder's random effects module which incorporates this feature. There is now a beta version of the software for people using R on the Mac intel platform. http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html Cheers, Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lmer function :method="AGQ" glmmADMB
The freely available R package glmmADMB can do Adaptive Gaussian Quadrature for this type of model, since it is built using AD Model Builder's random effects module which incorporates this feature. There is now a beta version of the software for people using R on the Mac intel platform. http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html Cheers, Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] coxme frailty model standard errors?
While it may be true that for coxme models the "standard errors" are not very good approximations, it is always useful to have them to compare with other diagnostics such as likelihood ratios and profile likelihoods. It is interesting to hear that with the currently used methodology "Computation of the se of the random effect turns out to be very hard" because if you simply use AD Model Builders Random Effects module to formulate the model you will get the standard errors calculated for free without any more effort. Cheers, Dave -- David A. Fournier P.O. Box 2040, Sidney, B.C. V8l 3S3 Canada Phone/FAX 250-655-3364 http://otter-rsch.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.