Thank you for the helpful answers! I summarise them below and add my experiences:
The numerical differentiation
-----------------------------
> Check out ?fdHess and run the example!
This was the solution. help(fdHess, package="nlme")
> help(numericDeriv,package="nls")
This is a good solution, too. For my case, the above was more practical.
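For readers who have not used fdHess, here is a minimal sketch of what it returns. The objective f and the point p0 are made up for illustration (they are not my model); I picked a quadratic because its derivatives are known exactly and can be compared with the finite-difference estimates.

```r
library(nlme)  # fdHess() is provided by the nlme package

# Hypothetical objective with known derivatives (not the model from this thread)
f <- function(p) (p[1] - 1)^2 + 3 * (p[2] + 2)^2 + p[1] * p[2]

p0 <- c(2, 1)          # point at which the derivatives are estimated
fd <- fdHess(p0, f)

fd$mean      # function value at p0
fd$gradient  # finite-difference gradient; the analytic gradient at p0 is c(3, 20)
fd$Hessian   # finite-difference Hessian; the analytic Hessian is matrix(c(2, 1, 1, 6), 2)
```

Note that the start values here are non-zero; as far as I can tell, the default step size is relative to the parameter values, so zeros may need the minAbsPar argument.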
Running optim(..., method="SANN")
---------------------------------
> You don't need to do any numerical differentiation in "optim", by
> default it will automatically compute the derivatives via numerical
> differentiation.
I was experimenting with optim a lot, and I found that "SANN" does not calculate derivatives.
> For the other four methods 'optim' will do
> numerical differentiation for you if a gradient is not provided.
This agrees with my observations.
> 'optim' does not require any differentiation of the objective function
> for the "SANN" method.
True. However, providing a 'parscale' based on the derivatives for "SANN" vastly accelerated its convergence. See below.
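One way to check this yourself is to look at the 'counts' component that optim returns. The sketch below uses the Rosenbrock function as a stand-in objective (an assumption for illustration, not my model) and compares "BFGS", called without a gradient function, against "SANN".

```r
# Hypothetical smooth test function (Rosenbrock), not the model from this thread
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2

set.seed(1)  # SANN is stochastic, so fix the seed for repeatability

bfgs <- optim(c(-1.2, 1), fr, method = "BFGS")  # no gr supplied
sann <- optim(c(-1.2, 1), fr, method = "SANN")

bfgs$counts  # the "gradient" entry counts the finite-difference gradient evaluations
sann$counts  # the "gradient" entry is NA: SANN never requests a gradient
```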
Role of 'parscale' optim(..., control=list(parscale=g, ...))
------------------------------------------------------------
For the function I wanted to optimise, this was the solution:
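library(nlme)
# numerical gradient (and Hessian) of the objective at the start values
fd <- fdHess(start.values, modell.2)
# reciprocals of the gradient components serve as parameter scales
g <- 1/fd$gradient
out <- optim(start.values, modell.2, method="SANN", hessian=TRUE,
             control=list(trace=2, parscale=g))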
> the 'parscale' argument has nothing to do with
> differentiation. As far as I know, it is used to scale the values of
> the parameters before choosing candidates (so that they are roughly
> comparable).
Differentiation was useful to examine the scales.
> The help says:
> `parscale' A vector of scaling values for the parameters.
> Optimization is performed on `par/parscale' and these should
> be comparable in the sense that a unit change in any element
> produces about a unit change in the scaled value.
So, yes, 'parscale' is used to scale the parameters before choosing candidates. But choosing candidates seems to be critical: setting parscale to the reciprocals of the gradient values calculated at a good guess of the optimal parameters accelerated the convergence immensely.
In my case parscale values were very diverse, ranging from 1e-07 to 1e+05. Without letting the optimisation procedure know these differences in the scales, it generated poor candidates.
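To make the effect easy to try out, here is a self-contained sketch on a toy problem. Everything in it is an assumption for illustration: the objective f, its optimum, and the start values are invented so that the two parameters differ in scale by many orders of magnitude. I also take abs() of the reciprocal gradient so that the scales are positive, which is an extra step not in my original call. How much the scaled run helps will depend on the function.

```r
library(nlme)  # for fdHess()

# Hypothetical objective whose two parameters live on very different scales
# (minimum at p = c(1e-6, 2e4)); not the model from this thread
f <- function(p) ((p[1] - 1e-6) / 1e-7)^2 + ((p[2] - 2e4) / 1e5)^2

start <- c(5e-7, 1e4)  # assumed "good guess"; non-zero so finite differences work

# Reciprocals of the numerical gradient at the start values
g <- 1 / fdHess(start, f)$gradient
g <- abs(g)  # assumption added for this sketch: keep the scales positive

set.seed(42)  # SANN is stochastic
plain  <- optim(start, f, method = "SANN")
scaled <- optim(start, f, method = "SANN", control = list(parscale = g))

g                                               # the scales span many orders of magnitude
rbind(plain = plain$par, scaled = scaled$par)   # compare the parameter estimates
c(plain = plain$value, scaled = scaled$value)   # compare the objective values reached
```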
Thank you once more, and I hope you found my experiences useful.
Gábor
--
Gabor BORGULYA MD MSc
Semmelweis University of Budapest, 2nd Dept of Paediatrics
Hungarian Paediatric Cancer Registry
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
