Regarding Newton-Raphson, I would add that there are certain kinds of function for which the method *cannot* work.
N-R is essentially a root-finding technique, and when used for finding a minimum (or maximum) it is applied to finding the root of the derivative. As an example of a class of functions for which N-R will fail to find a root, consider

    f(x) = -(-x)^a   for x < 0
    f(x) =  x^a      for x > 0
    f(0) =  0

where 0 < a <= 1/2. With x[1] as starting point, if a = 1/2 then the N-R iteration will alternate between +x[1] and -x[1]. If a < 1/2, then x[n+1] = -x[n]*(1-a)/a, so the iterates take values of alternating sign at increasing distances (by the factor (1-a)/a > 1) from the root x = 0. Hence in all these cases N-R will not converge; only when a > 1/2 will it work. And the above is only one special class of functions for which, for similar reasons, N-R will not converge.

The corresponding class of functions for which N-R would not find a minimum are those whose derivative is proportional to the f(x) above, e.g.

    F(x) = (-x)^(a+1)   for x < 0
    F(x) =  x^(a+1)     for x > 0
    F(0) =  0

(Short R sketches of this failing iteration, and of the zeroinfl() and optim() points in the quoted thread, are given after the quotes below.)

Ted.

On 22-Jun-10 20:52:26, Achim Zeileis wrote:
> John,
>
> thanks for the comments, very useful.
>
> Just three short additions specific to the ZINB case:
> 1. zeroinfl() uses BFGS (by default), not Newton-Raphson, because we
>    have analytical gradients but not an analytical Hessian.
> 2. If the default starting values do not work, zeroinfl() offers EM
>    estimation of the starting values (which is typically followed by
>    a single iteration of BFGS only). EM is usually much more robust
>    but slower, hence it's not the default in zeroinfl().
> 3. I pointed the original poster to this information but he still
>    insisted on Newton-Raphson for no obvious reason. As I didn't want
>    to WTFM again on the list, I stopped using up bandwidth.
>
> thx,
> Z
>
> On Tue, 22 Jun 2010, Prof. John C Nash wrote:
>
>> I have not included the previous postings because they came out very
>> strangely on my mail reader. However, the question concerned the
>> choice of minimizer for the zeroinfl() function, which apparently
>> allows any of the current 6 methods of optim() for this purpose. The
>> original poster wanted to use Newton-Raphson.
>>
>> Newton-Raphson (or just Newton for simplicity) is commonly thought to
>> be the "best" way to approach optimization problems. I've had several
>> people ask me why the optimx() package (see the OptimizeR project on
>> R-Forge -- probably soon on CRAN, we're just tidying up) does not
>> have such a choice. Since the question comes up fairly frequently,
>> here is a response. I caution that it is based on my experience and
>> others may get different mileage.
>>
>> My reasons for being cautious about Newton are as follows:
>> 1) Newton's method needs a number of safeguards to avoid singular or
>>    indefinite Hessian issues. These can be tricky to implement well,
>>    and to do so in a way that does not hinder the progress of the
>>    optimization.
>> 2) One needs both gradient and Hessian information, and it needs to
>>    be accurate. Numerical approximations are slow and often
>>    inadequate for tough problems.
>> 3) Far from a solution, Newton is often not very good, likely because
>>    the function is not like a nice quadratic over the whole space.
>>
>> Newton does VERY well at converging when it has a "close enough"
>> start. If you can find an operationally useful way to generate such
>> starts, you deserve awards like the Fields.
>>
>> We have in our optimx work (Ravi Varadhan and I) developed a
>> prototype safeguarded Newton.
>> As yet we have not included it in optimx(), but probably will do so
>> in a later version after we figure out what advice to give on where
>> it is appropriate to apply it.
>>
>> In the meantime, I would suggest that BFGS or L-BFGS-B are the
>> closest options in optim() and generally perform quite well. There
>> are updates to BFGS and CG on CRAN in the form of Rvmmin and Rcgmin,
>> which are all-R implementations with box constraints too. UCMINF is a
>> very similar implementation of the unconstrained algorithm that seems
>> to have the details done rather well -- though BFGS in optim() is
>> based on my work, I actually find UCMINF often does better. There's
>> also nlm and nlminb.
>>
>> Via optimx() one can call these, and also some other minimizers, or
>> even "all.methods", though that is meant for learning about methods
>> rather than solving individual problems.
>>
>> JN
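First, a minimal R sketch of the failing iteration described at the top of this message (my own illustration, not from the original postings; newton_demo is just an ad-hoc name). It applies the raw Newton-Raphson update to f(x) = sign(x)*|x|^a: with a = 1/2 the iterates merely flip sign, and with a < 1/2 they move away from the root x = 0 by the factor (1-a)/a at every step.

## Raw Newton-Raphson root finding on f(x) = sign(x) * |x|^a, 0 < a <= 1/2.
## Since f'(x) = a * |x|^(a - 1) for x != 0, the update reduces to
## x[n+1] = -x[n] * (1 - a) / a.
newton_demo <- function(a, x1 = 1, n = 6) {
  f  <- function(x) sign(x) * abs(x)^a
  fp <- function(x) a * abs(x)^(a - 1)
  x <- numeric(n)
  x[1] <- x1
  for (i in seq_len(n - 1)) x[i + 1] <- x[i] - f(x[i]) / fp(x[i])
  x
}

newton_demo(a = 0.50)  # oscillates:  1 -1  1  -1  1   -1
newton_demo(a = 0.25)  # diverges:    1 -3  9 -27  81 -243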

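Second, on the zeroinfl() points (Achim's items 1 and 2 in the quoted thread), the following sketch shows what I believe the pscl calls look like. The simulated data, formula and parameter values are invented purely for illustration, and zeroinfl.control(EM = TRUE) is my understanding of how the EM-estimated starting values are requested.

## Hedged sketch of zeroinfl() with the default BFGS optimizer versus
## EM-estimated starting values. Data, formula and parameter values are
## made up for illustration only.
library(pscl)

set.seed(1)
dat <- data.frame(treat = rbinom(200, 1, 0.5))
dat$counts <- ifelse(rbinom(200, 1, 0.3) == 1, 0,
                     rnbinom(200, mu = exp(0.5 + 0.7 * dat$treat), size = 1.2))

## Default: BFGS on the full likelihood (analytical gradients, no Hessian).
fit_bfgs <- zeroinfl(counts ~ treat | treat, data = dat, dist = "negbin")

## If the default starting values fail: EM-estimated starting values first
## (more robust but slower), then a BFGS clean-up step.
fit_em <- zeroinfl(counts ~ treat | treat, data = dat, dist = "negbin",
                   control = zeroinfl.control(EM = TRUE))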

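Finally, a rough sketch of the optim()/optimx() route John suggests. The Rosenbrock test function is only a stand-in objective, and the all.methods control entry is my reading of the optimx() interface on R-Forge, so take the commented-out call as an assumption rather than a recipe.

## BFGS in optim() with an analytic gradient, using the Rosenbrock test
## function as a stand-in objective.
rosen <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2
rosen_gr <- function(p) c(-2 * (1 - p[1]) - 400 * p[1] * (p[2] - p[1]^2),
                          200 * (p[2] - p[1]^2))

res <- optim(c(-1.2, 1), rosen, rosen_gr, method = "BFGS")
res$par  # should be close to c(1, 1)

## With optimx (OptimizeR project on R-Forge) one can compare many methods
## at once; 'all.methods' is assumed to be the control switch John mentions,
## and is meant for learning about methods, not for production runs.
## library(optimx)
## optimx(c(-1.2, 1), rosen, rosen_gr, control = list(all.methods = TRUE))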