> Assuming Peter is using standard terminology, automatic
> differentiation does *not* mean numerical differentiation with finite
> differences.  Automatic differentiation means calculating the exact
> derivative analytically, just using a computer program rather than
> doing it by hand: an AD tool differentiates your source code
> symbolically.
Yes, that is precisely what I do.
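
To make the distinction concrete, here is a minimal sketch (a toy example, not my actual code): a "dual number" carries a value together with its exact derivative through every arithmetic operation, so no finite-difference step h is ever involved. CppAD follows the same operator-overloading idea, with a tape and reverse sweeps on top (see below).

    // Toy illustration of AD by operator overloading: the derivative is
    // propagated exactly by the chain rule, not approximated.
    #include <cmath>
    #include <cstdio>

    struct Dual {
        double val;  // f(x)
        double dot;  // f'(x), propagated by the chain rule
    };

    Dual operator*(Dual a, Dual b) { return { a.val * b.val, a.dot * b.val + a.val * b.dot }; }
    Dual operator+(Dual a, Dual b) { return { a.val + b.val, a.dot + b.dot }; }
    Dual sin(Dual a)               { return { std::sin(a.val), std::cos(a.val) * a.dot }; }

    int main() {
        Dual x = { 2.0, 1.0 };        // seed dx/dx = 1
        Dual y = x * x + sin(x);      // y.dot is the exact derivative 2x + cos(x)
        std::printf("f(2) = %g, f'(2) = %g\n", y.val, y.dot);
    }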

> There is a theorem that you can always compute the gradient of f(x)
> (any function from R^n to R) with about as many additional operations
> as are required to compute f(x) **once**. The technique for doing so
> is called an "adjoint method", or "reverse accumulation".
I do use a library which allows for reverse accumulation (namely CppAD),
and I think it is quite efficient: it usually requires only a constant
factor of additional computation time compared to a single function
evaluation, and this factor is roughly 10 in my case. The overhead comes
from the fact that all arithmetic operations first have to be recorded to
a 'tape', Taylor coefficients are computed and propagated, and then an
additional reverse sweep is performed. By design, the function has to be
evaluated before any differentiation can be done.
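
The pattern looks roughly like this with CppAD (toy objective, not my actual function):

    // Record / zero-order forward / first-order reverse with CppAD.
    #include <cppad/cppad.hpp>
    #include <vector>
    #include <cstdio>

    int main() {
        using CppAD::AD;
        size_t n = 3;

        // 1. Record all arithmetic operations on the tape.
        std::vector< AD<double> > ax(n), ay(1);
        for (size_t i = 0; i < n; ++i) ax[i] = 1.0;
        CppAD::Independent(ax);
        ay[0] = ax[0] * ax[1] + sin(ax[2]);
        CppAD::ADFun<double> f(ax, ay);   // stops recording

        // 2. Zero-order forward sweep: evaluate f at the point of interest.
        std::vector<double> x(n), w(1, 1.0);
        x[0] = 1.0; x[1] = 2.0; x[2] = 0.5;
        std::vector<double> fx = f.Forward(0, x);

        // 3. A single reverse sweep yields the full gradient, whatever n is.
        std::vector<double> grad = f.Reverse(1, w);
        std::printf("f = %g, df/dx0 = %g\n", fx[0], grad[0]);
        return 0;
    }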

Furthermore, CppAD works by overloading functions and operators for a different data type, not by code generation. This is essential, since my functions call external templated libraries that rely on (large amounts of) conditional expressions. Thus, the function I differentiate actually varies with the parameters, and differentiating once and for all, which is usually the goal of AD, does not work here.
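
To illustrate the consequence, here is a sketch of the re-taping pattern with a hypothetical objective containing a value-dependent branch (the names and the branch are purely illustrative, not from my code):

    // Because the branch taken depends on x, a tape recorded at one point
    // is not valid at another, so the recording step is repeated for every
    // gradient evaluation.
    #include <cppad/cppad.hpp>
    #include <vector>

    std::vector<double> gradient_at(const std::vector<double>& x) {
        using CppAD::AD;
        size_t n = x.size();

        std::vector< AD<double> > ax(n), ay(1);
        for (size_t i = 0; i < n; ++i) ax[i] = x[i];
        CppAD::Independent(ax);                  // start a fresh tape at this x

        // A plain 'if': the branch chosen here is the only one the tape sees.
        if (x[0] > 0.0)
            ay[0] = ax[0] * ax[0] + ax[1];
        else
            ay[0] = -ax[0] + ax[1] * ax[1];

        CppAD::ADFun<double> f(ax, ay);          // stop recording

        std::vector<double> w(1, 1.0);
        f.Forward(0, x);
        return f.Reverse(1, w);                  // gradient valid for this tape only
    }

(CppAD's CondExp operations can put simple conditionals on the tape itself, but that is not an option when the branching happens inside an external library.)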

So, while a factor of 10 is great compared to the factor of 50 or even 100 (for 50 variables) you would get with finite differences, it still means that most of the time is spent computing the gradient.
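
For comparison, this is where the factor of about 50 or 100 comes from (illustrative helper, not code I actually use): a one-sided difference needs one extra evaluation of f per variable, i.e. n + 1 evaluations in total (about 51 for n = 50), and central differences need 2n.

    #include <functional>
    #include <vector>

    // One-sided finite-difference gradient: 1 + n evaluations of f.
    std::vector<double> fd_gradient(const std::function<double(const std::vector<double>&)>& f,
                                    std::vector<double> x, double h = 1e-7) {
        const double f0 = f(x);                  // 1 evaluation
        std::vector<double> g(x.size());
        for (size_t i = 0; i < x.size(); ++i) {  // + n evaluations
            const double xi = x[i];
            x[i] = xi + h;
            g[i] = (f(x) - f0) / h;
            x[i] = xi;
        }
        return g;
    }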

> Why are you using automatic
> differentiation instead of using a gradient-free algorithm? Are the
> gradient-free algorithms in NLopt really so poor for your
> application that automatic differentiation + a with-gradient
> algorithm is superior?
Actually, for most mathematical applications, there is a huge
performance difference between gradient-free algorithms and gradient-based
methods. If it is possible to obtain a gradient by any means (sometimes even finite differences), the gradient-based methods should easily beat the others. This is particularly true if you have more than, say, 5 variables, even with the overhead for differentiation. For 50 variables in a non-linear problem, gradient-free methods may well never converge.
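
Once the gradient is available, plugging it into a gradient-based NLopt algorithm is straightforward. A minimal sketch with a hand-written gradient and a toy quadratic objective (in my setup the gradient would of course come from CppAD, and the problem is not this simple):

    #include <nlopt.hpp>
    #include <vector>
    #include <cstdio>

    // NLopt calls this with a non-empty grad for gradient-based ("LD_*") algorithms.
    double objective(const std::vector<double>& x, std::vector<double>& grad, void*) {
        if (!grad.empty()) {
            grad[0] = 2.0 * (x[0] - 1.0);
            grad[1] = 2.0 * (x[1] + 0.5);
        }
        return (x[0] - 1.0) * (x[0] - 1.0) + (x[1] + 0.5) * (x[1] + 0.5);
    }

    int main() {
        nlopt::opt opt(nlopt::LD_LBFGS, 2);      // gradient-based quasi-Newton
        opt.set_min_objective(objective, nullptr);
        opt.set_xtol_rel(1e-8);

        std::vector<double> x = { 5.0, 5.0 };
        double minf;
        opt.optimize(x, minf);
        std::printf("minimum %g at (%g, %g)\n", minf, x[0], x[1]);
        return 0;
    }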

> In contrast, AD programs that use "forward mode" or "forward
> accumulation" differentiation are poor for differentiating
> multi-variate functions (R^n to R), since they have cost that grows
> proportional to n. (Forward accumulation is good for differentiating
> functions from R to R^m.)
Exactly ;-).
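
For completeness, the cost difference is easy to see on an existing CppAD tape f for a function from R^n to R (toy setup as in the sketches above): forward mode needs one first-order sweep per coordinate direction, reverse mode a single sweep for the whole gradient.

    #include <cppad/cppad.hpp>
    #include <vector>

    // Gradient via forward mode: n first-order sweeps, cost grows with n.
    std::vector<double> grad_forward(CppAD::ADFun<double>& f, const std::vector<double>& x) {
        size_t n = x.size();
        f.Forward(0, x);
        std::vector<double> g(n), dx(n, 0.0);
        for (size_t i = 0; i < n; ++i) {
            dx[i] = 1.0;
            g[i] = f.Forward(1, dx)[0];          // directional derivative along e_i
            dx[i] = 0.0;
        }
        return g;
    }

    // Gradient via reverse mode: one sweep, independent of n.
    std::vector<double> grad_reverse(CppAD::ADFun<double>& f, const std::vector<double>& x) {
        f.Forward(0, x);
        std::vector<double> w(1, 1.0);
        return f.Reverse(1, w);
    }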


Best regards,
Peter

_______________________________________________
NLopt-discuss mailing list
[email protected]
http://ab-initio.mit.edu/cgi-bin/mailman/listinfo/nlopt-discuss
