Hi.

> > [...]
> > 
> > Of course, _I_ just have to start reading about the subject in order to
> > understand; you are not expected to provide the background within the
> > Javadoc! :-)
> 
> If you want some background, read the paper referenced in the API. It is
> really a good paper.

I've already printed it! ;-)

> > [...]
> >>
> >> It was introduced in 3.1 and deprecated at the same time. It's not
> >> the converter per se which is considered wrong, it is the
> >> DifferentiableMultivariateVectorFunction interface which is deprecated,
> >> hence the converter using this interface is deprecated as a consequence.
> > 
> > I know.
> > 
> > However, you suggested above that we look at the converter code in order to
> > figure out how to switch to the new API. Of course, that's what I've done
> > all day, which made me wonder: If users need to do the same over and over
> > again, why not keep the code in CM indeed?
> 
> Fair enough.

Thus I'd propose to review the existing interface for "externally"
defined derivatives (gradient and Jacobian) and decide what to keep
and what to drop.
AFAICS, something similar to the "toMultivariateDifferentiableFunction"
and "toDifferentiableMultivariateVectorFunction" converters could be
used in the optimizers' base classes. I.e., we'd have two "optimize"
methods:
-----
public PointValuePair optimize(final int maxEval,
                               final MultivariateDifferentiableFunction f,
                               final GoalType goalType,
                               final OptimizationData... optData) {
    // Direct path: the function already implements the new interface.
    return optimizeInternal(maxEval, f, goalType, optData);
}

public PointValuePair optimize(final int maxEval,
                               final DifferentiableMultivariateFunction f,
                               final GoalType goalType,
                               final OptimizationData... optData) {
    // Legacy path: convert the old interface on the fly, then delegate.
    final MultivariateDifferentiableFunction converted
        = FunctionUtils.toMultivariateDifferentiableFunction(f);
    return optimizeInternal(maxEval, converted, goalType, optData);
}
-----

> > 
> >>
> >>>> 4. A "FiniteDifferencesDifferentiator" operator currently exists
> >>>> but only for univariate functions.
> >>>> Unless I'm mistaken, a direct extension to multiple variables won't
> >>>> do:
> >>>> * because the implementation uses the symmetric formula, but in
> >>>> some cases (bounded parameter range), it will fail, and
> >>>> * because, due to the floating point representation of real values,
> >>>> the delta for sampling the function should depend on the magnitude
> >>>> of the parameter value around which the sampling is done, whereas
> >>>> the "stepSize" is constant in the implementation.
> >>>>
> >>>> Questions:
> >>
> >>>> 1. Shouldn't we keep the converters so that users can keep their
> >>>> "home-made" (first-order) derivative computations? [Converters
> >>>> exist for gradient of "DifferentiableMultivariateFunction" and
> >>>> Jacobian of "DifferentiableMultivariateVectorFunction".]
> >>
> >> We could decide not to deprecate the older interface and hence not
> >> deprecate these converters. We could also set up a converter that would
> >> not use a DifferentiableMultivariateVectorFunction but would instead
> >> take two parameters: one MultivariateVectorFunction and one
> >> MultivariateMatrixFunction.
> >>
> >> I'm not sure about this. I would really like to get rid of the former
> >> interface, which is confusing now that we have introduced the new one.
> > 
> > IIUC, the "problem" or "barrier" is (as I hinted at in my previous post)
> > that the "DerivativeStructure" does not bring anything when we just fill
> > it with precomputed values. (Is that correct?)
> 
> Almost, but not exactly. If you only build the DerivativeStructure and do
> not use it in further computation, then you are right: it can be seen as
> a simple container, overly engineered. However, DerivativeStructure
> instances are very interesting when they are used as the *input* of
> further computation, because in these computations you get the
> composition of all derivatives without complex development. Going back
> to one of my examples in this thread, consider v = f(u) and u = g(x, y, z).
> If g is really, really complex, you can use the provided finite
> differences wrapper to get an approximate estimation of the value u and
> all partial derivatives du/dx, du/dy, d2u/dx2, d2u/dxdy, d2u/dy2 for
> example, all packed in one DerivativeStructure variable. If g is simple,
> you could compute the differentials by yourself and pack them into a
> DerivativeStructure, which here would be used only as output and would
> simply look like a container for all these values. Suppose now that f is
> something simpler, say f(u) = sqrt(1+exp(u^3-sin(u))). I could
> differentiate it manually, but would fear to be asked to develop by
> myself the code to compute df/dx, df/dy, d2f/dx2, d2f/dxdy, d2f/dy2!
> This is where DerivativeStructure helps. Here, I would simply use the
> methods provided by DerivativeStructure to do the computation, i.e. I
> would compute one sine, one cube, one subtraction ... and I would get
> all derivatives magically out (not for free, since as Konstantin pointed
> out, the computation is done underneath), but it certainly saves
> development time for the human who implements f.

It would be so nice to have this example thoroughly explained in the user
guide! And with working code. :-)
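Something like this, maybe? Just a sketch from my side: the
DerivativeStructure calls are the ones documented in
o.a.c.m.analysis.differentiation, but the "g" below is only a stand-in
for the "really complex" function:
-----
import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;

// Two free parameters x and y, derivation order 2.
final DerivativeStructure x = new DerivativeStructure(2, 2, 0, 1.5);
final DerivativeStructure y = new DerivativeStructure(2, 2, 1, 2.0);

// u = g(x, y); just a placeholder for the "really complex" g.
final DerivativeStructure u = x.multiply(y).add(x.sin());

// f(u) = sqrt(1 + exp(u^3 - sin(u))), written with DerivativeStructure methods.
final DerivativeStructure v = u.pow(3).subtract(u.sin()).exp().add(1).sqrt();

// All composed derivatives come out "magically".
final double value   = v.getValue();
final double dfdx    = v.getPartialDerivative(1, 0);
final double d2fdxdy = v.getPartialDerivative(1, 1);
-----
IIUC, the last three lines are where the "magic" pays off: no hand-written
chain rule anywhere.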

> 
> > It just makes the interface very unfamiliar (using a complex structure)
> > where "double[]" and "double[][]" were fairly straightforward
> > representations of "multivariate" values and the Jacobian, respectively.
> > [Maybe this point can be alleviated by an extensive example in the user
> > guide.]
> 
> In the other sub-thread, I will propose to remove this hurdle from the
> optimization framework completely.

I did not get what this means.

> > [...]
> >> derivatives. It is sufficient to compute the abscissae correctly by
> >> changing the single line:
> >>
> >>   final double xi = t0 + stepSize * (i - 0.5 * (nbPoints - 1));
> > 
> > What about:
> >   double xi = t0 * (1 + stepSize * (i - 0.5 * (nbPoints - 1)))
> > ?
> > [In order to automatically adapt to the accuracy loss due to floating
> > point. I.e. the "stepSize" would be relative.]
> 
> It would not work when t0 = 0, and would look strange when t0 varies
> widely in magnitude. In many cases in my work, for example, t0 is really
> a time offset in seconds from the beginning of a simulation, so I have
> loops where t0 starts at 0 and ends at 1000000. Despite the change in
> t0's order of magnitude, the dynamics do not change and the evolution
> rate near 0 is the same as it is near 1000000. In this case, having a
> relative step size would lead to a decreasing accuracy as time flows.

Zero is a particular case indeed. In my case, when the (absolute) step is
smaller than some value (variable- and problem-dependent), it is replaced by
a fixed value.

Also, in my case, the partial derivatives are not taken at widely
separated points of the same parameter, but for parameters that have
different "amplitudes" (e.g. eccentricity in [0, 1] and temperature in
[2000, 70000]). Or different units, as you pointed out previously.
Do you object to the usually recommended rule for the "delta" in the
first-order derivative formula, namely "x * sqrt(eps_machine)"?
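To be concrete, the hybrid rule I have in mind looks like this (a sketch
only; the helper name and the floor parameter are mine, not CM's):
-----
// Sketch (not CM code): relative step with an absolute floor.
// sqrt(eps_machine) is the usual rule of thumb for first-order
// finite differences.
static double stepSize(double x, double minStep) {
    final double sqrtEps = Math.sqrt(Math.ulp(1.0)); // ~1.5e-8 for double
    return Math.max(sqrtEps * Math.abs(x), minStep);
}
-----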

> [...]
> > -----
> > I don't understand the lines which I marked with an arrow.
> 
> When I presented this framework to a co-worker of mine, he also told me
> this was weird. It is.
> 
> > Why 3 variables, why that constructor (cf. remark above)? In fact, why is it
> > not:
> > -----
> >      new DerivativeStructure(2, 1, 0, x),
> >      new DerivativeStructure(2, 1, 1, y)
> 
> This test case was intended to stress the code and detect wrong
> computations, both in the variable indices and in the derivative values,
> taking into account the fact that the variables can themselves be
> functions of other variables.
> 
> So here, we have three unspecified variables (let's call them a, b and
> c), and we are in the middle of a computation with two functions x(a, b,
> c) and y(a, b, c), with the x and y values being specified by the loops
> and dx/da=1, dx/db=2, dx/dc=3, dy/da=4, dy/db=5, dy/dc=6.

Maybe this is where the application to optimization starts to become
incomprehensible (how do you _know_ these derivative values?).

> 
> The change you suggest above would simply have computed the partial
> derivatives with respect to x and y, whereas the test does compute the
> composed derivatives with respect to a, b and c.

Again, some form of this should absolutely go into the user guide, as it
would help us, little users, sort out what the intended application
realms of "DerivativeStructure" are (and deduce what is not applicable
in our particular case).
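For instance, here is a sketch of what I now understand the test to do,
assuming the DerivativeStructure constructor that packs the value
followed by the first-order derivatives (and that I read the packing
order right):
-----
// Three free parameters a, b, c; derivation order 1.
// Pack [value, dx/da, dx/db, dx/dc] for x, and likewise for y.
final DerivativeStructure dsX = new DerivativeStructure(3, 1, x, 1.0, 2.0, 3.0);
final DerivativeStructure dsY = new DerivativeStructure(3, 1, y, 4.0, 5.0, 6.0);

// The composition then carries dh/da, dh/db, dh/dc via the chain rule.
final DerivativeStructure h = DerivativeStructure.hypot(dsX, dsY);
final double dhda = h.getPartialDerivative(1, 0, 0);
-----
That would explain why the composed derivatives with respect to a, b and
c come out, and not those with respect to x and y.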

> 
> > -----
> > ? [And consequent changes in the calls to "getPartialDerivative(int...)".]
> > 
> > I didn't figure out why the test passes, or what the result of the
> > computation should be: "h" does not contain the first partial
> > derivatives of "hypot" at the given (x, y) ...
> > 
> > 
> > Thanks for bits of enlightenment,
> 
> Hope this helps,

We are on the way. ;-)
Gilles
