[
https://issues.apache.org/jira/browse/MATH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483231#comment-13483231
]
Luc Maisonobe commented on MATH-874:
------------------------------------
Well, in fact there is not really new CM code here, only a small amount of glue
code. The code that really changes is user code.
What changes is how users provide the Jacobian. With the former API, the user
had to provide two interlinked implementations: an implementation of the
DifferentiableMultivariateVectorFunction interface, which itself was a means to
retrieve an implementation of the MultivariateMatrixFunction interface. These
two implementations had to live in different classes, as they both define a
method named "value" taking a single double[] parameter, one returning a
double[] and the other returning a double[][]. A common way to do this was to
use a top level class for one interface and an inner class for the second
interface.
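As a reminder, the interfaces involved look essentially like this (javadoc and
package declarations omitted); the clash between the two "value" signatures is
what forces the two separate classes:
{code}
public interface MultivariateVectorFunction {
    double[] value(double[] point);
}

public interface MultivariateMatrixFunction {
    double[][] value(double[] point);
}

public interface DifferentiableMultivariateVectorFunction extends MultivariateVectorFunction {
    // the Jacobian has to be retrieved as a separate object
    MultivariateMatrixFunction jacobian();
}
{code}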
With the newer API, users provide a single class implementing two functions.
The first function is the same as in the former API and computes the value
only. The second function merges the value and the Jacobian and, in fact,
could also provide higher order derivatives, or derivatives with respect to
other variables if this function were composed after other functions.
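The newer interface essentially adds a DerivativeStructure-based method
alongside the double-based one; the two methods can coexist in a single class
because their parameter types differ:
{code}
public interface MultivariateDifferentiableVectorFunction extends MultivariateVectorFunction {
    // the value and its partial derivatives are computed together
    DerivativeStructure[] value(DerivativeStructure[] point);
}
{code}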
The optimizers handle both cases in the same way after initialization. With the
former API, the optimizer stores a reference to both user objects (the one
returning double[] and the one returning double[][]). With the newer API, the
optimizer stores a reference to the user object and a reference to a wrapper
around the user object that extracts the Jacobian from the second method. The
underlying optimization engine is exactly the same.
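For illustration only, such a wrapper boils down to evaluating the
DerivativeStructure-based method at a point where each variable is declared as
a free parameter of derivation order 1, then reading the first order partial
derivatives back out. Here is a minimal sketch of that idea; the class name
JacobianExtractor is made up and this is not the actual CM glue code:
{code}
import org.apache.commons.math3.analysis.MultivariateMatrixFunction;
import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;
import org.apache.commons.math3.analysis.differentiation.MultivariateDifferentiableVectorFunction;

public class JacobianExtractor implements MultivariateMatrixFunction {

    private final MultivariateDifferentiableVectorFunction f;

    public JacobianExtractor(MultivariateDifferentiableVectorFunction f) {
        this.f = f;
    }

    public double[][] value(double[] point) {
        // declare each component as an independent free parameter, derivation order 1
        final int n = point.length;
        DerivativeStructure[] dsPoint = new DerivativeStructure[n];
        for (int j = 0; j < n; ++j) {
            dsPoint[j] = new DerivativeStructure(n, 1, j, point[j]);
        }

        // a single call to the user function yields values and first derivatives together
        DerivativeStructure[] dsValue = f.value(dsPoint);

        // copy the first order partial derivatives into a plain matrix
        double[][] jacobian = new double[dsValue.length][n];
        int[] orders = new int[n];
        for (int i = 0; i < dsValue.length; ++i) {
            for (int j = 0; j < n; ++j) {
                orders[j] = 1;
                jacobian[i][j] = dsValue[i].getPartialDerivative(orders);
                orders[j] = 0;
            }
        }
        return jacobian;
    }
}
{code}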
What this means for users is the following:
* the part of user code dedicated to setting up and calling the optimizer is
not changed at all
* the part of user code dedicated to computing the function value is not
changed at all
* the part of user code dedicated to computing the function Jacobian is changed
For closed form functions, the change to Jacobian computation is in fact a
simplification. Users are not required to apply the chain rule by themselves;
they simply have to change double variables into DerivativeStructure variables
and change the +, -, * ... operators accordingly into calls to add, subtract,
multiply ...
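As a tiny illustration of this mechanical translation (the function below is
made up for the purpose, it does not come from the issue):
{code}
import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;

public class TranslationExample {

    // original closed form expression: 3*x^2 - 2*x + 5
    public static double f(double x) {
        return 3 * x * x - 2 * x + 5;
    }

    // same expression after the translation: operators become method calls
    public static DerivativeStructure f(DerivativeStructure x) {
        return x.multiply(x).multiply(3).subtract(x.multiply(2)).add(5);
    }
}
{code}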
Here is a fuller example, reworked from the unit tests:
{code:title=FormerAPI}
import org.apache.commons.math3.analysis.DifferentiableMultivariateVectorFunction;
import org.apache.commons.math3.analysis.MultivariateMatrixFunction;

public class Brown implements DifferentiableMultivariateVectorFunction {

    // problem dimensions (added here so the example is self-contained);
    // for Brown's almost-linear function the system is square, so m == n
    private final int n;
    private final int m;

    public Brown(int n) {
        this.n = n;
        this.m = n;
    }

    public double[] value(double[] variables) {
        double[] f = new double[m];
        double sum  = -(n + 1);
        double prod = 1;
        for (int j = 0; j < n; ++j) {
            sum  += variables[j];
            prod *= variables[j];
        }
        for (int i = 0; i < n; ++i) {
            f[i] = variables[i] + sum;
        }
        f[n - 1] = prod - 1;
        return f;
    }

    public MultivariateMatrixFunction jacobian() {
        return new Internal();
    }

    private class Internal implements MultivariateMatrixFunction {
        public double[][] value(double[] variables) {
            double[][] jacobian = new double[m][];
            for (int i = 0; i < m; ++i) {
                jacobian[i] = new double[n];
            }

            double prod = 1;
            for (int j = 0; j < n; ++j) {
                prod *= variables[j];
                for (int i = 0; i < n; ++i) {
                    jacobian[i][j] = 1;
                }
                jacobian[j][j] = 2;
            }

            for (int j = 0; j < n; ++j) {
                double temp = variables[j];
                if (temp == 0) {
                    // avoid dividing by zero: recompute the product of the other variables
                    temp = 1;
                    prod = 1;
                    for (int k = 0; k < n; ++k) {
                        if (k != j) {
                            prod *= variables[k];
                        }
                    }
                }
                jacobian[n - 1][j] = prod / temp;
            }

            return jacobian;
        }
    }
}
{code}
{code:title=NewerAPI}
import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;
import org.apache.commons.math3.analysis.differentiation.MultivariateDifferentiableVectorFunction;

public class Brown implements MultivariateDifferentiableVectorFunction {

    // problem dimensions (added here so the example is self-contained);
    // for Brown's almost-linear function the system is square, so m == n
    private final int n;
    private final int m;

    public Brown(int n) {
        this.n = n;
        this.m = n;
    }

    public double[] value(double[] variables) {
        double[] f = new double[m];
        double sum  = -(n + 1);
        double prod = 1;
        for (int j = 0; j < n; ++j) {
            sum  += variables[j];
            prod *= variables[j];
        }
        for (int i = 0; i < n; ++i) {
            f[i] = variables[i] + sum;
        }
        f[n - 1] = prod - 1;
        return f;
    }

    public DerivativeStructure[] value(DerivativeStructure[] variables) {
        DerivativeStructure[] f = new DerivativeStructure[m];
        DerivativeStructure sum  = variables[0].getField().getZero().subtract(n + 1);
        DerivativeStructure prod = variables[0].getField().getOne();
        for (int j = 0; j < n; ++j) {
            sum  = sum.add(variables[j]);
            prod = prod.multiply(variables[j]);
        }
        for (int i = 0; i < n; ++i) {
            f[i] = variables[i].add(sum);
        }
        f[n - 1] = prod.subtract(1);
        return f;
    }
}
{code}
You can note that with the newer API, creating the second method (with
DerivativeStructure) from the first method (with double) is straightforward. It
is mainly copy/paste, then changing the variable types and fixing all operator
calls (and this is what Commons Nabla attempts to do automatically at bytecode
level).
> New API for optimizers
> ----------------------
>
> Key: MATH-874
> URL: https://issues.apache.org/jira/browse/MATH-874
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 3.0
> Reporter: Gilles
> Assignee: Gilles
> Priority: Minor
> Labels: api-change
> Fix For: 3.1, 4.0
>
> Attachments: optimizers.patch
>
>
> I suggest changing the signatures of the "optimize" methods in
> * {{UnivariateOptimizer}}
> * {{MultivariateOptimizer}}
> * {{MultivariateDifferentiableOptimizer}}
> * {{MultivariateDifferentiableVectorOptimizer}}
> * {{BaseMultivariateSimpleBoundsOptimizer}}
> Currently, the arguments are
> * the allowed number of evaluations of the objective function
> * the objective function
> * the type of optimization (minimize or maximize)
> * the initial guess
> * optionally, the lower and upper bounds
> A marker interface:
> {code}
> public interface OptimizationData {}
> {code}
> would in effect be implemented by all input data so that the signature would
> become (for {{MultivariateOptimizer}}):
> {code}
> public PointValuePair optimize(MultivariateFunction f,
> OptimizationData... optData);
> {code}
> A [thread|http://markmail.org/message/fbmqrbf2t5pb5br5] was started on the
> "dev" ML.
> Initially, this proposal aimed at avoiding the need to call some
> optimizer-specific methods. An example is the "setSimplex" method in
> "o.a.c.m.optimization.direct.SimplexOptimizer": it must be called before the
> call to "optimize". Not only does this depart from the common API, but the
> definition of the simplex also fixes the dimension of the problem; hence it
> would be more natural to pass it together with the other parameters (i.e. in
> "optimize") that are also dimension-dependent (initial guess, bounds).
> Eventually, the API will be simpler: users will
> # construct an optimizer (passing dimension-independent parameters at
> construction),
> # call "optimize" (passing any dimension-dependent parameters).