Hi all, Since 2.x series, we have been struggling in several areas with respect to algorithms API. The latest change was about optimizer, but it is only one example among others (solvers, integration, ODE and maybe some parts of statistics may be concerned by the proposal below).
The various things we want to keep and which are not always compatible with each others are : 1) simple use 2) immutability 3) good OO design 4) compatible with reference algorithms implementations 5) maintainable 6) extensible 7) backward compatibility 8) probably many other characteristics ... 3) and 4) often don't work together. 1) 6) and 7) are difficult to handle at once. If we look at optimizers, some progress have been with optimizers with respect to extensibility and backward compatibility, but simple use was clearly left behind as it is difficult to know which optimizer support which feature as neither strong typing nor fixed arguments are used anymore. However, keeping the older API would have prevented extensibility as the combinatorial explosion of arguments increases as features are added (and we still need to add several constraints types). If we look at ODE solvers, we are still using the original API from mantissa, but when we add a new feature, we add more and more setters, thus going farther and farther from immutability, and imposing some unwritten scheduling between calls (for example when we set up additional equations, we must also set up the initial additional state, and the user must set up a way to retrieve the final additional state). If we look at solvers, we started with some parameters set up during the call to solve while other were set up at construction time, but this repartition has changed along time. So I would like to suggest a new approach, which has been largely inspired by a recent discussion on the [CSV] component about the builder API (see <http://markmail.org/thread/o3s2a5hyj6xh4nzj>), by an older discussion on [math] about using fluen API for vectors (see <http://markmail.org/message/2gmg6wnpm5p2splb>), and by a talk Simone gave last year at ApacheCon Europe. The idea is to use fluent API to build progressively the algorithm adding features one at a time using withXxx methods defined in interfaces. As an example, consider just a few features used in optimization: constraints, iteration limit, evaluations limits, search interval, bracketing steps ... Some features are used in several optimizers, some are specific to univariate solvers, some can be used in a family of solvers ... Trying to fit everything in a single class hierarchy is impossible. We tried, but I don't think we succeeded. If we consider separately each features, we could have interfaces defined for each one as follows: interface Constrainable<T extends Constrainable<T>> extends Optimizer { /** Returns a new optimizer, handling an additional constraint. * @param c the constraint to add * @return a new optimizer handling the constraint * (note that the instance itself is not changed */ T withConstraint(Constraint c); } Basically they would be used where OptimizationData is used today. An optimizer that supports simple bounds constraints and max iterations would be defined as : public class TheOptimizer implements Optimizer, Constrainable<TheOptimizer>, IterationLimited<TheOptimizer> { private final int maxIter; private final List<Constraint> constraints; // internal constructor used for fluent API private TheOptimizer(..., int maxIter, List<Constraint> list) { ... this.maxIter = m; this.constraints = l; } public TheOptimizer withConstraint(Constraint c) { List<Constraint> l = new ArrayList<Constraint>(constraints); l.add(c); return new TheOptimizer(..., maxIter, l); } public TheOptimizer withMaxIter(int maxIter m) { return new TheOptimizer(..., m, constraints); } } So basically, the withXxx are sort-of setters, but they do preserve immutability (we do not return this, we return a new object). It is easy to add features to existing classes and there is no need to shove everythin within a single hierarchy, we have a forest, not a tree. When looking at the API, users clearly see what the can use and what they cannot use: if an optimizer does not support constraint, there will be no way to put a constraint into it. If in a later version constraints become available, the existing functions will not be changed, only new functions will appear. Of course, this creates a bunch of intermediate objects, but they are often quite small and the setting part is not the most computation-intensive one. It becomes also possible to do some parametric studies on some features, using code like: Algorithm core = new Algorithm().withA(a).withB(b).withC(c); for (double d = 0.0; d < 1.0; d += 0.001) { Algorithm dSpecial = core.withD(d); double result = dSpecial.run(); System.out.println(" d = " + d + ", result = " + result); } This would work for someone considering feature A is a core feature that should be fixed but feature D is a parameter, but this would equally well work for someone considering the opposite case: they will simply write the loop the other way, the call to withD being outside of the loop and the call to withA being insided the loop. A side effect is also that it becomes possible to copy safely algorithms by just resetting a feature, even when we don't really know what implementation we have. A typical example I have that creates problems to me is duplicating an ODE solver. It cannot be done currently, as some specific elements are required at construction time that depend on the exact type of solver you use (tolerance vectors for adaptive stepsize integrators). So if for example I want to do some Monte-Carlo analysis in parallel and need to duplicate an integrator, I would do it as follows: void FirstOrderIntegrator[] duplicate(FirstOrderIntegrator integrator, int n) { FirstOrderIntegrator copies = new FirstOrderIntegrator[n]; for (int i = 0; i < n; ++i) { copies[i] = integrator.withMaxEvaluations(integrator.getMaxEvaluations()); } return copies; } This kind of API could be extended to several algorithms, so it may be set up in a consistend way accross the library. As I wrote at the beginning of this message, I first think about root solvers, optimizers and ODE. What do you think? Luc --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org