[...]
This does make it important to decide on a well-written and complete
API before releasing it.

When the scope of the software is well circumscribed, that would be
possible. With the whole of [Math]ematics, much less so. :-}
And the state of the art in Java is a moving target, aimed at by
changing CM contributors with differing needs and tastes; this adds
to the unstable mix.
That's a good point. I still prefer the interface design (though I
may be in the minority) for two reasons. First, if a concrete class
only publicly exposes the methods defined in an interface, it
encourages polymorphism: user code that uses one implementation can
easily be switched to another, and new implementations are less
constrained. Second, it encourages composition over inheritance. I
agree with Josh Bloch that composition produces more maintainable
code. Adding new methods to an existing interface/class breaks
composition.
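For illustration, a minimal sketch of the composition point (the
Optimizer interface and the wrapper class here are hypothetical, not
existing CM types):

    public interface Optimizer {
        double[] optimize(double[] start);
    }

    // Forwarding wrapper in the Effective Java style: it depends only
    // on the interface, so any implementation can be plugged in, and
    // user code written against Optimizer never sees implementation
    // extras.
    public final class LoggingOptimizer implements Optimizer {
        private final Optimizer delegate;
        public LoggingOptimizer(Optimizer delegate) {
            this.delegate = delegate;
        }
        @Override
        public double[] optimize(double[] start) {
            System.out.println("optimize called");
            return delegate.optimize(start);
        }
    }

Note that if a method is later added to the Optimizer interface,
LoggingOptimizer stops compiling until it forwards the new method;
that is the "adding new methods breaks composition" point above.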
The "problem" is to get the interface right. As it happens, at some
point
we discover something that was not foreseen; and to correct/improve the
design, compatibility must be broken.
[But refactoring is not a failure of development; it's part of it.]
I think the interface/abstract class discussion is partially
separable from the immutable/mutable discussion. I see the algorithm
as the part that could really benefit from the polymorphism. Perhaps
separating the problem definition (data) from the algorithm will
improve the flexibility of the API. For example,
    PointVectorValuePair solveMyNLLSProblem(NLLSOptimizer opt) {
        // Define the problem to solve in an independent object.
        NLLSProblem p = new NLLSProblem(/* model functions, weights,
                                           convergence checker, ... */);
        // Provide the algorithm with the data it needs;
        // the algorithm has no problem-specific state.
        return opt.optimize(p);
    }
I may be missing something, but how much better is it to store
everything the optimizer needs in yet another class?
[That's a possible approach, but it's not what we started from in
Commons Math; and when trying to fix some inconsistency or remove
duplicated code, I tried to retain what could be kept of the existing
design.]
[...] Thread safety is a tricky beast. I think we agree that the only
way to guarantee thread safety is to depend only on final concrete
classes that are thread-safe themselves.
I don't think so. When objects are immutable, thread-safety follows.
It is somewhat off-topic, but a counterexample would be Vector3D.
Since the class is not final, a user could extend it, override all
the methods, and add some set{X,Y,Z} methods to make it mutable. Even
though Vector3D is immutable, there is no _guarantee_ that every
instance of Vector3D is immutable unless the class is final. This is
why String is final.
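To make that concrete, a sketch of such a subclass (the subclass is
hypothetical; Vector3D is assumed to be the commons-math3 class):

    import org.apache.commons.math3.geometry.euclidean.threed.Vector3D;

    // Hypothetical subclass: Vector3D is not final, so nothing
    // prevents a caller-visible mutable "Vector3D".
    public class MutableVector3D extends Vector3D {
        private double x;
        public MutableVector3D(double x, double y, double z) {
            super(x, y, z);
            this.x = x;
        }
        public void setX(double x) { this.x = x; } // mutator added by subclass
        @Override
        public double getX() { return this.x; }    // overridden accessor
        // ... same game for getY()/getZ() ...
    }

Any code that accepts a Vector3D and calls getX() can now observe
state changes between calls.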
I think I don't get your point: if someone extends a class that is
safe in a way that makes the extension unsafe, that's his problem. ;-)
[...] copying any large matrices or arrays is prohibitively
expensive. For the NLLS package we would be copying a pointer to a
function that can generate a large matrix. I think adding some
documentation that functions should be thread-safe if you want to use
them from multiple threads would be sufficient.
If you pass a "pointer" (i.e. a "reference" in Java), all bets are
off: the class is not inherently thread-safe. That's why I suggested
mandating a _deep_ "copy" method (with a stringent contract that
should allow a caller to be sure that all objects owned by an
instance are disconnected from any other objects).
As someone who has designed a thread-safe application based on deep
copying, I don't think this is the route to follow. A deep copy means
you have to be able to copy an arbitrary (possibly cyclic) reference
graph. Without the graph copy there are many subtle bugs (references
to the same object become references to different objects); with the
graph copy, the implementation is very complex. This is the reason
Serialization has a separate "patch up" step after object creation,
which leads to some nasty tricks/bugs. Similarly, Cloneable only
produces a shallow copy. Opinions may vary, but in my experience
immutability is an easier approach to thread safety, especially when
you have to depend on user code.
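A minimal sketch of the aliasing bug mentioned above (the classes are
hypothetical):

    class Checker {
        Checker copy() { return new Checker(); }
    }

    class Problem {
        Checker a = new Checker();
        Checker b = a; // intentional alias: both fields refer to one checker

        Problem deepCopy() {
            Problem q = new Problem();
            q.a = a.copy();
            q.b = b.copy(); // bug: q.a != q.b, the alias is silently lost
            return q;
        }
    }

Preserving the alias requires a per-copy identity map over the whole
object graph, which is essentially what Serialization does and why it
needs the "patch up" step.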
I agree that using immutability is easier, but my point all along is
that it is at odds with simplicity (which is what the "fluent API"
aims at). And since
1. the internals of the optimizers are not thread-safe yet (see e.g.
   LevenbergMarquardtOptimizer), and
2. it is unclear why an optimizer object should be shared by several
   threads,
I think it is not worth the additional code just to add the "final"
keyword.
Another, more directly useful (but also unrelated) feature would be
to allow concurrent evaluations of the objective function.
But this would require users to ensure that the _input_ to the
optimizer is thread-safe.
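A sketch of the kind of thing meant (assuming commons-math3's
MultivariateFunction; how it would hook into the optimizer is left
out). Concurrent calls to value() are only safe if the user's
implementation is thread-safe:

    import java.util.Arrays;
    import org.apache.commons.math3.analysis.MultivariateFunction;

    // Evaluate the objective at several points concurrently;
    // f.value() may be called from multiple threads at once.
    static double[] evaluateAll(MultivariateFunction f, double[][] points) {
        return Arrays.stream(points)
                     .parallel()
                     .mapToDouble(f::value)
                     .toArray();
    }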
Either way there can be some tricky bugs for the user. In the
immutable approach the bugs are solved by finding non-final fields
and references to objects that are not thread-safe. In the deep-copy
approach it can be hard to track down which reference isn't being
copied correctly. My experience is that when I used deep copy I found
many bugs in production. Now, with immutability, it is easier for me
to reason about thread safety and I find few.
You are certainly right, but here I was assuming that we would not
have to copy "complicated" objects.
In some way this adds to my argument: if we cannot ensure that all
objects are thread-safe (either from the way they are stored in the
optimizer or because the input is out of our control), why not just
let it go altogether?
Multi-threaded applications should not be utterly annoyed by having
to instantiate several optimizer objects...
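For instance, one instance per thread is a one-liner (assuming the
commons-math3 LevenbergMarquardtOptimizer and Java 8):

    import org.apache.commons.math3.optim.nonlinear.vector.jacobian.LevenbergMarquardtOptimizer;

    // Each thread lazily gets its own optimizer; nothing is shared,
    // so the optimizer itself does not need to be thread-safe.
    static final ThreadLocal<LevenbergMarquardtOptimizer> OPTIMIZER =
        ThreadLocal.withInitial(LevenbergMarquardtOptimizer::new);

Each worker then calls OPTIMIZER.get().optimize(...) with its own
problem data.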
[...]
Best,
Gilles