On Sat, Dec 29, 2012 at 10:22:20AM +0100, Dimitri Pourbaix wrote:
> Gilles,
> 
> >Handling weighted observations must take correlations into account, i.e. use
> >a _matrix_.
> >There is the _practical_ problem of memory. Solving it correctly is by
> >using a sparse implementation (and this is actually an implementation
> >_detail_).
> 
> The problem is where something becomes a detail!  You are right that the
> general least square problem copes with a matrix of weights ... but the
> way it is implemented is a detail.

That's what I said above, although I suspect that we don't mean the same
thing. OO programming allows one to define types that represent the
"real" concepts: in this case, if the problem is expressed in terms of a
(mathematical) matrix, the algorithm should use a "Matrix" (type).
This is not an implementation detail; the goal is for the code to be as
close as possible to the mathematical description of the procedure
(self-documenting code).

The implementation detail is how the matrix type stores its data internally;
and this can be the subject of any necessary efficiency improvements,
independently of the matrix concept used at a higher level (e.g. in the
optimization algorithms).
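
To make the distinction concrete, here is a minimal sketch. All names
("MatrixConcept", "DenseStorage", "DiagonalStorage", "Residuals") are
hypothetical and are not the CM API; the point is only that the
algorithm-level code is written against the matrix concept, while the
storage can be swapped underneath it.

```java
// Hypothetical names throughout; this is NOT the Commons Math API.
interface MatrixConcept {
    int dimension();
    double get(int row, int col);

    // Matrix-vector product, written once against the concept only.
    default double[] operate(double[] v) {
        int n = dimension();
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double s = 0;
            for (int j = 0; j < n; j++) {
                s += get(i, j) * v[j];
            }
            out[i] = s;
        }
        return out;
    }
}

// Dense storage: n * n doubles.
final class DenseStorage implements MatrixConcept {
    private final double[][] data;
    DenseStorage(double[][] data) { this.data = data; }
    public int dimension() { return data.length; }
    public double get(int row, int col) { return data[row][col]; }
}

// Diagonal storage: n doubles, with a cheaper product.
final class DiagonalStorage implements MatrixConcept {
    private final double[] diag;
    DiagonalStorage(double[] diag) { this.diag = diag; }
    public int dimension() { return diag.length; }
    public double get(int row, int col) { return row == col ? diag[row] : 0.0; }
    @Override
    public double[] operate(double[] v) {
        double[] out = new double[diag.length];
        for (int i = 0; i < diag.length; i++) {
            out[i] = diag[i] * v[i];
        }
        return out;
    }
}

final class Residuals {
    // "Algorithm" code: computes r^T W r and sees only the matrix concept.
    static double weightedSumOfSquares(MatrixConcept w, double[] r) {
        double[] wr = w.operate(r);
        double s = 0;
        for (int i = 0; i < r.length; i++) {
            s += r[i] * wr[i];
        }
        return s;
    }
}
```

Switching from "DenseStorage" to "DiagonalStorage" changes memory use and
speed, but "weightedSumOfSquares" does not change.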

> As already pointed out, even the
> vector of weights API allows for a complicated matrix of weights.  The user
> premultiplies by the 'square root' of that matrix and sets all the
> components of the weight vector to 1.  So, your enthusiasm to generalise
> the vector of weights to a matrix was a detail to make the life of very
> few users easier ... without adding any functionality.

This is a backward description of my change.
In reality:
1. The handling of weights was there.
2. Assuming that people wanted to keep it, I added the functionality to
   handle correlated observations.

If indeed the weight feature is independent of the optimization procedure,
then _all_ references to weights should be banned.
[If only because keeping an array of "ones" and doing loops that "multiply
by one" are obviously not going to improve clarity or performance.]
In the end, this seems to be the accepted compromise (IIUC).
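
For reference, the "square root" premultiplication mentioned in the quoted
text can be sketched as follows. This assumes a symmetric positive definite
weight matrix W and uses its Cholesky factor L (W = L L^T) as the "square
root"; the class and method names are hypothetical, not CM code. The
weighted cost r^T W r of the residuals r is then equal to the plain sum of
squares of L^T r, so a procedure that knows nothing about weights can be
used on the transformed residuals.

```java
// Hypothetical helper, NOT part of Commons Math.
final class Whitening {
    // Lower-triangular Cholesky factor L with W = L * L^T.
    // Assumes W is symmetric positive definite.
    static double[][] cholesky(double[][] w) {
        int n = w.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double s = w[i][j];
                for (int k = 0; k < j; k++) {
                    s -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (s <= 0) {
                        throw new IllegalArgumentException("not positive definite");
                    }
                    l[i][i] = Math.sqrt(s);
                } else {
                    l[i][j] = s / l[j][j];
                }
            }
        }
        return l;
    }

    // Returns L^T * r, i.e. the residuals premultiplied by the "square root".
    static double[] premultiply(double[][] l, double[] r) {
        int n = r.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            // (L^T)[i][k] = L[k][i], which is zero for k < i.
            for (int k = i; k < n; k++) {
                out[i] += l[k][i] * r[k];
            }
        }
        return out;
    }

    // Weighted cost r^T W r.
    static double weightedCost(double[][] w, double[] r) {
        double s = 0;
        for (int i = 0; i < r.length; i++) {
            for (int j = 0; j < r.length; j++) {
                s += r[i] * w[i][j] * r[j];
            }
        }
        return s;
    }

    // Cost with all weights equal to one: plain sum of squares.
    static double unitWeightCost(double[] r) {
        double s = 0;
        for (double v : r) {
            s += v * v;
        }
        return s;
    }
}
```

With this, "weightedCost(w, r)" and
"unitWeightCost(premultiply(cholesky(w), r))" agree up to rounding.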

> There are so many different configurations (e.g. block diagonal, ...), I
> doubt you can handle all of them in the most efficient way

Actually, my "Weight" class trivially handles _any_ "RealMatrix" (thanks to
inheritance!).

> so it is likely
> preferable to have the user taking care of them.

This is exactly what "Weight" does.
The problem is that CM does not provide efficient implementations for
matrix forms suited for this context (symmetric, sparse, diagonal).[1]

Above and in the previous post, I agreed that this would not be a problem if
we entirely drop the support for weights in the optimizers.

> It is however true that simple weights (i.e. vector form) are a very usual
> situation ... which is also very common in fitting tools.  So, I think CM
> should offer that approach as well.

Where? In the fitting tools or in the optimizers?
We just said that weights could be handled independently from the
optimization procedure. But we could indeed put weights back where they are
most useful (e.g. in the curve fitting) without dragging them everywhere
(where most of the time they'd be equal to one...).
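
A minimal sketch of that layering, with hypothetical names (this is not
the CM fitting API): the stand-in "optimizer" below is an unweighted
least-squares fit of y = a * x and has no notion of weights; the fitting
layer scales each observation by the square root of its weight and then
delegates.

```java
// Hypothetical sketch, NOT the Commons Math fitting API.
final class WeightedFitSketch {
    // Stand-in "optimizer": unweighted least-squares slope for y = a * x.
    // It has no notion of weights.
    static double fitSlope(double[] x, double[] y) {
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += x[i] * y[i];
            sxx += x[i] * x[i];
        }
        return sxy / sxx;
    }

    // Fitting layer: applies sqrt(weight) to each observation, then
    // hands the scaled data to the weight-free procedure above.
    static double fitSlopeWeighted(double[] x, double[] y, double[] weights) {
        double[] sx = new double[x.length];
        double[] sy = new double[y.length];
        for (int i = 0; i < x.length; i++) {
            double s = Math.sqrt(weights[i]);
            sx[i] = s * x[i];
            sy[i] = s * y[i];
        }
        return fitSlope(sx, sy);
    }
}
```

Callers that do not need weights call "fitSlope" directly and never pay
for an array of ones.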

> In conclusion: the old CM 3.0 API was enough! :)

If that's so, then people can just copy/paste the source code of that
version and not care about subsequent versions of CM.


Cordially,
Gilles

[1] Actually, the problem is that some people complain that we don't do
    enough for their taste: in the past, at least 3 people raised issues
    with matrix implementations, but without providing any useful help,
    unfortunately (to be clear, I'm not talking of current contributors!).
