Hi,
2012/12/29 Gilles Sadowski <gil...@harfang.homelinux.org> > Hi. > > > [...] > > > > third is just bug-fix. Does the fix for the issue that started this > > > thread change the API? If so, we would need to cut 3.2 in any case. > > The current fix does change the usage (one needs to call optimize with an > argument of a different type). IIUC, it also silently removes the handling > of uncorrelated observations. > > > Yes, this fixes the issue. I have created/resolved the issue (MATH-924) > > and committed the fix as of r1426616. > > > > Could someone please review what I have done? > > I don't like the fix... > > Handling weighted observations must take correlations into account, i.e. > use > a _matrix_. > There is the _practical_ problem of memory. Solving it correctly is by > using > a sparse implementation (and this is actually an implementation _detail_). > If we _need_ such an implementation to solve the practical problem, I > strongly suggest that we focus on creating it (or fixing what CM already > has, or accept that some inconsistency will be present), rather than > reducing the code applicability (i.e. allowing only uncorrelated data). > If the observations are not correlated, the matrix is a diagonal matrix, > not > a vector. > CM also lacks implementations for symmetric, triangular, and diagonal > matrix, > which all would go some way to solving several practical problems of the > same > kind as the current one without sacrificing generality. > > Now, and several years ago, it was noticed that CM does not _have_ to > provide the "weights" feature because users can handle this before they > call CM code. [IIRC, no CM unit test actually use weights different from > 1.] > IMO, the other valid option is thus to have a simpler version of the > algorithm, but still a correct one. > This would also have the advantage that we won't have the urgent need to > keep the sparse matrix implementation. > [Then, if at some point we include helper code to handle weights > (_including_ correlations), we should also do it in a "preprocessing" step, > without touching the optimization algorithms.] > > > I also think (but did not test it), that there may be a problem with > > missing OptimizationData. If someone call the optimizer but forget to > > set the weight (or the target, or some other mandatory parameters), then > > I'm not sure we fail with an appropriate error. Looking for example at > > the private checkParameters method in the MultivariateVectorOptimizer > > abstract class, I guess we may have a NullPointer Exception if either > > Target or Weight/NonCorrelatedWeight has not been parsed in the > > parseOptimizationData method. Could someone confirm this? > > Yes. > And this (not checking for missing data) _could_ be considered a feature, > as > I stressed several times on this ML, and in the code documentation > (eliciting zero constructive comment). > We also _agreed_ that users not passing needed data will result in NPE. > [I imagined that applications would check that valid and complete input is > passed to the lower level "optimize(OptimizationData ...)" methods.] > > It is however straightforward to add a "checkParameters()" method that > would > raise more specific exceptions. [Although that would contradict the > conclusion of the previous discussion about NPE in CM. And restart it, > without even getting a chance to go forward with what had been decided!] > > > Hence, to summarize: > * The fix, in a 3.2 release, should be to replace the matrix > implementation > with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] > or > "DiagonalMatrix" (see my patch for MATH-924), but not change the API. > * We must decide wether to deprecate the weights feature in that release > (and thus remove it in 4.0) or to keep it in its general form (and thus > un-deprecate "OpenMapRealMatrix"[2]). > > > Best regards, > Gilles > > [1] The inconsistencies that led to the deprecation will have no bearing > on usage within the optimizers. > [2] The latter option seems likely to entail a fair amount of work to > improve the performance of "OpenMapRealMatrix" (which is quite poor > for some operations, e.g. "multiply"). > > I have never been happy with deprecating the sparse implementations of vectors and matrices. I also think we should "undeprecate" them and work on them, with some help! Regarding [1], my choice would be to point at the problematic cases in the Javadoc (and possibly the user's guide), and live with them. Regarding [2], would there be any performance gain if we designed an immutable sparse vector/matrix. I guess it would. The question is: do we need these objects to be mutable? Sébastien > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >