On Mon, 24 Feb 2014 11:49:26 -0500, Evan Ward wrote:
> I've looked into improving performance further, but it seems any
> further improvements will need big API changes for memory management.
>
> Currently, using Gauss-Newton with Cholesky (or LU) requires 4 matrix
> allocations on _each_ evaluation. The objective function initially
> allocates the Jacobian matrix. Then the weights are applied through
> matrix multiplication, allocating a new matrix. Computing the normal
> equations allocates a new matrix to hold the result, and finally the
> decomposition allocates its own matrix as a copy. With QR there are 3
> matrix allocations per model function evaluation, since there is no
> need to compute the normal equations, but the third allocation+copy
> is larger. Some empirical sampling data I've collected with the
> jvisualvm tool indicates that matrix allocation and copying takes 30%
> to 80% of the execution time, depending on the dimension of the
> Jacobian.
>
> One way to improve performance would be to provide pre-allocated
> space for the Jacobian and reuse it for each evaluation.
Do you have actual data to back this statement?
> The LeastSquaresProblem interface would then be:
>
> void evaluate(RealVector point, RealVector resultResiduals,
>               RealVector resultJacobian);
>
> I'm interested in hearing your ideas on other approaches to solve
> this issue. Or even if this is an issue worth solving.
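[For illustration, a minimal sketch of the caller-supplied-buffer pattern being proposed. It uses plain double[] buffers rather than Commons Math's RealVector/RealMatrix types, and the class name, problem, and data are hypothetical, not part of the actual LeastSquaresProblem API.]

```java
/**
 * Sketch of an in-place evaluate(point, resultResiduals, resultJacobian)
 * signature: the optimizer allocates the buffers once and the problem
 * writes into them, so nothing is allocated per evaluation.
 * Hypothetical model: residuals r_i = a_i * x0 + x1 - y_i.
 */
public class LinearProblem {
    private final double[] a = {1.0, 2.0, 3.0};
    private final double[] y = {2.0, 3.0, 4.0};

    /** Fills caller-supplied buffers instead of returning new matrices. */
    public void evaluate(double[] point, double[] resultResiduals,
                         double[][] resultJacobian) {
        for (int i = 0; i < a.length; i++) {
            resultResiduals[i] = a[i] * point[0] + point[1] - y[i];
            resultJacobian[i][0] = a[i]; // d r_i / d x0
            resultJacobian[i][1] = 1.0;  // d r_i / d x1
        }
    }
}
```

[The optimizer loop would then call evaluate() repeatedly with the same two buffers, removing the per-iteration Jacobian allocation described above.]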
Not before we can be sure that in-place modification (rather than
reallocation) always provides a performance benefit.
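[A crude harness along these lines could supply that data: time the same fill work with a fresh allocation per call versus one reused buffer. Names and sizes are arbitrary; a proper benchmark would also need warm-up and repeated runs to account for JIT effects.]

```java
/** Compares per-call allocation against reuse of one pre-allocated buffer. */
public class ReuseBench {
    /** Writes a deterministic pattern into m and returns its sum. */
    static double fill(double[][] m) {
        double s = 0;
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < m[i].length; j++) {
                m[i][j] = i + j;
                s += m[i][j];
            }
        return s;
    }

    /** Allocates a fresh n x n matrix on every call. */
    static double allocating(int n, int reps) {
        double s = 0;
        for (int r = 0; r < reps; r++) s += fill(new double[n][n]);
        return s;
    }

    /** Reuses one pre-allocated n x n matrix across all calls. */
    static double reusing(int n, int reps) {
        double[][] m = new double[n][n];
        double s = 0;
        for (int r = 0; r < reps; r++) s += fill(m);
        return s;
    }

    public static void main(String[] args) {
        int n = 200, reps = 1000;
        long t0 = System.nanoTime();
        double s1 = allocating(n, reps);
        long t1 = System.nanoTime();
        double s2 = reusing(n, reps);
        long t2 = System.nanoTime();
        System.out.printf("alloc: %d ms, reuse: %d ms (checksums %.0f / %.0f)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, s1, s2);
    }
}
```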
Best Regards,
Gilles
> Evan
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org