Ted Dunning wrote:
I would love more coordination.
Fine, thanks for your interest.
I tried to contribute some time ago to commons math, but was pretty soundly
dismissed. My interest in helping did not survive the contact.
I'm sorry about this. I apologize on behalf of the commons guys and hope
you will reconsider the situation.
The major issue in integrating the two is the fairly limited Vector/Matrix
class structure in commons math. In mahout, we need to be able to have
special implementations so that we can drive parallel implementations
easily.
I am ready to help adapt linear algebra in [math] (this is how we refer
to the commons sub-projects). We released version 1.2 a few weeks ago
with almost no change in the linear algebra parts except for QR
decomposition. This release had to be compatible with version 1.1, so
no further changes could be included at that point.
The next version will probably be 2.0 since we already need to introduce
some incompatibility to implement some requests for changes. This would
be an opportunity to introduce changes you may need. Working together on
this would be nice, clearly separating what should belong to [math] and
what should belong to Mahout.
Looking at the recent messages, it seems some implementations currently
being developed are based on the very basic methods taught in school,
which are really useful only for teaching concepts and should be avoided
in real-life applications. For example, as you pointed out yourself one
or two days ago, computing the inverse of a matrix is really the wrong
thing to do. [math] can help here. We do have proper implementations of
several decomposition algorithms: LU, QR, Cholesky (hidden in the new
correlated random vector generation and also in an experimental part),
SVD being on its way with already two implementation proposals in JIRA.
We also do have a generic least squares solving framework with both
simple Gauss-Newton and more robust Levenberg-Marquardt. The
optimization package introduced with 1.2 will be reworked and extended.
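To illustrate the point about avoiding the explicit inverse, here is a minimal, self-contained sketch of solving A x = b through an LU decomposition with partial pivoting. It is not the [math] implementation: the class name LuSolveSketch and its methods are hypothetical, plain arrays are used instead of any library type, and the singularity check is deliberately crude.

```java
/** Hypothetical illustration only; not part of commons-math or Mahout. */
final class LuSolveSketch {
    private final double[][] lu; // L (below the diagonal, unit diagonal implied) and U packed together
    private final int[] perm;    // perm[i] = index of the original row now stored at position i

    /** Factors a square matrix as P A = L U using partial pivoting; the input is not modified. */
    LuSolveSketch(double[][] a) {
        int n = a.length;
        lu = new double[n][];
        perm = new int[n];
        for (int i = 0; i < n; i++) {
            lu[i] = a[i].clone();
            perm[i] = i;
        }
        for (int col = 0; col < n; col++) {
            // Partial pivoting: bring the largest remaining entry of this column to the diagonal.
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(lu[row][col]) > Math.abs(lu[pivot][col])) {
                    pivot = row;
                }
            }
            // Crude check: only an exactly zero pivot is reported as singular.
            if (lu[pivot][col] == 0.0) {
                throw new ArithmeticException("matrix is singular");
            }
            double[] tmpRow = lu[col];
            lu[col] = lu[pivot];
            lu[pivot] = tmpRow;
            int tmpIdx = perm[col];
            perm[col] = perm[pivot];
            perm[pivot] = tmpIdx;

            // Eliminate below the pivot, storing each multiplier in place of the zero it creates.
            for (int row = col + 1; row < n; row++) {
                double factor = lu[row][col] / lu[col][col];
                lu[row][col] = factor;
                for (int k = col + 1; k < n; k++) {
                    lu[row][k] -= factor * lu[col][k];
                }
            }
        }
    }

    /** Solves A x = b by reusing the stored factors; no inverse is ever formed. */
    double[] solve(double[] b) {
        int n = lu.length;
        double[] x = new double[n];
        // Forward substitution: L y = P b.
        for (int row = 0; row < n; row++) {
            double sum = b[perm[row]];
            for (int k = 0; k < row; k++) {
                sum -= lu[row][k] * x[k];
            }
            x[row] = sum;
        }
        // Back substitution: U x = y.
        for (int row = n - 1; row >= 0; row--) {
            double sum = x[row];
            for (int k = row + 1; k < n; k++) {
                sum -= lu[row][k] * x[k];
            }
            x[row] = sum / lu[row][row];
        }
        return x;
    }
}
```

The factorization costs O(n^3) once, and each additional right-hand side then costs only O(n^2), which is the usual reason to prefer a decomposition over both Cramer's rule and an explicit inverse.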
The benefits to Mahout, however, would be significant. Some of the
distribution implementations would be very helpful, especially if they were
rounded out with the concept of sampling in addition to computing densities and
cumulants. Having some basic linear algebra such as svd, eigenvectors and
cholesky decompositions would also help significantly (commons math only
provides LUD, if I remember, but together, the world is ours). For
comparison, Mahout so far only has a linear solver based on Cramer's method
which is a bit naïve.
I also think we can do great things together!
Mahout could add [math] to its list of dependencies, probably by having
developers check out the development version and update it when
needed, and add their requests using JIRA. I can handle these
requests on the commons side. I can also both give advice and submit
patches on the Mahout side. Does this seem sensible to everybody?
Could you explain which things you need most urgently, especially
from the linear algebra package? The top element is the
RealMatrix interface, and currently we have only the dense
implementation of this interface. We certainly want to implement other
ones (sparse matrices, but also lower and upper triangular,
tridiagonal, symmetric ...). Since we can afford API changes in 2.0,
we can change the interface, change the implementation and add new
implementations.
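As a rough sketch of what "one interface, several specialized implementations" could look like, the following uses hypothetical names (MatrixSketch, DenseSketch, TridiagonalSketch) that are assumptions for illustration and are not the RealMatrix API. The interface supplies a generic matrix-vector product, and a structured implementation overrides it with a cheaper one.

```java
/** Hypothetical minimal matrix abstraction; not the commons-math RealMatrix interface. */
interface MatrixSketch {
    int rows();
    int cols();
    double get(int row, int col);

    /** Generic O(rows * cols) product; structured implementations may override it. */
    default double[] operate(double[] v) {
        double[] out = new double[rows()];
        for (int i = 0; i < rows(); i++) {
            double sum = 0.0;
            for (int j = 0; j < cols(); j++) {
                sum += get(i, j) * v[j];
            }
            out[i] = sum;
        }
        return out;
    }
}

/** Dense storage: every entry is kept explicitly. */
final class DenseSketch implements MatrixSketch {
    private final double[][] data;

    DenseSketch(double[][] data) { this.data = data; }

    public int rows() { return data.length; }
    public int cols() { return data[0].length; }
    public double get(int row, int col) { return data[row][col]; }
}

/** Tridiagonal storage: only three diagonals are kept, so the product is O(n). */
final class TridiagonalSketch implements MatrixSketch {
    private final double[] lower; // entries (i + 1, i), length n - 1
    private final double[] diag;  // entries (i, i), length n
    private final double[] upper; // entries (i, i + 1), length n - 1

    TridiagonalSketch(double[] lower, double[] diag, double[] upper) {
        this.lower = lower;
        this.diag = diag;
        this.upper = upper;
    }

    public int rows() { return diag.length; }
    public int cols() { return diag.length; }

    public double get(int row, int col) {
        if (row == col) return diag[row];
        if (col == row + 1) return upper[row];
        if (col == row - 1) return lower[col];
        return 0.0; // everything off the three diagonals is an implicit zero
    }

    @Override
    public double[] operate(double[] v) {
        int n = diag.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = diag[i] * v[i];
            if (i > 0) sum += lower[i - 1] * v[i - 1];
            if (i < n - 1) sum += upper[i] * v[i + 1];
            out[i] = sum;
        }
        return out;
    }
}
```

Callers written against the interface need not know which storage is in use, which is the property that would let a project such as Mahout plug in its own distributed or sparse implementations.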
Luc
On 4/10/08 4:54 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
Hello,
I am currently at ApacheCon Europe, where Niall Pemberton pointed me to
Isabel about Mahout. We talked briefly and I had a glance at your project,
which I didn't know before (sorry to have missed that).
I am one of the commons-math committers and also the developer of the Mantissa
library (I think Grant Ingersoll knows this library, I recall his name as one
of the brave souls who noticed and solved some bugs in it). I am not a
mathematician per se, but may help on some math-related subjects.
Some parts of commons-math may help you, and obviously some of your needs are
not addressed at all. It would be great if we could work together and improve
both projects by putting each algorithm where it fits best.
Does this proposal sound reasonable to you?
Luc