On Wed, Sep 30, 2009 at 8:26 PM, Ted Dunning <[email protected]> wrote:
> No motion.  I was pushing that integration because it looked like MTJ was
> integrating with them.  That would give some pretty high performance linear
> algebra to commons-math.

MTJ is LGPL, how was that ever going anywhere?

> Luc has been doing some very nice work on small matrix decompositions
> lately.  As you say, however, the class structure is kinda over-done.

Yeah, I ended up using (in decomposer) commons-math-2.0's small matrix
eigen decomposition for the final step in Lanczos, and as a check on the
Hebbian techniques (to verify accuracy when the dimension is low enough to
do both approaches and compare).

> The other issue is that we need vectors to be Writables which is not
> something they are reasonably going to do.

So why do we really need vectors to be Writable?  I see the appeal, it's
nice and makes the code nicely integrated, but the way I ended up going, so
that you could use decomposer either with or without Hadoop was to use a
decorator - just have VectorWritable be an implementation of Vector which
encapsulates the Writable methods, and delegates to a Hadoop-agnostic
Vector member instance.  This way all the algorithms which use the Vectors
don't need to care about Hadoop unless they really do.

> My question is whether we could get math's decompositions by implementing
> their RealVector interface, or by extending one of their vector implements
> as a Writable.  Only the first option seems to have a chance to be easy
> (guessing).

Implementing RealVector is uglyugly.  Extending to implement Writable can be
practically done by my IDE itself, it looks like.  But do we want to do
either of these?  They actually don't even have any equivalent of
OrderedIntDoublePair for fast iteration and slow random access (which is
the only sparse implementation I've found I need - I rarely have much use
for random access in a sparse vector).

Is there anything else other than small-scale linear algebra that we could
use from commons-math?
If that's it, then it's probably not worth it - we can steal an
implementation of whatever we need for auxiliary work with the Big Data
matrices if we need to, right?  I hear they're apache licensed over there. ;p

  -jake

> On Wed, Sep 30, 2009 at 6:54 PM, Jake Mannix <[email protected]> wrote:
>
> > So what's the status on integration of commons-math-2.0 in Mahout?
> >
> > Do we need that stuff?  Some of their apis are pretty ugly (look at the
> > number of methods you need to implement to qualify to be a "RealVector"),
> > but piggybacking on some of their functionality would be pretty useful
> > (especially stats/regression/distributions as well as the small matrix
> > decomposition stuff).
> >
> >   -jake
> >
> > On Fri, Aug 7, 2009 at 10:00 PM, Ted Dunning <[email protected]> wrote:
> >
> > > This is the key step that was pre-requisite to integration of MTJ into
> > > commons math, and thereby making really good linear algebra available
> > > for us in Mahout.
> > >
> > > ---------- Forwarded message ----------
> > > From: Phil Steitz <[email protected]>
> > > Date: Fri, Aug 7, 2009 at 5:08 PM
> > > Subject: [ANNOUNCEMENT] Apache Commons Math 2.0 Released
> > > To: [email protected], [email protected], Commons
> > > Developers List <[email protected]>, Commons Users List
> > > <[email protected]>
> > > Cc: [email protected]
> > >
> > > The Apache Commons team is pleased to announce the release of version
> > > 2.0 of Commons Math.  Commons Math is a library of lightweight,
> > > self-contained mathematics and statistics components addressing the
> > > most common problems not available in the Java programming language or
> > > Commons Lang.
> > >
> > > Version 2.0 is a major release, including bug fixes, new features and
> > > enhancements to existing features.
> > > Most notable among the new features are matrix decomposition
> > > algorithms, sparse matrices and vectors, genetic algorithms, new
> > > optimization algorithms, curve fitting algorithms, state derivatives
> > > in ODE step handlers, new multistep integrators, multiple regression,
> > > correlation, rank transformations and Mersenne twister pseudo random
> > > number generator.
> > >
> > > This release is NOT source and binary compatible with earlier versions
> > > of Commons Math.  Starting with version 2.0 of the library, the minimal
> > > version of the Java platform required to compile and use commons-math
> > > is Java 5.
> > >
> > > Source and binary distributions are available for download from the
> > > Apache Commons Math download site:
> > > http://commons.apache.org/downloads/download_math.cgi
> > >
> > > Please verify signatures using the KEYS file available at the above
> > > location when downloading the release.
> > >
> > > Maven users please note that the maven repository groupId for Commons
> > > Math has changed in version 2.0 to "org.apache.commons."  The artifactId
> > > remains "commons-math."
> > >
> > > For more information on Apache Commons Math, visit the Math home page:
> > > http://commons.apache.org/math/
> > >
> > > Feedback, suggestions for improvement or bug reports are welcome via the
> > > "Mailing Lists" and "Issue Tracking" links here:
> > > http://commons.apache.org/math/project-info.html
> > >
> > > Phil Steitz
> > > - On behalf of the Apache Commons community
> > >
> > > --
> > > Ted Dunning, CTO
> > > DeepDyve
>
> --
> Ted Dunning, CTO
> DeepDyve
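
For reference, the decorator arrangement described in the reply (a
VectorWritable that implements Vector, carries the Writable methods, and
delegates to a Hadoop-agnostic Vector member) might look roughly like the
Java sketch below.  This is only an illustration under stated assumptions,
not decomposer's or Mahout's actual code: the Vector interface, DenseVector,
the roundTrip helper and the wire format (count followed by doubles) are all
hypothetical, and Writable is a local stand-in that mirrors the two methods
of org.apache.hadoop.io.Writable so the example compiles without Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VectorWritableSketch {

    /** Local stand-in with the same two methods as org.apache.hadoop.io.Writable. */
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical Hadoop-agnostic vector API; algorithms would depend only on this. */
    interface Vector {
        int size();
        double get(int index);
        void set(int index, double value);
        double dot(Vector other);
    }

    /** Plain in-memory vector that knows nothing about serialization. */
    static class DenseVector implements Vector {
        private final double[] values;

        DenseVector(int size) { this.values = new double[size]; }

        static DenseVector of(double... values) {
            DenseVector v = new DenseVector(values.length);
            System.arraycopy(values, 0, v.values, 0, values.length);
            return v;
        }

        public int size() { return values.length; }
        public double get(int index) { return values[index]; }
        public void set(int index, double value) { values[index] = value; }

        public double dot(Vector other) {
            if (other.size() != size()) {
                throw new IllegalArgumentException("size mismatch");
            }
            double sum = 0.0;
            for (int i = 0; i < values.length; i++) {
                sum += values[i] * other.get(i);
            }
            return sum;
        }
    }

    /** Decorator: a Vector that adds the Writable methods and delegates the math. */
    static class VectorWritable implements Vector, Writable {
        private Vector delegate;

        // No-arg constructor, since Writables are usually created empty and then filled.
        VectorWritable() { this(new DenseVector(0)); }
        VectorWritable(Vector delegate) { this.delegate = delegate; }

        public int size() { return delegate.size(); }
        public double get(int index) { return delegate.get(index); }
        public void set(int index, double value) { delegate.set(index, value); }
        public double dot(Vector other) { return delegate.dot(other); }

        // Assumed wire format: element count, then each value as a double.
        public void write(DataOutput out) throws IOException {
            out.writeInt(delegate.size());
            for (int i = 0; i < delegate.size(); i++) {
                out.writeDouble(delegate.get(i));
            }
        }

        public void readFields(DataInput in) throws IOException {
            int n = in.readInt();
            Vector fresh = new DenseVector(n);
            for (int i = 0; i < n; i++) {
                fresh.set(i, in.readDouble());
            }
            delegate = fresh;
        }
    }

    /** Writes the vector to a byte array and reads it back into a new instance. */
    static VectorWritable roundTrip(VectorWritable source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes));
            VectorWritable copy = new VectorWritable();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Vector v = DenseVector.of(1.0, 2.0, 3.0);
        VectorWritable copy = roundTrip(new VectorWritable(v));
        System.out.println(copy.dot(v));   // 1*1 + 2*2 + 3*3
    }
}
```

With this shape, code written against Vector (dot products, decompositions)
never touches the serialization methods; only the job plumbing needs to know
that the object it holds is also a Writable.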
