On Tue, Jun 18, 2013 at 6:14 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> Hello,
>
> so I finally got around to actually doing it.
>
> I want to get Mahout sparse vectors and matrices (DRMs) and rebuild some
> solvers using Spark and Bagel/Scala.
>
> I also want to use in-core solvers that run directly on Mahout.
>
> Question #1: which Mahout artifacts are better to import if I don't want
> to pick up the Hadoop dependencies? Is there even such a separation of
> code? I know mahout-math seems to try to avoid being Hadoop-specific, but
> I'm not sure if that is followed strictly.
>

mahout-math should not depend on Hadoop APIs at all. If you build it and
Hadoop gets pulled in via Maven, then file a ticket; that's a bug.

> Question #2: which in-core solvers are available for Mahout matrices? I
> know there's SSVD, probably Cholesky, is there something else? In
> particular, I need to be solving linear systems; I guess Cholesky should be
> equipped enough to do just that?
>
> Question #3: why did we try to import Colt solvers rather than actually
> depend on Colt in the first place? Why did we not accept Colt's sparse
> matrices and create native ones instead?
>
> Colt seems to have a notion of sparse in-core matrices too and seems like a
> well-rounded solution. However, it doesn't seem to be actively
> supported, whereas I know Mahout has seen continued enhancements to the
> in-core matrix support.
>

Colt was totally abandoned, and I talked to the original author and he
blessed its adoption. When we pulled it in, we found it was woefully
undertested, and we tried our best to hook it in with proper tests and to
use APIs that fit the use cases we had. Plus, we already had the start of
some linear APIs (e.g. the Vector interface), and dropping our API
completely did not seem worth it at the time.

>
> Thanks in advance
> -Dmitriy
>

--
-jake
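
For reference on Question #2, here is a minimal sketch of how a Cholesky factorization is used to solve a linear system A x = b. It deliberately uses plain Java arrays rather than Mahout's Matrix and Vector types, so the class name `CholeskySolveSketch` and its `solve` method are hypothetical and say nothing about what Mahout's own API looks like. It also assumes A is symmetric positive definite; for other systems a different decomposition (QR, for example) would be needed.

```java
public final class CholeskySolveSketch {

    /**
     * Solves A x = b for a symmetric positive-definite A by factoring
     * A = L L^T and then doing two triangular solves.
     * Hypothetical helper on plain arrays, not a Mahout API.
     */
    public static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] l = new double[n][n];

        // Factorization: fill the lower triangle L row by row.
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = a[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException(
                            "matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }

        // Forward substitution: solve L y = b.
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = b[i];
            for (int k = 0; k < i; k++) {
                sum -= l[i][k] * y[k];
            }
            y[i] = sum / l[i][i];
        }

        // Back substitution: solve L^T x = y.
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = y[i];
            for (int k = i + 1; k < n; k++) {
                sum -= l[k][i] * x[k];
            }
            x[i] = sum / l[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] a = {{4.0, 2.0}, {2.0, 3.0}};
        double[] b = {8.0, 8.0};
        // The exact solution of this small system is x = (1, 2).
        System.out.println(java.util.Arrays.toString(solve(a, b)));
    }
}
```

The factorization costs O(n^3) once, after which each additional right-hand side needs only the two O(n^2) triangular solves, which is why Cholesky is a reasonable fit when the same matrix is solved against many vectors.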