I have been unable to determine whether the Hadoop matrix is real or not. From discussions, it definitely isn't sparse.
Sparsity is absolutely a must and not just for text. Really huge machine learning tends toward sparsity, regardless of area.

On 2/19/08 12:13 PM, "Jason Rennie" <[EMAIL PROTECTED]> wrote:

> On Mon, Feb 18, 2008 at 8:43 PM, Grant Ingersoll <[EMAIL PROTECTED]>
> wrote:
>
>> yeah, we have had a few discussions on this. There is some support
>> in Hadoop already for Matrix calculations via a donation, but I don't
>> know that anyone has dug in too deep with it yet. It may be the case
>> that we start with something, and then decide to go with something
>> else as we get more running time together on this stuff.
>
> Is the hadoop matrix lib sparse? I think I took a quick look and didn't
> find any indication of such. If a significant application area of mahout is
> text, sparsity is a must. Even non-text domains, such as collaborative
> filtering, often require sparse representation in order to scale to
> medium-sized data sets. But, yeah, understood that it's good to hit the
> ground running, see how far we can get and make changes as necessary/useful
> :)
>
>> Read only access is available via:
>> svn co http://svn.apache.org/repos/asf/lucene/mahout/trunk
>
> Thanks. I was trying to checkout one directory too high.
>
> Jason
