I have been doing factorizations (SVD, not NMF) in R on sparse matrices of
about this size.  Stochastic decomposition algorithms are incredibly fast on
data of this size and in many cases don't even need a block decomposition.  I
am computing relatively few singular values (about 30), and most of the
SVDs take only a fraction of a second.
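For what it's worth, here is roughly the kind of stochastic (randomized)
decomposition I mean, sketched in R with the Matrix package.  This is a
minimal sketch, not the code from my actual runs: the matrix A, the rank
k = 30, and the oversampling p are placeholders.

  # Minimal randomized (stochastic) truncated SVD sketch.
  # Assumes A is a sparse Matrix::dgCMatrix, k is the number of singular
  # values wanted, p is a small oversampling parameter.
  library(Matrix)

  rand_svd <- function(A, k = 30, p = 10) {
    n <- ncol(A)
    Omega <- matrix(rnorm(n * (k + p)), nrow = n)  # Gaussian test matrix
    Y <- as.matrix(A %*% Omega)      # sketch of the range of A: m x (k+p), thin and dense
    Q <- qr.Q(qr(Y))                 # orthonormal basis for that sketch
    B <- as.matrix(crossprod(Q, A))  # project A into the small subspace: (k+p) x n
    s <- svd(B, nu = k, nv = k)      # cheap dense SVD of the small matrix
    list(d = s$d[1:k], u = Q %*% s$u, v = s$v)
  }

(In practice I usually lean on a package rather than rolling this by hand;
irlba, if I remember right, works directly on sparse Matrix objects.)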

On Mon, Apr 26, 2010 at 11:14 AM, Jake Mannix <jake.man...@gmail.com> wrote:

> > I have a matrix that is 3,000,000 by 70,000; however, it is very
> > sparse, with only about 60,000,000 non-zero data points.
> >
> > 2. Am I better off using R, than Mahout?
> >
>
> 60 million doubles as a data set fits in memory (~0.5GB), and depending on
> what algorithm you use, if you stay sparse, you should be fine in R.  If
> you do something which has dense intermediate results, you'll be toast,
> however.
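To put rough numbers on that memory point: 60 million doubles is
60e6 * 8 bytes, which is the ~0.5GB above for the values alone; a sparse
column-compressed representation adds integer row indices and column
pointers on top, while a dense 3,000,000 x 70,000 matrix would need on the
order of 1.7 TB.  Back of the envelope in R (my arithmetic, not measured
object sizes):

  nnz <- 60e6                              # non-zeros quoted above
  nnz * 8 / 1e9                            # ~0.48 GB for the double values alone
  (nnz * (8 + 4) + (70000 + 1) * 4) / 1e9  # ~0.72 GB as a dgCMatrix (values + row indices + column pointers)
  3e6 * 70000 * 8 / 1e12                   # ~1.68 TB if the same matrix were stored dense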
