How about trying with a bunch of beefy spot instances from Amazon? I'm
willing to partially bankroll it just to see what happens. But again, where
do we find a suitable dataset?

On Thu, Feb 25, 2010 at 3:11 PM, Robin Anil <robin.a...@gmail.com> wrote:

> +1, I'm ready. What do we need? Perf tuning? Cluster setup? Amazon credits?
> Someone to pay for the machines, or do we pay from our own pockets?
>
>
> Robin
>
> On Fri, Feb 26, 2010 at 1:20 AM, Ted Dunning <ted.dunn...@gmail.com>
> wrote:
>
> > These guys:
> >
> >
> >
> http://delivery.acm.org/10.1145/1460000/1459718/a18-vigna.pdf?key1=1459718&key2=4070317621&coll=GUIDE&dl=GUIDE&CFID=77555530&CFTOKEN=13940667
> >
> > say this:
> >
> >   > We present experiments over a collection with 3.6 billions of
> > postings---two orders of magnitudes larger than any published experiment
> in
> > the literature.
> >
> > My impression is that Mahout on about 100 machines is ready to break this
> > record with Jake's latest code.  The stochastic decomposition should make
> > it
> > even more plausible.
> >
> > The hardest part will be to find reasonable data with > 4 billion
> non-zero
> > entries.  At 0.01% sparsity, this is roughly a square matrix with 5
> million
> > rows and columns.
> >
> > Jake, your social graph should be much larger than that.
> >
> > --
> > Ted Dunning, CTO
> > DeepDyve
> >
>
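A quick sanity check on the back-of-envelope above (treating 0.01% sparsity as a density of 1e-4, and the 4 billion non-zeros target as given): solving n^2 * density > 4e9 for a square matrix gives n in the same ballpark as the 5 million quoted.

```python
import math

# Assumed figures from the thread: > 4 billion non-zero entries
# at 0.01% density (i.e. 1e-4 fraction of entries are non-zero).
target_nonzeros = 4e9
density = 1e-4

# Square n x n matrix: nnz = n^2 * density, so n = sqrt(nnz / density).
n = math.sqrt(target_nonzeros / density)
print(f"side length needed: ~{n:,.0f}")  # ~6.3 million rows/columns

# For comparison, a 5M x 5M matrix at the same density:
nnz_at_5m = (5e6 ** 2) * density
print(f"non-zeros at 5M x 5M: ~{nnz_at_5m:,.0f}")  # ~2.5 billion
```

So "roughly 5 million rows and columns" is the right order of magnitude; hitting the full 4 billion non-zeros at that density takes a side closer to 6-7 million.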



-- 
Zaki Rahaman
