Jake is in. Check!

On Fri, Feb 26, 2010 at 2:00 AM, Jake Mannix <jake.man...@gmail.com> wrote:

> Hmm...
>
> code: *check*
> desire to add stochastic decomp to code: *check*
> amazon credits: *check* (my account today: almost $300 left burning hole in
> pocket)
> relatively gigantic social graph: *check*
> legal ability to put gigantic social graph on ec2: not so check, but maybe
> some clever anonymization work on export could be done here.
>
> Let's break some records! :)
>
>  -jake
>
> On Thu, Feb 25, 2010 at 12:18 PM, Drew Farris <drew.far...@gmail.com>
> wrote:
>
> > Sounds pretty interesting. Assuming this is EC2, it would be great if
> > Amazon would pick up the tab, us being an open source project and all
> > and potentially good marketing to boot. Also, whoever's account is
> > used will have to have its default limit of 20 machines raised.
> >
> > On Thu, Feb 25, 2010 at 3:11 PM, Robin Anil <robin.a...@gmail.com>
> wrote:
> > > +1 I'm ready. What do we need? Perf tuning? Cluster setup? Amazon credits?
> > > Someone to pay for the machines, or from our own pockets?
> > >
> > >
> > > Robin
> > >
> > > On Fri, Feb 26, 2010 at 1:20 AM, Ted Dunning <ted.dunn...@gmail.com>
> > wrote:
> > >
> > >> These guys:
> > >>
> > >> http://delivery.acm.org/10.1145/1460000/1459718/a18-vigna.pdf?key1=1459718&key2=4070317621&coll=GUIDE&dl=GUIDE&CFID=77555530&CFTOKEN=13940667
> > >>
> > >> say this:
> > >>
> > >>   > We present experiments over a collection with 3.6 billions of
> > >> postings---two orders of magnitudes larger than any published experiment in
> > >> the literature.
> > >>
> > >> My impression is that Mahout on about 100 machines is ready to break this
> > >> record with Jake's latest code.  The stochastic decomposition should make it
> > >> even more plausible.
> > >>
> > >> The hardest part will be to find reasonable data with > 4 billion non-zero
> > >> entries.  At 0.01% sparsity, this is roughly a square matrix with 5 million
> > >> rows and columns.
> > >>
> > >> Jake, your social graph should be much larger than that.
> > >>
> > >> --
> > >> Ted Dunning, CTO
> > >> DeepDyve
> > >>
> > >
> >
>
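
For anyone sizing a candidate data set, here is a rough sketch of the
arithmetic behind Ted's estimate. It assumes "0.01% sparsity" means that
0.01% of the cells are non-zero and that the matrix is square; the helper
name square_side is made up for illustration. Under those assumptions the
side needed for 4 billion non-zeros comes out somewhat above 5 million, so
the quoted figure is an order-of-magnitude estimate.

```python
import math


def square_side(nonzeros: float, density: float) -> float:
    """Side n of an n x n matrix that holds `nonzeros` entries at `density`.

    `density` is the fraction of cells that are non-zero (assumed reading
    of "sparsity" in the thread).
    """
    return math.sqrt(nonzeros / density)


target_nonzeros = 4e9     # "> 4 billion non-zero entries" from the thread
density = 0.01 / 100      # 0.01% of cells are non-zero (assumption)

side = square_side(target_nonzeros, density)
nonzeros_at_5m = 5e6 ** 2 * density  # non-zeros in a 5M x 5M matrix

print(f"side needed for 4e9 non-zeros: {side:,.0f}")
print(f"non-zeros in a 5M x 5M matrix: {nonzeros_at_5m:,.0f}")
```

A social graph with well over 6 million nodes and a comparable average
degree would clear the 4 billion mark with room to spare.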
