Jake is in. Check!

On Fri, Feb 26, 2010 at 2:00 AM, Jake Mannix <jake.man...@gmail.com> wrote:
> Hmm...
>
> code: *check*
> desire to add stochastic decomp to code: *check*
> amazon credits: *check* (my account today: almost $300 left burning hole in
> pocket)
> relatively gigantic social graph: *check*
> legal ability to put gigantic social graph on ec2: not so check, but maybe
> some clever anonymization work on export could be done here.
>
> Let's break some records! :)
>
> -jake
>
> On Thu, Feb 25, 2010 at 12:18 PM, Drew Farris <drew.far...@gmail.com> wrote:
>
> > Sounds pretty interesting. Assuming this is EC2, it would be great if
> > Amazon would pick up the tab, us being an open source project and all
> > and potentially good marketing to boot. Also, whoever's account is
> > used will have to have its default limit of 20 machines raised.
> >
> > On Thu, Feb 25, 2010 at 3:11 PM, Robin Anil <robin.a...@gmail.com> wrote:
> > > +1 I'm ready. What do we need? Perf tuning? Cluster setup? Amazon credits?
> > > Someone to pay for the machines, or from our own pockets?
> > >
> > > Robin
> > >
> > > On Fri, Feb 26, 2010 at 1:20 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> > >
> > >> These guys:
> > >>
> > >> http://delivery.acm.org/10.1145/1460000/1459718/a18-vigna.pdf?key1=1459718&key2=4070317621&coll=GUIDE&dl=GUIDE&CFID=77555530&CFTOKEN=13940667
> > >>
> > >> say this:
> > >>
> > >> > We present experiments over a collection with 3.6 billions of
> > >> > postings---two orders of magnitudes larger than any published
> > >> > experiment in the literature.
> > >>
> > >> My impression is that Mahout on about 100 machines is ready to break this
> > >> record with Jake's latest code. The stochastic decomposition should make
> > >> it even more plausible.
> > >>
> > >> The hardest part will be to find reasonable data with > 4 billion non-zero
> > >> entries. At 0.01% sparsity, this is roughly a square matrix with 5 million
> > >> rows and columns.
> > >>
> > >> Jake, your social graph should be much larger than that.
> > >>
> > >> --
> > >> Ted Dunning, CTO
> > >> DeepDyve
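
For anyone who wants to sanity-check the sizing estimate quoted above, here is a rough sketch of the arithmetic. It assumes "0.01% sparsity" means that 0.01% of the entries are non-zero and that the matrix is square; the function names are made up for illustration and are not part of Mahout.

```python
import math


def square_side(nonzeros: float, density: float) -> int:
    """Side length n of a square matrix holding `nonzeros` non-zero
    entries when a fraction `density` of its n * n cells is non-zero."""
    return math.ceil(math.sqrt(nonzeros / density))


def expected_nonzeros(side: int, density: float) -> float:
    """Number of non-zero entries in a side x side matrix at `density`."""
    return side * side * density


DENSITY = 1e-4  # 0.01% of entries are non-zero (assumed reading of "sparsity")

# Side length needed to hold 4 billion non-zeros at that density.
print(f"{square_side(4e9, DENSITY):,} rows and columns")

# Non-zeros held by a 5 million x 5 million matrix at that density.
print(f"{expected_nonzeros(5_000_000, DENSITY):,.0f} non-zero entries")
```

Under these assumptions, 4 billion non-zeros needs about 6.3 million rows and columns, while a 5 million square matrix holds about 2.5 billion non-zeros, so the quoted "roughly 5 million" is the right order of magnitude but slightly on the low side.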