12 gigs; it uses several times (up to 10x?) more memory than the dataset
size.

2012/10/24 Shuo Wang <ecisp.wangs...@gmail.com>

> How large is your data? Our cluster has 10 nodes and 45 tasks, and each task
> has 512M of memory. But when I run on 200M of data, it fails with an
> OutOfMemory error.
>
> 2012/10/24 Thomas Jungblut <thomas.jungb...@gmail.com>
>
> > Sure it runs, if you have enough RAM ;)
> >
> > 2012/10/24 Shuo Wang <ecisp.wangs...@gmail.com>
> >
> > > How much data have you run PageRank on with HAMA? Does it run? I want to
> > > run PageRank on large data with HAMA, but it always fails.
> > >
> > > 2012/10/24 Thomas Jungblut <thomas.jungb...@gmail.com>
> > >
> > > > Yes it works on any directed graph.
> > > > The best format to use is
> > > >
> > > > Vertex <\t> AdjacentVertex1 <\t> AdjacentVertex2 etc.
> > > >
> > > > So you have an adjacency list, and each line represents one vertex.
> > > > This is splittable, which the web-Google dataset is not.
> > > >
> > > > 2012/10/24 Shuo Wang <ecisp.wangs...@gmail.com>
> > > >
> > > > > Thanks! Does PageRank work on any web graph? I generated a random web
> > > > > graph in the same format as web-Google.txt, but the result is
> > > > > infinity.
> > > > >
> > > > > 2012/10/24 Thomas Jungblut <thomas.jungb...@gmail.com>
> > > > >
> > > > > > Because graph iterations != supersteps. You have to take the
> > > > > > partitioning into account, as well as the time to accumulate the
> > > > > > number of vertices. PageRank also requires an additional superstep
> > > > > > to run aggregators.
> > > > > >
> > > > > > 2012/10/24 Shuo Wang <ecisp.wangs...@gmail.com>
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I have run PageRank on HAMA. I set the max iterations to 20, but
> > > > > > > it ran 48 supersteps. Why?
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
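
For anyone following the advice above about the input format, here is a rough sketch of turning an edge list into one adjacency line per vertex. It is not part of HAMA. The function name `edges_to_adjacency_lines` is made up for illustration. It assumes the input looks like web-Google.txt as distributed (one "source<TAB>target" edge per line, with "#" comment lines), that neighbours on an output line are separated by tabs, and that a vertex with no outgoing links should still get a line of its own. Those last two points are assumptions, so check them against your HAMA version.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def edges_to_adjacency_lines(edge_lines: Iterable[str]) -> List[str]:
    """Group 'source target' edge lines into one adjacency line per vertex.

    Hypothetical helper, not a HAMA API. Output lines have the form
    'vertex<TAB>neighbour1<TAB>neighbour2...'.
    """
    adjacency: Dict[str, List[str]] = defaultdict(list)
    for raw in edge_lines:
        line = raw.strip()
        # Skip blank lines and '#' comment headers.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            # Ignore malformed lines rather than guessing their meaning.
            continue
        source, target = parts
        adjacency[source].append(target)
        # Touch the target so vertices without out-links still get a line
        # (an assumption about what the job expects).
        adjacency[target]
    # Dicts keep insertion order, so vertices appear in first-seen order.
    return ["\t".join([vertex] + neighbours)
            for vertex, neighbours in adjacency.items()]


if __name__ == "__main__":
    sample = ["# FromNodeId\tToNodeId", "0\t1", "0\t2", "1\t2"]
    for out_line in edges_to_adjacency_lines(sample):
        print(out_line)
```

This holds the whole graph in memory, so it only suits inputs that fit on one machine. For larger data the same grouping would have to be done as a sort or a distributed job.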
