Hi Armando,

I uploaded my test code to GitHub at:

https://github.com/sscdotopen/giraph/tree/hyperball64-ooc

I'm working on an algorithm to estimate the neighborhood function of a graph (similar to [1]). I'm running it on the transposed adjacency matrix of a snapshot of the Twitter follower graph [2]. Out-of-core is not necessary for this graph, but I would like to run my algorithm on another, larger graph that no longer fits into the aggregate main memory of the cluster.
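For anyone unfamiliar with the neighborhood function, here is a minimal sketch of the idea, not the code from the branch above. It assumes the graph fits in memory as a Python dict and uses exact sets where a HyperBall-style algorithm would use HyperLogLog counters; the function name and the input layout are made up for illustration.

```python
def neighborhood_function(successors, max_steps):
    """Return [N(0), N(1), ...], where N(t) is the number of vertex pairs
    (x, y) such that y is reachable from x in at most t steps.

    `successors` maps every vertex to a list of its out-neighbors
    (hypothetical layout; every vertex must appear as a key).
    """
    # Ball of radius 0 around each vertex: just the vertex itself.
    balls = {v: {v} for v in successors}
    values = [sum(len(ball) for ball in balls.values())]

    for _ in range(max_steps):
        new_balls = {}
        for v, outs in successors.items():
            # Ball of radius t+1 = own ball united with the radius-t
            # balls of all out-neighbors.
            merged = set(balls[v])
            for w in outs:
                merged |= balls[w]
            new_balls[v] = merged
        if new_balls == balls:
            # No ball grew, so the function has converged.
            break
        balls = new_balls
        values.append(sum(len(ball) for ball in balls.values()))

    return values


if __name__ == "__main__":
    # Directed path 0 -> 1 -> 2
    print(neighborhood_function({0: [1], 1: [2], 2: []}, 30))
```

Replacing each exact set with a fixed-size probabilistic counter is what would keep the per-vertex state small on a large graph; the union step stays the same in spirit.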

I think for testing purposes, you can run it on any large graph in adjacency form.

Our cluster consists of 25 machines, each with 32 GB of RAM, 8 cores and 4 disks. I use the following options to run the algorithm:

hadoop jar giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner
  org.apache.giraph.examples.hyperball.HyperBall
  --vertexInputFormat org.apache.giraph.examples.hyperball.HyperBallTextInputFormat
  --vertexInputPath hdfs:///ssc/twitter-negative/
  --vertexOutputFormat org.apache.giraph.io.formats.IdWithValueTextOutputFormat
  --outputPath hdfs:///ssc/tmp-123/
  --combiner org.apache.giraph.comm.messages.HyperLogLogCombiner
  --outEdges org.apache.giraph.edge.LongNullArrayEdges
  --workers 24
  --customArguments
    giraph.oneToAllMsgSending=true,
    giraph.isStaticGraph=true,
    giraph.numComputeThreads=15,
    giraph.numInputThreads=15,
    giraph.numOutputThreads=15,
    giraph.maxNumberOfSupersteps=30,
    giraph.useOutOfCoreGraph=true,
    giraph.maxPartitionsInMemory=20

Best,
Sebastian

[1] http://arxiv.org/abs/1308.2144
[2] http://konect.uni-koblenz.de/networks/twitter_mpi

On 02/12/2014 04:21 PM, Armando Miraglia wrote:

Hi Sebastian,

On Wed, Feb 12, 2014 at 02:59:20PM +0100, Sebastian Schelter wrote:
No. Should I have done that?

Could you please send me the test you ran, together with the
variables that you set for the computation? This would
help me a lot.

Cheers,
Armando

