Hi Armando,
I uploaded my test code to GitHub at:
https://github.com/sscdotopen/giraph/tree/hyperball64-ooc
I'm working on an algorithm to estimate the neighborhood function of a
graph (similar to [1]). I'm running it on the transposed adjacency
matrix of a snapshot of the Twitter follower graph [2]. For this graph,
out-of-core processing is not necessary, but I would like to run my
algorithm on another, larger graph that no longer fits into the
aggregate main memory of the cluster.
I think for testing purposes, you can run it on any large graph in
adjacency form.
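
To make clear what is being estimated, here is a minimal Python sketch of the neighborhood function on a toy graph. It is not the Giraph code from the branch above. It assumes that N(t) counts the pairs (x, y) where y is reachable from x in at most t hops, and it uses exact sets where the real algorithm would use HyperLogLog counters. The function name is made up for illustration. It also shows why the transposed graph is convenient as input: each vertex sends its current set to its predecessors.

```python
from collections import defaultdict


def neighborhood_function(edges, max_steps):
    """Return [N(0), ..., N(max_steps)] for a directed edge list.

    N(t) is the number of pairs (x, y) such that y is reachable
    from x in at most t hops (every vertex reaches itself).
    """
    vertices = set()
    transposed = defaultdict(list)  # y -> [x, ...] for every edge x -> y
    for x, y in edges:
        vertices.update((x, y))
        transposed[y].append(x)

    # B(v, 0) = {v}; an approximate version would keep a counter here.
    balls = {v: {v} for v in vertices}
    result = [sum(len(b) for b in balls.values())]

    for _ in range(max_steps):
        # One "superstep": every vertex y sends its ball along the
        # transposed edges, i.e. to every x that has an edge x -> y.
        inbox = defaultdict(set)
        for y in vertices:
            for x in transposed[y]:
                inbox[x] |= balls[y]  # combining messages = set union
        # B(x, t+1) = B(x, t) united with everything received
        balls = {v: balls[v] | inbox[v] for v in vertices}
        result.append(sum(len(b) for b in balls.values()))
    return result


if __name__ == "__main__":
    # Path 1 -> 2 -> 3 -> 4
    print(neighborhood_function([(1, 2), (2, 3), (3, 4)], 3))
```

With exact sets, memory grows with the size of each ball, which is why an approximate counter with a mergeable union operation is attractive at this scale.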
Our cluster consists of 25 machines, each with 32 GB RAM, 8 cores and
4 disks. I use the following options to run the algorithm:
hadoop jar giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.hyperball.HyperBall \
  --vertexInputFormat org.apache.giraph.examples.hyperball.HyperBallTextInputFormat \
  --vertexInputPath hdfs:///ssc/twitter-negative/ \
  --vertexOutputFormat org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  --outputPath hdfs:///ssc/tmp-123/ \
  --combiner org.apache.giraph.comm.messages.HyperLogLogCombiner \
  --outEdges org.apache.giraph.edge.LongNullArrayEdges \
  --workers 24 \
  --customArguments giraph.oneToAllMsgSending=true,giraph.isStaticGraph=true,giraph.numComputeThreads=15,giraph.numInputThreads=15,giraph.numOutputThreads=15,giraph.maxNumberOfSupersteps=30,giraph.useOutOfCoreGraph=true,giraph.maxPartitionsInMemory=20
Best,
Sebastian
[1] http://arxiv.org/abs/1308.2144
[2] http://konect.uni-koblenz.de/networks/twitter_mpi
On 02/12/2014 04:21 PM, Armando Miraglia wrote:
Hi Sebastian,
On Wed, Feb 12, 2014 at 02:59:20PM +0100, Sebastian Schelter wrote:
No. Should I have done that?
Could you please send me the test you ran, together with the
variables you set for the computation? This would help me a lot.
Cheers,
Armando