Hi,
I have a scalability issue with Giraph and I cannot find out where the
problem is.
--- Cluster specs:
# nodes      1
# threads    32
Processor    Intel Xeon 2.0GHz
OS           Ubuntu 32bit
RAM          64GB
--- Giraph specs:
Hadoop       Apache Hadoop 1.2.1
Giraph       1.2.0 Snapshot
--- Tested graphs:
amazon0302           V=262,111, E=1,234,877
coAuthorsCiteseer    V=227,320, E=1,628,268
I run the PageRank algorithm provided with Giraph,
"SimplePageRankComputation", with the following options:
(time ($HADOOP_HOME/bin/hadoop jar \
  $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  -Dgiraph.graphPartitionerFactoryClass=org.apache.giraph.partition.HashRangePartitionerFactory \
  org.apache.giraph.examples.PageRankComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hduser/input/$file \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hduser/output/pagerank -w $2 \
  -mc org.apache.giraph.examples.PageRankComputation\$PageRankMasterCompute)) 2>&1 \
  | tee -a ./pagerank_results/$file.GHR_$2.$iter.output.txt
The algorithm runs without any issue. The number of supersteps is set to
31 by default in the algorithm.
*Problem:*
I do not get any scalability beyond 8 (or 16) processor cores; that
is, I get speedup up to 8 (or 16) cores and then the run time starts to
increase.
I have also run PageRank with only one superstep, as well as other
algorithms such as ShortestPath, and I get the same results. I cannot
figure out where the problem is.
1- I have tried changing two options, giraph.numInputThreads and
giraph.numOutputThreads: the performance gets a little better, but
there is no impact on scalability.
2- Is it related to the size of the graphs? The graphs I am testing
are small.
3- Is it a platform-related issue?
Here are the timing details for the amazon graph:
# Processor cores: 1 2 4 8 16 24 32

Input        3260      3447      3269      3921      4555      4766
Initialise   3467      36458     45474     39091     100281    79012
Setup        34        52        59        70        77        86
Shutdown     9954      10226     11021     9524      13393     15930
Total        135482    84483     61081     52190     58921     61898
HDFS READ    21097485  26117723  36158199  57808783  80086015  102163071
FILE WRITE   65889     109815    197667    373429    549165    724901
HDFS WRITE   7330986   7331068   7331093   7330988   7330976   7331203
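
In case it helps to read the numbers, the speedups implied by the Total row can be worked out with a short script. This is only a sketch: it assumes the six totals are listed in order of increasing core count and uses the first listed run as the baseline. The names `totals` and `speedups` are just illustrative.

```python
# Total run times copied from the "Total" row above (six runs, in the order listed).
totals = [135482, 84483, 61081, 52190, 58921, 61898]


def speedups(times):
    """Return the speedup of each run relative to the first run in the list."""
    baseline = times[0]
    return [baseline / t for t in times]


if __name__ == "__main__":
    for total, speedup in zip(totals, speedups(totals)):
        print(f"total={total:>7}  speedup vs first run={speedup:.2f}")
```

Under that assumption, the fourth run is the fastest (about 2.6 times faster than the first), and the two runs after it are slower again.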
Best Regards,
Karos