Hi,

I have a scalability issue with Giraph and I cannot figure out where the problem is.

--- Cluster specs:
# nodes             1
# threads          32
Processor          Intel Xeon 2.0GHz
OS                    ubuntu 32bit
RAM                 64GB

--- Giraph specs
Hadoop            Apache Hadoop 1.2.1
Giraph              1.2.0 Snapshot

Tested Graphs:
amazon0302                V=262,111, E=1,234,877
coAuthorsCiteseer        V=227,320, E=1,628,268


I run the PageRank algorithm provided with Giraph ("SimplePageRankComputation") with the following options:

(time ($HADOOP_HOME/bin/hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
 org.apache.giraph.GiraphRunner \
 -Dgiraph.graphPartitionerFactoryClass=org.apache.giraph.partition.HashRangePartitionerFactory \
 org.apache.giraph.examples.PageRankComputation  \
-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
-vip /user/hduser/input/$file \
-vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
-op /user/hduser/output/pagerank -w $2 \
-mc org.apache.giraph.examples.PageRankComputation\$PageRankMasterCompute)) 2>&1 \
| tee -a ./pagerank_results/$file.GHR_$2.$iter.output.txt

The algorithm runs without any issue. The number of supersteps is set to 31 by default in the algorithm.

*Problem:*
I don't get any scalability beyond 8 (or 16) processor cores: the run time improves up to 8 (or 16) cores and then starts to increase.

I have also run PageRank with only one superstep, as well as other algorithms such as ShortestPath, and I get the same results. I cannot figure out where the problem is.

1- I have tried changing giraph.numInputThreads and giraph.numOutputThreads: the performance gets a little better, but there is no impact on scalability.
2- Is it related to the size of the graphs? The graphs I am testing are small.
3- Is it a platform-related issue?

Here are the timing details for the amazon0302 graph:

# Processor cores  1          2          4          8          16         24         32

Input              3260       3447       3269       3921       4555       4766
Initialise         3467       36458      45474      39091      100281     79012
Setup              34         52         59         70         77         86
Shutdown           9954       10226      11021      9524       13393      15930
Total              135482     84483      61081      52190      58921      61898

HDFS READ          21097485   26117723   36158199   57808783   80086015   102163071
FILE WRITE         65889      109815     197667     373429     549165     724901
HDFS WRITE         7330986    7331068    7331093    7330988    7330976    7331203
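
To make the scaling problem easier to see, here is a rough sketch that turns the Total row into speedup and parallel efficiency. It assumes the six Total values correspond to the first six core counts in the header (1, 2, 4, 8, 16, 24). The header lists seven core counts but each row has only six values, so that mapping is an assumption. The function name speedup_table is just an illustrative helper.

```python
# Total run times copied from the "Total" row of the table above.
total = [135482, 84483, 61081, 52190, 58921, 61898]

# ASSUMPTION: the six values map to the first six core counts in the header.
cores = [1, 2, 4, 8, 16, 24]


def speedup_table(cores, times):
    """Return (cores, time, speedup, efficiency) tuples relative to the first run."""
    base = times[0]
    rows = []
    for c, t in zip(cores, times):
        speedup = base / t          # how many times faster than the baseline run
        efficiency = speedup / c    # fraction of ideal linear speedup achieved
        rows.append((c, t, speedup, efficiency))
    return rows


rows = speedup_table(cores, total)
print(f"{'cores':>5} {'time':>8} {'speedup':>8} {'efficiency':>10}")
for c, t, s, e in rows:
    print(f"{c:>5} {t:>8} {s:>8.2f} {e:>10.2f}")
```

Under that assumption the best speedup is only about 2.6x, reached at 8 cores, and the run time gets worse again at 16 and 24 cores.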



Best Regards,
Karos


