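Yes, you can use it for performance testing, although it is not a great simulation of real graphs. Real graphs tend to follow a power-law degree distribution, with a few very highly connected vertices and many sparsely connected ones, so their load is spread far less evenly across workers (see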
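
To make the difference concrete, here is a minimal sketch in plain Python (not Giraph code). It assumes a generator that gives every vertex the same number of uniformly random out-edges, and compares it with a simple preferential-attachment generator, which is one common way to get a roughly power-law degree distribution. All function names and parameters are hypothetical and chosen only for illustration; this is not a description of how PseudoRandomVertexInputFormat is implemented.

```python
import random
from collections import Counter


def uniform_edges(num_vertices, edges_per_vertex, seed=0):
    """Every vertex gets the same number of out-edges to uniformly random targets."""
    rng = random.Random(seed)
    return [(v, rng.randrange(num_vertices))
            for v in range(num_vertices)
            for _ in range(edges_per_vertex)]


def preferential_attachment_edges(num_vertices, edges_per_vertex, seed=0):
    """Each new vertex links to earlier vertices with probability
    proportional to their current degree (multi-edges are allowed)."""
    rng = random.Random(seed)
    edges = []
    # A vertex appears in this list once per edge endpoint it owns,
    # so rng.choice() picks vertices in proportion to their degree.
    endpoints = [0]  # seed entry so the first new vertex has a target
    for v in range(1, num_vertices):
        targets = [rng.choice(endpoints) for _ in range(edges_per_vertex)]
        for t in targets:
            edges.append((v, t))
            endpoints.append(t)
        endpoints.extend([v] * edges_per_vertex)
    return edges


def max_in_degree(edges):
    """Largest number of incoming edges at any single vertex."""
    return max(Counter(target for _, target in edges).values())


if __name__ == "__main__":
    n, k = 2000, 3
    print("uniform max in-degree:     ", max_in_degree(uniform_edges(n, k)))
    print("power-law-ish max in-degree:",
          max_in_degree(preferential_attachment_edges(n, k)))
```

With the same vertex and edge counts, the preferential-attachment graph ends up with a few hub vertices whose in-degree is far above anything in the uniform graph. Hubs like that tend to dominate message traffic in a shortest-path run, which is why timings on a uniform random graph may not carry over to real datasets.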

Hope that helps,


I am using Giraph solely for performance characterization -- primarily to compare hardware platforms, but also to tune Hadoop configuration. Am I correct that we could use the PseudoRandomVertexInputFormat, as used in the PageRank example, to generate graphs of any size, which could then be used as input to the simple shortest path example program, thus avoiding the need to obtain actual datasets?

