Hello Avery and all,

I have a cluster of 10 two-processor/48 GB RAM servers on which we are conducting Hadoop performance characterization tests. I plan to use the Giraph PageRank and simple shortest paths example tests as part of this exercise and would appreciate guidance on problem sizes for both. I'm looking at paring down an obfuscated Twitter dataset, and it would save a lot of time if someone knows roughly how time and memory scale with the number of nodes in a graph.
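
In case it helps frame the question, below is a rough sketch of how I was thinking of sweeping synthetic problem sizes with Giraph's built-in PageRankBenchmark (org.apache.giraph.benchmark.PageRankBenchmark) before moving to the Twitter data. The option letters (-V vertices, -e edges per vertex, -s supersteps, -w workers, -v verbose) are what I believe the benchmark accepts; they may differ between Giraph versions, so please correct me if the usage has changed.

import org.apache.giraph.benchmark.PageRankBenchmark;
import org.apache.hadoop.util.ToolRunner;

/**
 * Sketch only: sweep synthetic graph sizes with Giraph's PageRankBenchmark
 * to see how per-superstep time and worker memory grow with vertex count.
 * Option letters are assumptions based on the benchmark's usage text and
 * may differ between Giraph versions.
 */
public class PageRankSizeSweep {
  public static void main(String[] args) throws Exception {
    // Vertex counts to try; adjust to the cluster's memory headroom.
    long[] vertexCounts = {1000000L, 10000000L, 100000000L};
    for (long vertices : vertexCounts) {
      String[] benchArgs = {
          "-v",                          // verbose timing output
          "-V", Long.toString(vertices), // aggregate number of vertices
          "-e", "10",                    // outgoing edges per vertex
          "-s", "30",                    // supersteps (PageRank iterations)
          "-w", "10"                     // workers, one per server
      };
      int rc = ToolRunner.run(new PageRankBenchmark(), benchArgs);
      if (rc != 0) {
        System.err.println("Benchmark failed at " + vertices + " vertices");
      }
    }
  }
}

The idea would be to plot wall-clock time per superstep and worker heap usage against the vertex count at each step of the sweep, then use that curve to size the Twitter graph appropriately.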
Best regards, Steve Fleischman
