Hi again,

I am trying to estimate the minimum requirements for running graph analysis
on my input data.

The shortest path example says:
"The first thing that happens is that getSplits() is called by the master
and then the workers will process the InputSplit objects with the
VertexReader to load their portion of the graph into memory"

What I understood from this is that at some time T all graph nodes must be
loaded into cluster memory.
If I have 100 GB of graph data, will I need 25 machines with 4 GB of RAM
each?

If this is the case, I have a big memory problem analyzing 4 TB of data :)
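
For reference, this is the back-of-envelope arithmetic behind my numbers, as a
rough sketch only. It assumes the whole graph must be in memory at once, that
the in-memory size equals the on-disk size, and that all of each machine's RAM
is usable. I have not confirmed any of these, and the function name is just
something I made up for illustration.

```python
import math


def machines_needed(graph_gb: float, ram_per_machine_gb: float) -> int:
    """Lower bound on machine count if the whole graph must fit in cluster RAM.

    Assumption: in-memory size equals on-disk size and all RAM is usable,
    which is optimistic; real object overhead would raise the count.
    """
    return math.ceil(graph_gb / ram_per_machine_gb)


# 100 GB of graph data on machines with 4 GB of RAM each
print(machines_needed(100, 4))

# 4 TB of graph data (taken as 4 * 1024 GB) on the same machines
print(machines_needed(4 * 1024, 4))
```

Under those assumptions, 4 TB would need on the order of a thousand 4 GB
machines, which is why I am asking.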

Best regards.
