Hi,

I'm trying to process a very big input file (~70GB) with Giraph. I'm running the Giraph program on a 40-node Linux cluster, but the program just gets stuck after it reads in a small fraction of the input file. Although each node has 16GB of memory, it looks like only one node is reading the input file (which is on HDFS) into its memory.

Since the input file is so big, is there a way to scatter it across all the nodes so that each node reads in a fraction of the file and then starts processing the graph? Would it also help to split the single big input file into many smaller files and let each node read in one of them (keeping the overall structure of the graph intact, of course)? A rough sketch of what I mean by splitting is below.

Thanks!
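This is roughly what I had in mind for the splitting step. It is only a sketch with made-up paths and part count, and it assumes our input is a line-based vertex input format (one vertex per line), so cutting on line boundaries does not break the graph:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SplitGraphInput {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical paths -- not the real ones on our cluster.
    Path bigInput = new Path("/user/suijian/graph/big-input.txt");
    int numParts = 40;  // e.g. one part file per worker node

    BufferedReader reader =
        new BufferedReader(new InputStreamReader(fs.open(bigInput)));
    PrintWriter[] writers = new PrintWriter[numParts];
    for (int i = 0; i < numParts; i++) {
      Path part = new Path("/user/suijian/graph/split/part-" + i);
      writers[i] = new PrintWriter(fs.create(part));
    }

    // Distribute whole lines (one vertex per line) round-robin across the
    // part files, so the graph itself is unchanged -- only the file layout is.
    String line;
    long lineNo = 0;
    while ((line = reader.readLine()) != null) {
      writers[(int) (lineNo++ % numParts)].println(line);
    }

    reader.close();
    for (PrintWriter w : writers) {
      w.close();
    }
  }
}

The idea would then be to point the job's vertex input path at the directory of part files instead of the single big file, hoping the parts get assigned to different workers.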
Best Regards, Suijian
