Hi,

I have run into a critical scaling problem with Giraph. To test Giraph on
large graphs, I wrote a very simple algorithm: a connectivity test. It works on
relatively large graphs (3,072,441 nodes and 117,185,083 edges) but not on a
very large graph (52,000,000 nodes and 2,000,000,000 edges). While processing
the larger graph, the Giraph core seems to fail after superstep 14 (15 on some
runs). The input graph is 30 GB stored as text, and the output is also stored
as text. 9 workers are used to process the graph.

Here is the stack trace from the workers (it is the same for all 9):

java.lang.IllegalStateException: run: Caught an unrecoverable exception exists: Failed to check /_hadoopBsp/job_201307260439_0006/_applicationAttemptsDir/0/_superstepDir/97/_addressesAndPartitions after 3 tries!
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.IllegalStateException: exists: Failed to check /_hadoopBsp/job_201307260439_0006/_applicationAttemptsDir/0/_superstepDir/97/_addressesAndPartitions after 3 tries!
        at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369)
        at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:678)
        at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:248)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
        ... 7 more

Could you help me solve this problem? If you need the code of the program, I
can post it here (it is fairly short).

Thanks,
Jérôme
                                                                                
  
