[ 
https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096816#comment-13096816
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-25:
-----------------------------------------

Here's the log I saw on a master that timed out waiting for workers:

{code}
2011-09-04 05:22:11,115 INFO org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 182 responses of 186 needed to start superstep -1.  Sleeping for 30000 msecs and used 9 of 10 attempts.
2011-09-04 05:22:11,115 WARN org.apache.giraph.graph.BspServiceMaster: checkWorkers: Did not receive enough processes in time (only 182 of 186 required)
2011-09-04 05:22:11,120 INFO org.apache.giraph.graph.BspServiceMaster: setJobState: {"_stateKey":"FAILED","_applicationAttemptKey":-1,"_superstepKey":-1} on superstep -1
2011-09-04 05:22:11,129 FATAL org.apache.giraph.graph.BspServiceMaster: failJob: Killing job job_201109012213_17306
2011-09-04 05:22:11,159 ERROR org.apache.giraph.graph.MasterThread: masterThread: Master algorithm failed: 
java.lang.NullPointerException
        at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:486)
        at org.apache.giraph.graph.MasterThread.run(MasterThread.java:94)
2011-09-04 05:22:11,160 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.graph.MasterThread, msg = java.lang.NullPointerException, exiting...
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.giraph.graph.MasterThread.run(MasterThread.java:177)
Caused by: java.lang.NullPointerException
        at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:486)
        at org.apache.giraph.graph.MasterThread.run(MasterThread.java:94)
2011-09-04 05:22:11,161 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
{code}
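
The trace shows the job being marked FAILED and then createInputSplits hitting the NPE about 30 ms later, so one plausible reading is that the worker check hands back null on timeout and the caller uses it without a guard. The sketch below illustrates that kind of guard. It is only a sketch under that assumption: the class, the method signatures, and the -1 return convention are hypothetical stand-ins, not the actual Giraph code, and I have not confirmed what line 486 dereferences.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these signatures are taken from the Giraph source. */
public class MasterTimeoutSketch {

    /** Stand-in for the worker check: assumed to return null when too few workers respond. */
    static List<String> checkWorkers(int found, int needed) {
        if (found < needed) {
            return null; // assumed timeout signal
        }
        List<String> workers = new ArrayList<>();
        for (int i = 0; i < found; i++) {
            workers.add("worker-" + i);
        }
        return workers;
    }

    /** Returns the number of workers available for splits, or -1 if the job was failed. */
    static int createInputSplits(int found, int needed) {
        List<String> healthyWorkers = checkWorkers(found, needed);
        if (healthyWorkers == null) {
            // Guard: the job has already been failed, so stop here
            // instead of dereferencing the null worker list.
            System.err.println("createInputSplits: only " + found + " of "
                    + needed + " workers checked in, giving up");
            return -1;
        }
        return healthyWorkers.size();
    }

    public static void main(String[] args) {
        System.out.println(createInputSplits(182, 186)); // timed-out case: prints -1
        System.out.println(createInputSplits(186, 186)); // healthy case: prints 186
    }
}
```

With a guard like this the master thread would exit through its normal failure path after failJob, rather than through the uncaught exception handler.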

> NPE in BspServiceMaster when failing a job
> ------------------------------------------
>
>                 Key: GIRAPH-25
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-25
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>            Priority: Minor
>
> When BspServiceMaster times out waiting for all workers to check in, it dies 
> with a NullPointerException.
> This can perhaps be handled a bit more gracefully.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
