Avery Ching created GIRAPH-356:
----------------------------------

             Summary: Help debug ZooKeeper issues
                 Key: GIRAPH-356
                 URL: https://issues.apache.org/jira/browse/GIRAPH-356
             Project: Giraph
          Issue Type: Improvement
            Reporter: Avery Ching


Currently, if the ZooKeeper process fails, we have little information about 
why it failed or what happened.  This patch addresses that by keeping the last 
100 lines of the ZooKeeper process output and dumping them when the map task 
fails with a RuntimeException.
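
As a rough illustration of the idea (not the code in the patch), the sketch 
below keeps only the most recent N lines read from a child process stream in a 
bounded queue, so they can be logged later if the task fails.  The class name 
LastLinesCollector and its methods are hypothetical; the real class appears in 
the log below as ZooKeeperManager$StreamCollector and may work differently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical helper (an assumption for illustration, not Giraph's
 * StreamCollector): reads a stream line by line and remembers only the
 * last maxLines lines.
 */
class LastLinesCollector extends Thread {
    /** Stream to drain, e.g. the output of the ZooKeeper child process. */
    private final InputStream stream;
    /** Maximum number of lines to retain (100 in the issue description). */
    private final int maxLines;
    /** Oldest line at the head, newest at the tail. */
    private final Deque<String> lastLines = new ArrayDeque<>();

    LastLinesCollector(InputStream stream, int maxLines) {
        if (maxLines <= 0) {
            throw new IllegalArgumentException("maxLines must be positive");
        }
        this.stream = stream;
        this.maxLines = maxLines;
        setDaemon(true); // do not keep the JVM alive just for log draining
    }

    @Override
    public void run() {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                addLine(line);
            }
        } catch (IOException e) {
            // Record the failure so it shows up in the dump as well.
            addLine("LastLinesCollector: stopped reading: " + e);
        }
    }

    /** Appends a line, evicting the oldest one once the cap is reached. */
    synchronized void addLine(String line) {
        if (lastLines.size() == maxLines) {
            lastLines.removeFirst();
        }
        lastLines.addLast(line);
    }

    /** Returns a copy of the retained lines, oldest first. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(lastLines);
    }
}
```

One way to use such a collector would be to start the child process with 
ProcessBuilder.redirectErrorStream(true) so that STDOUT and STDERR arrive on 
one stream, start the collector on Process.getInputStream(), and log every 
line from snapshot() at WARN level in the handler that catches the 
RuntimeException.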

Here is an example of a master task failure when an invalid JVM argument is 
passed to ZooKeeper.  The error is much more obvious now.

2012-10-04 15:05:28,916 WARN org.apache.giraph.zk.ZooKeeperManager: logZooKeeperOutput: Dumping up to last 100 lines of the ZooKeeper process STDOUT and STDERR.
2012-10-04 15:05:28,916 WARN org.apache.giraph.zk.ZooKeeperManager$StreamCollector: Unrecognized option: -BadOpt
2012-10-04 15:05:28,916 WARN org.apache.giraph.zk.ZooKeeperManager$StreamCollector: Could not create the Java virtual machine.
2012-10-04 15:05:28,919 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-10-04 15:05:28,959 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception onlineZooKeeperServers: Failed to connect in 5 tries!
       at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:591)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.IllegalStateException: onlineZooKeeperServers: Failed to connect in 5 tries!
       at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:721)
       at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:328)
       at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:573)
       ... 7 more
2012-10-04 15:05:28,963 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task


