[ https://issues.apache.org/jira/browse/HBASE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
JoneZhang reassigned HBASE-14074:
---------------------------------

    Assignee: Andrew Purtell

> HBase cluster crashed on-the-hour
> ----------------------------------
>
> Key: HBASE-14074
> URL: https://issues.apache.org/jira/browse/HBASE-14074
> Project: HBase
> Issue Type: Bug
> Components: Admin
> Affects Versions: 0.96.2
> Environment: Hadoop 2.5.1
> HBase 0.96.2
> Reporter: JoneZhang
> Assignee: Andrew Purtell
>
> I found that the HBase cluster crashed on the hour.
> The HBase master log is as follows:
> "2015-07-14 14:41:49,832 DEBUG [master:10.240.131.18:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: 10-241-125-46%2C60020%2C1436841063572.1436851865226
> 2015-07-14 14:45:49,822 DEBUG [master:10.240.131.18:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: 10-241-85-137%2C60020%2C1436841341086.1436852143141
> 2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: HBase 0.96.2-hadoop2
> 2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: Subversion https://svn.apache.org/repos/asf/hbase/tags/0.96.2RC2 -r 1581096
> 2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: Compiled by stack on Mon Mar 24 16:03:18 PDT 2014
> 2015-07-14 15:00:03,729 INFO [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
> 2015-07-14 15:00:03,730 INFO [main] zookeeper.ZooKeeper: Client environment:host.name=10-240-131-18
> 2015-07-14 15:00:03,730 INFO [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_72
> ...
> 2015-07-14 15:00:03,749 INFO [main] zookeeper.RecoverableZooKeeper: Process identifier=clean znode for master connecting to ZooKeeper ensemble=10.240.131.17:2200,10.240.131.16:2200,10.240.131.15:2200,10.240.131.14:2200,10.240.131.18:2200
> 2015-07-14 15:00:03,751 INFO [main-SendThread(10-240-131-18:2200)] zookeeper.ClientCnxn: Opening socket connection to server 10-240-131-18/10.240.131.18:2200. Will not attempt to authenticate using SASL (unknown error)
> 2015-07-14 15:00:03,757 INFO [main-SendThread(10-240-131-18:2200)] zookeeper.ClientCnxn: Socket connection established to 10-240-131-18/10.240.131.18:2200, initiating session
> 2015-07-14 15:00:03,764 INFO [main-SendThread(10-240-131-18:2200)] zookeeper.ClientCnxn: Session establishment complete on server 10-240-131-18/10.240.131.18:2200, sessionid = 0x34e8a64b453024a, negotiated timeout = 40000
> 2015-07-14 15:00:04,835 INFO [main] zookeeper.ZooKeeper: Session: 0x34e8a64b453024a closed
> 2015-07-14 15:00:04,835 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down"
> Every hour, after printing "Didn't find this log in ZK...", the master dies.
> The ZooKeeper log is as follows:
> "2015-07-14 15:00:03,756 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxnFactory@197] - Accepted socket connection from /10.240.131.18:52733
> 2015-07-14 15:00:03,761 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:ZooKeeperServer@868] - Client attempting to establish new session at /10.240.131.18:52733
> 2015-07-14 15:00:03,762 [myid:3] - INFO [CommitProcessor:3:ZooKeeperServer@617] - Established session 0x34e8a64b453024a with negotiated timeout 40000 for client /10.240.131.18:52733
> 2015-07-14 15:00:04,836 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxn@1007] - Closed socket connection for client /10.240.131.18:52733 which had sessionid 0x34e8a64b453024a"

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)