JoneZhang created HBASE-14074:
---------------------------------

             Summary: HBase cluster crashed on-the-hour 
                 Key: HBASE-14074
                 URL: https://issues.apache.org/jira/browse/HBASE-14074
             Project: HBase
          Issue Type: Bug
          Components: Admin
    Affects Versions: 0.96.2
         Environment: Hadoop 2.5.1
HBase 0.96.2
            Reporter: JoneZhang


I found that the HBase cluster crashes on the hour.
The HBase master log is as follows:

"2015-07-14 14:41:49,832 DEBUG [master:10.240.131.18:60000.oldLogCleaner] 
master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: 
10-241-125-46%2C60020%2C1436841063572.1436851865226
2015-07-14 14:45:49,822 DEBUG [master:10.240.131.18:60000.oldLogCleaner] 
master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: 
10-241-85-137%2C60020%2C1436841341086.1436852143141
2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: HBase 0.96.2-hadoop2
2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: Subversion 
https://svn.apache.org/repos/asf/hbase/tags/0.96.2RC2 -r 1581096
2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: Compiled by stack on Mon 
Mar 24 16:03:18 PDT 2014
2015-07-14 15:00:03,729 INFO  [main] zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2015-07-14 15:00:03,730 INFO  [main] zookeeper.ZooKeeper: Client 
environment:host.name=10-240-131-18
2015-07-14 15:00:03,730 INFO  [main] zookeeper.ZooKeeper: Client 
environment:java.version=1.7.0_72

...

2015-07-14 15:00:03,749 INFO  [main] zookeeper.RecoverableZooKeeper: Process 
identifier=clean znode for master connecting to ZooKeeper 
ensemble=10.240.131.17:2200,10.240.131.16:2200,10.240.131.15:2200,10.240.131.14:2200,10.240.131.18:2200
2015-07-14 15:00:03,751 INFO  [main-SendThread(10-240-131-18:2200)] 
zookeeper.ClientCnxn: Opening socket connection to server 
10-240-131-18/10.240.131.18:2200. Will not attempt to authenticate using SASL 
(unknown error)
2015-07-14 15:00:03,757 INFO  [main-SendThread(10-240-131-18:2200)] 
zookeeper.ClientCnxn: Socket connection established to 
10-240-131-18/10.240.131.18:2200, initiating session
2015-07-14 15:00:03,764 INFO  [main-SendThread(10-240-131-18:2200)] 
zookeeper.ClientCnxn: Session establishment complete on server 
10-240-131-18/10.240.131.18:2200, sessionid = 0x34e8a64b453024a, negotiated 
timeout = 40000
2015-07-14 15:00:04,835 INFO  [main] zookeeper.ZooKeeper: Session: 
0x34e8a64b453024a closed
2015-07-14 15:00:04,835 INFO  [main-EventThread] zookeeper.ClientCnxn: 
EventThread shut down"


After printing "Didn't find this log in ZK...", the master dies.
This happens every hour, on the hour.
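
One way to check the on-the-hour pattern is to pull the timestamps of the startup banner lines out of the master log and see how close each one is to the top of an hour. The following is a minimal sketch in Python, assuming the log4j timestamp layout seen in the excerpt above and treating the "util.VersionInfo: HBase" line as the marker for a new process start. The function names and the 60-second tolerance are hypothetical choices for illustration, not part of HBase.

```python
import re
from datetime import datetime

# Leading timestamp of a log line, e.g. "2015-07-14 15:00:03,481 INFO ..."
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s")


def startup_times(lines, marker="util.VersionInfo: HBase"):
    """Return the timestamps of lines that contain the startup marker.

    The marker is an assumption: the version banner is printed when a
    new HBase process starts, so it is used here as a proxy for a start.
    """
    times = []
    for line in lines:
        match = TS.match(line)
        if match and marker in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def near_top_of_hour(ts, tolerance_seconds=60):
    """True if ts falls within tolerance_seconds after the start of an hour."""
    return ts.minute * 60 + ts.second <= tolerance_seconds


if __name__ == "__main__":
    sample = [
        "2015-07-14 14:45:49,822 DEBUG [master:10.240.131.18:60000.oldLogCleaner] ",
        "2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: HBase 0.96.2-hadoop2",
    ]
    for ts in startup_times(sample):
        print(ts, "on the hour" if near_top_of_hour(ts) else "off the hour")
```

Running this over the full master log (for example, by passing an open file object as `lines`) would show whether every process start lines up with an hour boundary, which would point towards something scheduled rather than load-related.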


The ZooKeeper log is as follows:

"2015-07-14 15:00:03,756 [myid:3] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxnFactory@197] - Accepted 
socket connection from /10.240.131.18:52733
2015-07-14 15:00:03,761 [myid:3] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:ZooKeeperServer@868] - Client 
attempting to establish new session at /10.240.131.18:52733
2015-07-14 15:00:03,762 [myid:3] - INFO  
[CommitProcessor:3:ZooKeeperServer@617] - Established session 0x34e8a64b453024a 
with negotiated timeout 40000 for client /10.240.131.18:52733
2015-07-14 15:00:04,836 [myid:3] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxn@1007] - Closed socket 
connection for client /10.240.131.18:52733 which had sessionid 
0x34e8a64b453024a"




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
