Can you tell us which version of HBase you're using?

Did you find anything in the region server logs on the 4 remaining nodes?
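
While you check, it may help to decode the task names in the master log
quoted below. Each znode under /hbase/splitlog appears to be the URL-encoded
HDFS path of a log file waiting to be split. Here is a minimal Python sketch,
assuming that naming scheme holds for your version (decode_task and server_of
are hypothetical helper names, not part of HBase):

```python
from urllib.parse import unquote

# Assumption: split-log task znodes live under this parent, as in the quoted log.
SPLITLOG_PREFIX = "/hbase/splitlog/"


def decode_task(znode: str) -> str:
    """Undo one level of URL-encoding to recover the log path named by a task znode."""
    name = znode[len(SPLITLOG_PREFIX):] if znode.startswith(SPLITLOG_PREFIX) else znode
    return unquote(name)


def server_of(wal_path: str) -> str:
    """Return the <host>,<port>,<startcode> of the server that wrote the log."""
    # The parent directory looks like "<host>,<port>,<startcode>-splitting".
    parts = wal_path.rsplit("/", 2)
    parent = parts[-2] if len(parts) >= 2 else ""
    suffix = "-splitting"
    return parent[:-len(suffix)] if parent.endswith(suffix) else parent


if __name__ == "__main__":
    # First task from the quoted master log, with the mail line wraps joined.
    task = (
        "/hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%"
        "2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com"
        "%252C60020%252C1375466447768.1375631802971"
    )
    path = decode_task(task)
    print(path)
    print(server_of(path))
```

The task names are wrapped across several lines in the mail, so join them
before decoding. Grouping the 14 tasks by server_of(...) should show which
region servers' logs are still waiting to be split.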

Cheers

On Thu, Aug 8, 2013 at 1:36 PM, oc tsdb <[email protected]> wrote:

> Hi,
>
> I am running a cluster with 6 nodes.
> Two of the 6 nodes in my cluster went down (due to another application's
> failure) and came back after some time (I had to do a power reboot).
> When these nodes came back, I used to get "WARN org.apache.hadoop.DFSClient:
> Failed to connect to , add to deadnodes and continue".
> Those messages have now stopped, and I am getting continuous debug messages
> as follows.
>
> 2013-08-08 12:57:36,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:37,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com
> %252C60020%252C1375466447768.1375631802971
> ver = 0
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com
> %252C60020%252C1375466460755.1375623787557
> ver = 0
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com
> %252C60020%252C1375466460755.1375619231059
> ver = 3
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-2.corp.oc.com%2C60020%2C1375466479427-splitting%2Fmb-2.corp.oc.com
> %252C60020%252C1375466479427.1375639017535
> ver = 0
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com
> %252C60020%252C1375466460755.1375623021175
> ver = 0
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com
> %252C60020%252C1375466447768.1375630425141
> ver = 0
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned
> task(s) after timeout
> 2013-08-08 12:57:37,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com
> %252C60020%252C1375466460755.1375620714514
> ver = 3
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-6.corp.oc.com%2C60020%2C1375924525310-splitting%2Fmb-6.corp.oc.com
> %252C60020%252C1375924525310.1375924529658
> ver = 0
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-4.corp.oc.com%2C60020%2C1375466551673-splitting%2Fmb-4.corp.oc.com
> %252C60020%252C1375466551673.1375641592581
> ver = 0
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-5.corp.oc.com%2C60020%2C1375924528073-splitting%2Fmb-5.corp.oc.com
> %252C60020%252C1375924528073.1375924532442
> ver = 0
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com
> %252C60020%252C1375466460755.1375622290167
> ver = 3
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com
> %252C60020%252C1375466463385.1375638183425
> ver = 0
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com
> %252C60020%252C1375466463385.1375639599559
> ver = 0
> 2013-08-08 12:57:37,630 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
> 2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com
> %252C60020%252C1375466463385.1375641710787
> ver = 3
> 2013-08-08 12:57:37,633 INFO
> org.apache.hadoop.hbase.master.SplitLogManager: task
> /hbase/splitlog/RESCAN0000006975 entered state done mb-1.corp.oc.com
> ,60000,1375924508669
> 2013-08-08 12:57:37,633 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted
> /hbase/splitlog/RESCAN0000006975
> 2013-08-08 12:57:37,633 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in
> memory state /hbase/splitlog/RESCAN0000006975
> 2013-08-08 12:57:38,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:39,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:40,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:41,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:42,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:43,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:44,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:45,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:46,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:47,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:48,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:49,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:50,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:51,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:52,628 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:53,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:54,487 DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> Lookedup root region location,
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@24ddb5c9;
> serverName=
> 2013-08-08 12:57:54,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:55,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:56,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:57,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:58,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:57:59,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
> 2013-08-08 12:58:00,629 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned
> = 14
>
>
> The cluster is unresponsive. I cannot access port 4242 on any of the
> cluster nodes.
> When I try to run the tsdb command "tsdb uid grep metrics .", I get the
> following error messages:
>   ERROR [main-EventThread] HBaseClient: The znode for the -ROOT- region
> doesn't exist!
>   ERROR [main-EventThread] HBaseClient: The znode for the -ROOT- region
> doesn't exist!
>
> Could you please suggest what I can do to stop it?
>
> Thanks in advance.
>
> Regards,
> OC.
>
