Do you see in the master log something similar to the following?

master.HMaster: Not running balancer because 1 region(s) in transition

You can search backwards for balancer / assignment related logs.

Cheers

On Mon, Jul 6, 2015 at 8:49 AM, Akmal Abbasov <[email protected]> wrote:

> > What error(s) did you get when trying to restart the region server? Have
> > you checked its log files?
> It was a VM, and I was not able to access it any more; I can’t log in to
> it. Restarting several times didn’t help.
>
> > Can you check master log around this time? If there was a region in
> > transition, balancer wouldn't balance.
> I have a lot of this:
> 2015-07-06 15:15:39,918 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/table1-snapshot-31.05.2015_18.14/WALs
> 2015-07-06 15:15:39,918 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/table1-snapshot-31.05.2015_19.14/WALs
> 2015-07-06 15:15:39,921 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/table1-snapshot-31.05.2015_20.13/WALs
> 2015-07-06 15:15:39,925 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/table1-snapshot-31.05.2015_21.14/WALs
> 2015-07-06 15:15:39,926 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/table1-snapshot-31.05.2015_22.14/WALs
> 2015-07-06 15:15:39,927 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/table1-snapshot-31.05.2015_23.14/WALs
> 2015-07-06 15:15:39,928 INFO [snapshot-log-cleaner-cache-refresher] util.FSVisitor: No logs under directory:hdfs://test/hbase/.hbase-snapshot/testsnap/WALs
> 2015-07-06 15:15:47,324 INFO [FifoRpcScheduler.handler1-thread-18] master.HMaster: Client=hadoop//10.32.0.140 set balanceSwitch=false
> 2015-07-06 15:23:31,265 DEBUG [master:hbase-m2:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: hbase-rs1%2C60020%2C1436189457794.1436190023718
> 2015-07-06 15:23:31,504 DEBUG [master:hbase-m2:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: hbase-rs1%2C60020%2C1436189457794.1436193624562
> 2015-07-06 15:32:49,382 INFO [FifoRpcScheduler.handler1-thread-14] master.HMaster: Client=hadoop//10.32.0.156 set balanceSwitch=false
> 2015-07-06 15:32:56,936 INFO [FifoRpcScheduler.handler1-thread-1] master.HMaster: Client=hadoop//10.32.0.156 set balanceSwitch=false
>
> Thank you.
>
> > On 06 Jul 2015, at 17:37, Ted Yu <[email protected]> wrote:
> >
> > bq. I had to delete and recreate it
> >
> > What error(s) did you get when trying to restart the region server? Have
> > you checked its log files?
> >
> > bq. start balancer manually, but it returned false
> >
> > Can you check master log around this time? If there was a region in
> > transition, balancer wouldn't balance.
> >
> > Cheers
> >
> > On Mon, Jul 6, 2015 at 8:29 AM, Akmal Abbasov <[email protected]>
> > wrote:
> >
> >> Hi all,
> >> I am seeing strange behaviour in my HBase cluster. I have 5 rs and 2 masters.
> >> One of the rs stopped working, restart didn’t work, and I had to delete
> >> and recreate it.
> >> But when this rs stopped, the cluster also stopped functioning.
> >> There were a lot of inconsistencies. When I recreated the rs with the disks of
> >> the previous one, the cluster started working.
> >> But now only 3 rs host the regions; the other 2 have 0 regions.
> >> I’ve tried to start the balancer manually, but it returned false.
> >> Any idea?
> >>
> >> I am using hbase-0.98.7-hadoop2.
> >> Thank you.
> >>
> >> Kind regards,
> >> Akmal Abbasov
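
The suggestion at the top of this thread, searching the master log backwards for balancer / assignment related lines, could be sketched roughly as below. This is only an illustration: the function names `find_balancer_lines` and `scan_file` are made up for this example, and the match strings are simply the ones that appear in the log lines quoted in this thread, not a complete list of what HBase may print.

```python
from typing import Iterable, List, Sequence

# Substrings copied from log lines quoted in this thread (an assumption:
# other HBase versions may word these messages differently).
PATTERNS = ("Not running balancer", "balanceSwitch", "in transition")


def find_balancer_lines(
    lines: Iterable[str],
    patterns: Sequence[str] = PATTERNS,
    limit: int = 20,
) -> List[str]:
    """Return up to `limit` matching lines, most recent first."""
    matches = [ln.rstrip("\n") for ln in lines if any(p in ln for p in patterns)]
    # Reverse so the newest entries come first, i.e. a backwards search.
    return matches[::-1][:limit]


def scan_file(path: str, limit: int = 20) -> List[str]:
    """Hypothetical helper: apply the same filter to a master log file."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return find_balancer_lines(fh, limit=limit)
```

Run against the excerpt quoted above, this would surface only the `set balanceSwitch=false` entries, newest first, and skip the snapshot and log cleaner noise.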
