Josh Elser created ACCUMULO-4410:
------------------------------------
Summary: Master did not resume balancing after administrative
tserver shutdown
Key: ACCUMULO-4410
URL: https://issues.apache.org/jira/browse/ACCUMULO-4410
Project: Accumulo
Issue Type: Bug
Components: master
Affects Versions: 1.8.0
Reporter: Josh Elser
Priority: Critical
I realized that I had misconfigured a property, so I started manually stopping
each tabletserver (using {{accumulo admin stop <host:port>}}).
This worked as intended. The tablets were migrated and the tserver was stopped:
{noformat}
2016-08-17 15:24:20,871 [master.EventCoordinator] INFO : Tablet Server shutdown
requested for jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]
2016-08-17 15:24:20,991 [master.Master] DEBUG: FATE op shutting down
jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c] finished
{noformat}
However, after this point, the master did not resume balancing:
{noformat}
2016-08-17 15:24:31,024 [master.Master] DEBUG: Finished gathering information
from 4 servers in 0.02 seconds
2016-08-17 15:24:31,024 [master.Master] DEBUG: not balancing while shutting
down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
2016-08-17 15:24:36,831 [replication.WorkDriver] DEBUG: Sleeping 30000 ms
before next work assignment
2016-08-17 15:24:41,074 [master.Master] DEBUG: Finished gathering information
from 4 servers in 0.05 seconds
2016-08-17 15:24:41,083 [master.Master] DEBUG: not balancing while shutting
down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
2016-08-17 15:24:51,134 [master.Master] DEBUG: Finished gathering information
from 4 servers in 0.05 seconds
2016-08-17 15:24:51,135 [master.Master] DEBUG: not balancing while shutting
down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
{noformat}
Even after I brought a new tserver online on that host, the master still did
not resume balancing:
{noformat}
2016-08-17 15:25:53,015 [master.Master] INFO : New servers:
[jelser-accumulo-180-4.openstacklocal:54722[2568579a5c3006e]]
2016-08-17 15:25:53,026 [master.EventCoordinator] INFO : There are now 5 tablet
servers
2016-08-17 15:25:53,096 [master.Master] DEBUG: Finished gathering information
from 5 servers in 0.06 seconds
2016-08-17 15:25:53,109 [master.Master] DEBUG: not balancing while shutting
down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
{noformat}
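To check a master log for the same symptom, a small script along these lines might help. It is only a sketch: it assumes the message formats are exactly those quoted above, and the function name {{find_stuck_shutdowns}} is made up for illustration and is not part of Accumulo. It reports servers whose shutdown FATE op finished but which later still appear in a "not balancing while shutting down servers" message.

```python
import re

# Patterns are derived only from the log lines quoted in this report;
# other Accumulo versions may format these messages differently.
TIMESTAMP = re.compile(r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} )")
SERVER = r"[^\s\[\],]+:\d+\[[0-9a-f]+\]"  # host:port[session]
FINISHED = re.compile(r"FATE op shutting down (" + SERVER + r") finished")
NOT_BALANCING = re.compile(
    r"not balancing while shutting down servers \[(.*)\]"
)


def find_stuck_shutdowns(log_text):
    """Return servers that still block balancing after their shutdown finished."""
    # Collapse whitespace so records wrapped across lines are rejoined,
    # then split the text into records at each timestamp.
    text = " ".join(log_text.split())
    finished = set()
    stuck = []
    for record in TIMESTAMP.split(text):
        match = FINISHED.search(record)
        if match:
            finished.add(match.group(1))
            continue
        match = NOT_BALANCING.search(record)
        if match:
            for server in re.findall(SERVER, match.group(1)):
                if server in finished and server not in stuck:
                    stuck.append(server)
    return stuck
```

Passing the contents of the master debug log to {{find_stuck_shutdowns}} returns a list of {{host:port[session]}} strings. An empty list means this symptom was not found in that log.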
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)