John Vines created ACCUMULO-4663:
------------------------------------
Summary: ShutdownTServer attempts shutdown over and over again,
can end up blocking migrations
Key: ACCUMULO-4663
URL: https://issues.apache.org/jira/browse/ACCUMULO-4663
Project: Accumulo
Issue Type: Bug
Affects Versions: 1.7.2
Reporter: John Vines
Also affects 1.7.1
ACCUMULO-1259 identified a problem with ShutdownTServer repeatedly invoking
master.shutdownTServer. One side effect of this is a race where a server goes
down, gets removed from the online tablet sets, etc., and then gets re-added
to the serversToShutdown set. The balancer then refuses to balance because a
shutdown appears to be in progress, and this never gets rectified. The only
workaround is to restart the master (or bring that server back up, I'm
guessing).
ACCUMULO-3897 tried to fix that problem by requesting shutdown once and only
once. It does this by setting a local boolean. But because we do not
reserialize our fate repos between isReady calls, this boolean is effectively
reset between each check, making it pointless.
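
A minimal sketch of that effect, not the actual Accumulo code: ShutdownRepo,
shutdownRequests, and simulate are hypothetical stand-ins, and plain Java
serialization stands in for however the fate store persists a repo. The
assumption is that the repo is stored once and a fresh copy is read back for
every isReady check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

class RequestedShutdownSketch {
  // Stand-in for the master: counts how often shutdown gets requested.
  static final AtomicInteger shutdownRequests = new AtomicInteger();

  // Hypothetical, simplified repo; NOT the real ShutdownTServer class.
  static class ShutdownRepo implements Serializable {
    private static final long serialVersionUID = 1L;
    private boolean requestedShutdown = false;

    long isReady() {
      if (!requestedShutdown) {
        // Stands in for the master.shutdownTServer call.
        shutdownRequests.incrementAndGet();
        // Only this in-memory copy changes; the stored bytes do not.
        requestedShutdown = true;
      }
      return 1000; // not ready yet, so the repo will be checked again
    }
  }

  static byte[] serialize(ShutdownRepo repo) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(repo);
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
    return bytes.toByteArray();
  }

  static ShutdownRepo deserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (ShutdownRepo) in.readObject();
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Polls isReady `checks` times against a repo that is stored once and
  // never written back, then reports how many shutdown requests were made.
  static int simulate(int checks) {
    shutdownRequests.set(0);
    byte[] stored = serialize(new ShutdownRepo());
    for (int i = 0; i < checks; i++) {
      deserialize(stored).isReady();
    }
    return shutdownRequests.get();
  }

  public static void main(String[] args) {
    System.out.println("shutdown requests after 3 checks: " + simulate(3));
  }
}
```

In this sketch every check issues a new request, because each deserialized
copy starts with requestedShutdown set to false. Writing the repo back after
isReady, or keeping the flag somewhere that outlives a single check, would be
two possible ways around that; neither is claimed to be the eventual fix.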
I believe there are 2 problems here:
1. ShutdownTServer.requestedShutdown is not implemented correctly.
2. We should have a mechanism to remove from serversToShutdown any server
   that is not present.
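
One possible shape for the second point, as a rough sketch only: pruneMissing
and the use of plain strings for server identities are assumptions made for
illustration, not existing master code. The idea is to drop any entry in
serversToShutdown that no longer appears in the set of current servers, for
example just before the balancer decides whether a shutdown is in progress.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ServersToShutdownSketch {
  // Removes every entry of serversToShutdown that is absent from
  // currentServers and returns the entries that were removed.
  // Servers are modeled as "host:port" strings purely for illustration.
  static Set<String> pruneMissing(Set<String> serversToShutdown, Set<String> currentServers) {
    Set<String> removed = new HashSet<>(serversToShutdown);
    removed.removeAll(currentServers);
    serversToShutdown.retainAll(currentServers);
    return removed;
  }

  public static void main(String[] args) {
    Set<String> serversToShutdown = new HashSet<>(Arrays.asList("a:9997", "b:9997"));
    Set<String> currentServers = new HashSet<>(Arrays.asList("a:9997", "c:9997"));
    Set<String> removed = pruneMissing(serversToShutdown, currentServers);
    System.out.println("removed: " + removed);
    System.out.println("still shutting down: " + serversToShutdown);
  }
}
```

Because the race re-adds a server after it has already gone away, a check
like this would presumably need to run repeatedly rather than once, and in
the real master it would need whatever synchronization already guards
serversToShutdown.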