John Vines created ACCUMULO-4663:
------------------------------------

             Summary: ShutdownTServer attempts shutdown over and over again, 
can end up blocking migrations
                 Key: ACCUMULO-4663
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4663
             Project: Accumulo
          Issue Type: Bug
    Affects Versions: 1.7.2
            Reporter: John Vines


Also affects 1.7.1

ACCUMULO-1259 identified a problem with ShutdownTServer repeatedly invoking 
master.shutdownTServer. One side effect of this is a race where a server goes 
down, gets removed from the online tablet sets, etc., and then gets re-added to 
the serversToShutdown set. The balancer then refuses to balance because a 
shutdown appears to be in progress, and this never gets rectified. The only 
workaround is to restart the master (or bring that server back up, I'm guessing).

ACCUMULO-3897 attempted to fix that problem by requesting shutdown once and 
only once. It does this by setting a local boolean. But because we do not 
reserialize our fate repos between isReady calls, this boolean is effectively 
reset between each check, making it pointless.
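
A minimal sketch of that effect, assuming (as described above) that the repo is 
re-read from its stored, serialized form before every isReady check and is not 
written back afterwards. The Repo class, the in-memory byte array standing in 
for the fate store, and the countRequests helper are all hypothetical; this is 
not the actual ShutdownTServer or FATE code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class FateFlagSketch {

    // Hypothetical stand-in for a fate repo; NOT the real ShutdownTServer.
    static class Repo implements Serializable {
        private static final long serialVersionUID = 1L;
        boolean requestedShutdown = false;

        // Returns true if this call issued a (simulated) shutdown request.
        boolean isReady() {
            if (!requestedShutdown) {
                requestedShutdown = true;
                return true;
            }
            return false;
        }
    }

    static byte[] serialize(Repo repo) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(repo);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Repo deserialize(byte[] stored) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (Repo) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs `checks` isReady calls, each on a fresh copy read from the stored
    // bytes, and counts how many of them issued a shutdown request.
    // writeBack == false models the behavior described in this issue.
    static int countRequests(int checks, boolean writeBack) {
        byte[] stored = serialize(new Repo());
        int requests = 0;
        for (int i = 0; i < checks; i++) {
            Repo repo = deserialize(stored);
            if (repo.isReady()) {
                requests++;
            }
            if (writeBack) {
                stored = serialize(repo); // persist the updated flag
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        System.out.println(countRequests(5, false)); // prints 5
        System.out.println(countRequests(5, true));  // prints 1
    }
}
```

Without the write-back, every check sees requestedShutdown == false and issues 
another request; with it, only the first check does.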

I believe there are 2 problems here:
1. ShutdownTServer.requestedShutdown is not implemented correctly.
2. We should have a mechanism to remove from serversToShutdown any server 
that is not present.
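
One way the second point could look, as a rough sketch only: servers are 
modeled as plain strings, and pruneMissing is a hypothetical helper, not an 
existing master method. Where and how often the master would call something 
like this is left open.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ShutdownSetPruneSketch {

    // Removes from serversToShutdown every server that is not in
    // currentServers, and returns the set of entries that were removed.
    // Both parameter names are illustrative assumptions.
    static Set<String> pruneMissing(Set<String> serversToShutdown, Set<String> currentServers) {
        Set<String> removed = new HashSet<>(serversToShutdown);
        removed.removeAll(currentServers);
        serversToShutdown.retainAll(currentServers);
        return removed;
    }

    public static void main(String[] args) {
        Set<String> serversToShutdown =
            new HashSet<>(Arrays.asList("tserver1:9997", "tserver2:9997"));
        Set<String> currentServers =
            new HashSet<>(Arrays.asList("tserver2:9997", "tserver3:9997"));

        Set<String> removed = pruneMissing(serversToShutdown, currentServers);

        System.out.println(removed);           // prints [tserver1:9997]
        System.out.println(serversToShutdown); // prints [tserver2:9997]
    }
}
```

With stale entries pruned like this, a server that went down and was re-added 
to the set would no longer keep the balancer waiting on a shutdown that can 
never complete.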



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
