Will Berkeley created KUDU-2634:
-----------------------------------

             Summary: token_signer-itest can get stuck when the cluster is 
shutting down while the leader master generates a new TSK
                 Key: KUDU-2634
                 URL: https://issues.apache.org/jira/browse/KUDU-2634
             Project: Kudu
          Issue Type: Bug
    Affects Versions: 1.8.0
            Reporter: Will Berkeley
         Attachments: token_signer-itest.log

I saw the following sequence of events in token_signer-itest:

1. The test body finishes. The InternalMiniCluster is being shut down as part 
of cleaning up the test.
2. The follower masters shut down.
3. The leader master starts shutting down (Master::Shutdown()). The catalog 
manager is shutting down the background tasks 
(CatalogManagerBgTasks::Shutdown()), and so is joining with the bg task thread.
4. The bg task thread is in the middle of CatalogManagerBgTasks::Run(), where, 
because of the short TSK rotation times, it detects it needs to generate a new 
TSK. It calls through to SysCatalogTable::SyncWrite() to write the new TSK.
5. The other two masters are shut down, so SyncWrite blocks forever waiting for 
the TSK write to replicate.
6. The test eventually times out because the itest thread is stuck in 
CatalogManagerBgTasks::Shutdown() waiting for SysCatalogTable::SyncWrite().
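
The hang in steps 3 to 6 can be modeled with a minimal standalone sketch. It is not Kudu code: FakeReplicatedWrite, FakeBgTasks, and ShutdownStallsWithoutQuorum are hypothetical names made up for illustration, and the sketch assumes the write waits with no deadline and no cancellation, as the steps above suggest. The shutdown path joins a thread that is blocked on a write that can never be acknowledged.

```cpp
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a replicated write such as the TSK write:
// it completes only once a majority of peers has acknowledged it.
class FakeReplicatedWrite {
 public:
  // Blocks with no deadline and no cancellation, like the wait in step 5.
  void WaitUntilReplicated() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return replicated_; });
  }

  // Would be called once enough peers acknowledge the write. With the
  // followers already shut down, nothing ever calls this.
  void MarkReplicated() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      replicated_ = true;
    }
    cv_.notify_all();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool replicated_ = false;
};

// Hypothetical stand-in for the owner of the background task thread.
class FakeBgTasks {
 public:
  explicit FakeBgTasks(FakeReplicatedWrite* write)
      : thread_([write] { write->WaitUntilReplicated(); }) {}

  // Joins the background thread, so it returns only after the write does.
  void Shutdown() { thread_.join(); }

 private:
  std::thread thread_;
};

// Returns true if Shutdown() is still blocked after `window` has elapsed.
bool ShutdownStallsWithoutQuorum(std::chrono::milliseconds window) {
  FakeReplicatedWrite write;
  FakeBgTasks tasks(&write);

  // Run Shutdown() on another thread so the stall can be observed.
  std::future<void> shutdown =
      std::async(std::launch::async, [&tasks] { tasks.Shutdown(); });
  const bool stalled =
      shutdown.wait_for(window) == std::future_status::timeout;

  // Release the writer so this sketch terminates. In the scenario
  // described above nothing plays this role, so the join never returns.
  write.MarkReplicated();
  shutdown.get();
  return stalled;
}
```

In this model, Shutdown() stays blocked for as long as the write stays unacknowledged, whatever the length of the window. That matches the test timing out rather than failing quickly.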

Log of the failing test attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
