[
https://issues.apache.org/jira/browse/KUDU-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Henke updated KUDU-2634:
------------------------------
Component/s: test
> token_signer-itest can get stuck when the cluster is shutting down while the
> leader master generates a new TSK
> --------------------------------------------------------------------------------------------------------------
>
> Key: KUDU-2634
> URL: https://issues.apache.org/jira/browse/KUDU-2634
> Project: Kudu
> Issue Type: Bug
> Components: test
> Affects Versions: 1.8.0
> Reporter: William Berkeley
> Priority: Major
> Attachments: token_signer-itest.log
>
>
> I saw the following sequence of events in token_signer-itest:
> 1. The test body finishes. The InternalMiniCluster is being shut down as part
> of cleaning up the test.
> 2. The follower masters shut down.
> 3. The leader master starts shutting down (Master::Shutdown()). The catalog
> manager is shutting down the background tasks
> (CatalogManagerBgTasks::Shutdown()), and so is joining with the bg task thread.
> 4. The bg task thread is in the middle of CatalogManagerBgTasks::Run(),
> where, because of the short TSK rotation times, it detects it needs to
> generate a new TSK. It calls through to SysCatalogTable::SyncWrite to write
> the new TSK.
> 5. The other two masters are shut down, so SyncWrite blocks forever waiting
> for the TSK write to replicate.
> 6. The test eventually times out because the itest thread is stuck in
> CatalogManagerBgTasks::Shutdown() waiting for SysCatalogTable::SyncWrite().
> Log of the failing test attached.
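
The hang in steps 3 to 6 can be pictured with a small standalone sketch. This is not Kudu code: FakeReplicatedLog and ShutdownWithPendingWrite are hypothetical names, the majority rule is simplified to "at least one live follower" for a 3-master cluster, and the bounded wait is only there so the sketch terminates. It is not a claim about how the issue was or should be fixed. With an unbounded wait, the join would block forever once both followers are gone, which is the behavior reported above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a replicated write on the leader of a 3-master
// cluster. This is NOT Kudu's SysCatalogTable API.
class FakeReplicatedLog {
 public:
  explicit FakeReplicatedLog(int live_followers)
      : live_followers_(live_followers) {}

  // Waits until at least one follower is alive to acknowledge the write, or
  // until the timeout expires. Returns true only if the write "replicated".
  // Replacing wait_for() with an unbounded wait() would never return when
  // live_followers_ is 0, mirroring step 5 of the report.
  bool SyncWrite(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mu_);
    return cv_.wait_for(lock, timeout,
                        [this] { return live_followers_ > 0; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int live_followers_;
};

// Hypothetical helper: starts a background thread that performs one write,
// then joins it, the way a shutdown path joins its background task thread.
// Returns true if the write replicated before the join completed.
bool ShutdownWithPendingWrite(int live_followers,
                              std::chrono::milliseconds write_timeout) {
  assert(live_followers >= 0);
  FakeReplicatedLog log(live_followers);
  bool replicated = false;
  std::thread bg([&] { replicated = log.SyncWrite(write_timeout); });
  // The join cannot return until SyncWrite() does, so the shutdown path is
  // only as fast as the slowest pending write.
  bg.join();
  return replicated;
}
```

Calling ShutdownWithPendingWrite(0, std::chrono::milliseconds(50)) models the failing run: the write never replicates and the join returns only because the wait is bounded.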