Adar Dembo created KUDU-2963:
--------------------------------
Summary: Catalog manager never gives up on CreateTablet RPCs
Key: KUDU-2963
URL: https://issues.apache.org/jira/browse/KUDU-2963
Project: Kudu
Issue Type: Improvement
Components: master
Affects Versions: 1.11.0
Reporter: Adar Dembo
The catalog manager retries CreateTablet RPCs indefinitely. This is a problem
when there aren't enough live tservers upon which to place a tablet's replicas,
or when a chosen tserver doesn't create the replica quickly enough. If the
catalog manager decides to replace the tablet, the replaced tablet's
CreateTablet RPCs continue to retry ad infinitum. If the previously dead
tservers then come back to life, they must needlessly process those
CreateTablet RPCs.
The resulting tablets are eventually deleted, either through explicit
DeleteTablet RPCs (triggered by the catalog manager's replacement process) or
by heartbeating, but creating and then deleting them is an unnecessary drain on
cluster resources.
We should probably abort CreateTablet RPCs for tablets that have been removed
from their table.
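As a rough illustration of that proposal, here is a minimal, self-contained
sketch of a retry task that checks tablet membership before each attempt. It is
not Kudu's actual code: TableState, CreateTabletRetryTask, RetryDecision, and
NextAttempt are hypothetical names invented for this example, and the real
catalog manager's task and locking structure would differ.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for the catalog manager's view of a table: the set of
// tablet IDs that currently belong to it. Replacing a tablet removes its ID.
struct TableState {
  std::unordered_set<std::string> tablet_ids;

  bool ContainsTablet(const std::string& tablet_id) const {
    return tablet_ids.count(tablet_id) > 0;
  }
};

enum class RetryDecision { kSend, kAbort };

// Hypothetical retry task for one tablet's CreateTablet RPC. Before every
// attempt it checks whether the tablet is still part of its table and gives
// up if the tablet has been removed (for example, because it was replaced).
class CreateTabletRetryTask {
 public:
  CreateTabletRetryTask(const TableState* table, std::string tablet_id)
      : table_(table), tablet_id_(std::move(tablet_id)) {
    assert(table_ != nullptr);
  }

  // Decides whether the next attempt should be sent. An aborted attempt is
  // not counted, so attempts_sent() reflects only RPCs that would go out.
  RetryDecision NextAttempt() {
    if (!table_->ContainsTablet(tablet_id_)) {
      return RetryDecision::kAbort;
    }
    ++attempts_sent_;
    return RetryDecision::kSend;
  }

  int attempts_sent() const { return attempts_sent_; }

 private:
  const TableState* table_;  // Not owned; assumed to outlive the task.
  std::string tablet_id_;
  int attempts_sent_ = 0;
};
```

In this sketch, a task for a tablet that is still in its table keeps returning
kSend, and switches to kAbort as soon as the tablet ID is erased from
TableState, so a tserver that comes back later never sees the stale request.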
CreateTableITest_TestCreateWhenMajorityOfReplicasFailCreation demonstrates this
acutely.