Adar Dembo created KUDU-2963:
--------------------------------

             Summary: Catalog manager never gives up on CreateTablet RPCs
                 Key: KUDU-2963
                 URL: https://issues.apache.org/jira/browse/KUDU-2963
             Project: Kudu
          Issue Type: Improvement
          Components: master
    Affects Versions: 1.11.0
            Reporter: Adar Dembo


The catalog manager retries CreateTablet RPCs indefinitely. This is a problem 
when there aren't enough live tservers on which to place a tablet's replicas, 
or when a chosen tserver doesn't create the replica quickly enough. If the 
catalog manager decides to replace the tablet, the replaced tablet's 
CreateTablet RPCs continue to retry ad infinitum. If the previously dead 
tservers then come back to life, they must needlessly process those 
CreateTablet RPCs.

The resulting tablets are eventually deleted, either through explicit 
DeleteTablet RPCs (triggered by the catalog manager replacement process) or 
through heartbeating, but creating them in the first place is an unnecessary 
drain on cluster resources.

We should probably abort CreateTablet RPCs for tablets that have been removed 
from their table.
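
To make the suggestion above concrete, here is a minimal, self-contained 
sketch of the idea: before each (re)send, the retrying task checks whether its 
tablet still belongs to the table and aborts if it does not. All names here 
(Table, CreateTabletTask, ShouldSend, TaskState) are hypothetical and are not 
taken from the Kudu code base; the real catalog manager would also need 
locking and integration with its existing task machinery.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for a table's current set of tablet IDs.
class Table {
 public:
  void AddTablet(const std::string& id) { tablets_.insert(id); }
  void RemoveTablet(const std::string& id) { tablets_.erase(id); }
  bool HasTablet(const std::string& id) const {
    return tablets_.count(id) > 0;
  }

 private:
  std::unordered_set<std::string> tablets_;
};

enum class TaskState { kRunning, kComplete, kAborted };

// Hypothetical retrying CreateTablet task. It only models the decision of
// whether another attempt should be made; no RPC is actually sent.
class CreateTabletTask {
 public:
  CreateTabletTask(const Table* table, std::string tablet_id)
      : table_(table), tablet_id_(std::move(tablet_id)) {}

  // Called before every send or retry. Returns true if the RPC should be
  // sent. If the tablet has been removed from its table (for example,
  // because it was replaced), the task aborts instead of retrying.
  bool ShouldSend() {
    if (state_ != TaskState::kRunning) return false;
    if (!table_->HasTablet(tablet_id_)) {
      state_ = TaskState::kAborted;
      return false;
    }
    ++attempts_;
    return true;
  }

  // Called when the tserver reports that the replica was created.
  void MarkComplete() {
    if (state_ == TaskState::kRunning) state_ = TaskState::kComplete;
  }

  TaskState state() const { return state_; }
  int attempts() const { return attempts_; }

 private:
  const Table* table_;
  std::string tablet_id_;
  TaskState state_ = TaskState::kRunning;
  int attempts_ = 0;
};
```

With a check like this, a task whose tablet was replaced stops at its next 
retry rather than continuing until a dead tserver comes back.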

CreateTableITest_TestCreateWhenMajorityOfReplicasFailCreation demonstrates this 
acutely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
