Adar Dembo created KUDU-2114:
--------------------------------

             Summary: Master asks tservers to delete tombstoned tablets forever
                 Key: KUDU-2114
                 URL: https://issues.apache.org/jira/browse/KUDU-2114
             Project: Kudu
          Issue Type: Bug
          Components: consensus, master, tserver
    Affects Versions: 1.5.0
            Reporter: Adar Dembo
            Assignee: Mike Percy
            Priority: Blocker


Commit 5bca7d8 changed the behavior of tombstoned replicas such that they now 
retain RaftConsensus instances despite being in the TOMBSTONED state. This 
means that some additional consensus-related state is included in their tablet 
report entries when a full tablet report is sent to the master. The master 
evaluates this consensus-related state when considering whether an evicted 
replica should be deleted, but it does not consider the TOMBSTONED state. As a 
result, the master notices that these tombstoned replicas have been evicted, 
and asks the hosting tserver to delete them. Over, and over, and over.

This needs to be fixed, whether by excluding the consensus state of tombstoned 
replicas from tablet reports, or by changing the master to consider the 
tablet's overall state when deciding whether to delete it.
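
To make the second option concrete, here is a minimal sketch of the kind of 
check the master could apply. It is not Kudu's actual code: ReportedReplica, 
ReplicaDataState, and ShouldDeleteEvictedReplica are hypothetical names, and 
the committed config is reduced to a list of tserver UUIDs for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for a tablet report entry.
// These are NOT Kudu's real types.
enum class ReplicaDataState { kReady, kTombstoned };

struct ReportedReplica {
  std::string tserver_uuid;
  ReplicaDataState data_state;
  bool has_consensus_state;  // tombstoned replicas now report this too
};

// Returns true if the master should ask the tserver to delete the replica.
bool ShouldDeleteEvictedReplica(
    const ReportedReplica& replica,
    const std::vector<std::string>& committed_config_uuids) {
  // Without consensus state there is nothing to evaluate.
  if (!replica.has_consensus_state) return false;

  // A replica that is still in the committed config has not been evicted.
  const bool in_config =
      std::find(committed_config_uuids.begin(), committed_config_uuids.end(),
                replica.tserver_uuid) != committed_config_uuids.end();
  if (in_config) return false;

  // The check described as missing: an evicted replica that is already
  // tombstoned needs no further delete request.
  if (replica.data_state == ReplicaDataState::kTombstoned) return false;

  return true;
}
```

With a check like this, an evicted replica in the ready state still triggers a 
delete request, while one that is already tombstoned is left alone on every 
subsequent full tablet report.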

On a live cluster exhibiting this behavior, the tablet deletion requests also 
turned out to be rather expensive. It appears that a DeleteTablet RPC on a 
tombstoned replica is not a no-op; it always flushes the superblock twice, 
which generates two fsyncs. This should also be addressed.
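
As a rough illustration of what a no-op could look like, the sketch below 
models a replica whose tombstone operation returns early when the replica is 
already tombstoned. ReplicaMetadataModel, DataState, and Tombstone() are 
hypothetical names, and a flush is represented only by a counter; the two 
flushes per deletion mirror the count reported above, but what each one 
persists is an assumption.

```cpp
// Hypothetical model of a replica's on-disk metadata; NOT Kudu's real class.
enum class DataState { kReady, kTombstoned };

class ReplicaMetadataModel {
 public:
  explicit ReplicaMetadataModel(DataState state) : state_(state) {}

  // Tombstones the replica. Returns true if any work was done, false if the
  // request was treated as a no-op.
  bool Tombstone() {
    // Already tombstoned: skip both superblock flushes (and their fsyncs).
    if (state_ == DataState::kTombstoned) return false;

    state_ = DataState::kTombstoned;
    Flush();  // assumed: persist the new data state
    // ... the replica's data would be removed here ...
    Flush();  // assumed: persist the metadata after the data is gone
    return true;
  }

  int flush_count() const { return flush_count_; }
  DataState state() const { return state_; }

 private:
  // Stand-in for a superblock flush; only counts invocations.
  void Flush() { ++flush_count_; }

  DataState state_;
  int flush_count_ = 0;
};
```

In this model the first Tombstone() call performs two flushes, and any 
repeated call performs none, so repeated delete requests from the master would 
no longer cost two fsyncs each.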



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)