Adar Dembo created KUDU-2114:
--------------------------------
Summary: Master asks tservers to delete tombstoned tablets forever
Key: KUDU-2114
URL: https://issues.apache.org/jira/browse/KUDU-2114
Project: Kudu
Issue Type: Bug
Components: consensus, master, tserver
Affects Versions: 1.5.0
Reporter: Adar Dembo
Assignee: Mike Percy
Priority: Blocker
Commit 5bca7d8 changed the behavior of tombstoned replicas such that they now
retain RaftConsensus instances despite being in the TOMBSTONED state. This
means that some additional consensus-related state is included in their tablet
report entries when a full tablet report is sent to the master. The master
evaluates this consensus-related state when considering whether an evicted
replica should be deleted, but it does not consider the TOMBSTONED state. As a
result, the master notices that these tombstoned replicas have been evicted
and asks the hosting tserver to delete them. Over, and over, and over.
This needs to be fixed, whether by excluding tombstone consensus state from
tablet reports, or by changing the master to consider the tablet's overall
state when deciding whether to delete it.
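The second option could look roughly like the sketch below. It is not Kudu's actual master code: the names TabletDataState, ReportedReplica, and should_request_delete are hypothetical, and the report entry is reduced to the two fields needed to show the check. The only point is that the tombstone check comes before the eviction check.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TabletDataState(Enum):
    # Hypothetical subset of replica data states, for illustration only.
    READY = auto()
    TOMBSTONED = auto()


@dataclass
class ReportedReplica:
    # Hypothetical, simplified view of one tablet report entry.
    tserver_uuid: str
    data_state: TabletDataState


def should_request_delete(replica, committed_config_uuids):
    """Return True if the master should ask the tserver to delete the replica."""
    # A tombstoned replica has already been deleted; asking again is redundant.
    if replica.data_state is TabletDataState.TOMBSTONED:
        return False
    # Treat a replica as evicted when its tserver is not in the committed config.
    return replica.tserver_uuid not in committed_config_uuids
```

With this ordering, an evicted replica that is still live triggers one delete request, and once it reports itself as TOMBSTONED the master stops asking.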
When this was observed on a live cluster, the tablet deletion requests also
turned out to be rather expensive. It appears that a DeleteTablet RPC on
a tombstone is not a no-op; it always flushes the superblock twice, which
generates two fsyncs. This should also be addressed.
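One way to make a repeated delete cheap is to return early when the replica is already tombstoned. The sketch below is illustrative only: ReplicaMetadata and delete_tablet are hypothetical names, the flush counter stands in for real superblock fsyncs, and it assumes a repeated tombstone request has no new state to persist, which may not hold for every delete mode in the real tserver.

```python
class ReplicaMetadata:
    # Hypothetical stand-in for a replica's on-disk metadata (the superblock).
    def __init__(self, tombstoned=False):
        self.tombstoned = tombstoned
        self.flush_count = 0  # counts simulated superblock flushes (fsyncs)

    def flush(self):
        self.flush_count += 1


def delete_tablet(meta):
    """Tombstone the replica; return True if any work was done."""
    if meta.tombstoned:
        # Already tombstoned: nothing to change, so skip the flush entirely.
        return False
    meta.tombstoned = True
    meta.flush()
    return True
```

The first call does the work and flushes once; any later call on the same replica returns without touching disk.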