Song Jiacheng created KUDU-3486:
-----------------------------------
Summary: Too many tombstone tablet may lead to high memory usage.
Key: KUDU-3486
URL: https://issues.apache.org/jira/browse/KUDU-3486
Project: Kudu
Issue Type: Bug
Components: tserver
Affects Versions: 1.14.0
Reporter: Song Jiacheng
Attachments: TServer_Delete_tombstone_tablet_periodically.patch,
image-2023-07-06-15-59-44-181.png
There are two kinds of tablet replica deletion: tombstone and delete. A
tombstone tablet replica might never be deleted, since the delete-type deletion
can only occur when the tablet is deleted, and the requests are sent only to
the Voters, which do not include the Tombstone ones.
Here is an example:
Tablet T:
  replica A
  replica B
  replica C
After rebalance:
  replica A
  replica B
  replica C (Tombstone)
  replica D
!image-2023-07-06-15-59-44-181.png!
When tablet T is deleted, A, B and D are deleted, and C exists forever.
In the screenshot above, the tablet had already been deleted at 3:00 am on
13 June, but the tombstone replica still exists.
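
The example above can be reproduced with a small toy model. This is only a sketch of the behavior described in this report, not Kudu code: the names Tablet, move_replica and delete_tablet are hypothetical, and the only rule modeled is that delete requests go to Voters and skip Tombstone replicas.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReplicaState(Enum):
    VOTER = "voter"
    TOMBSTONED = "tombstoned"


@dataclass
class Tablet:
    tablet_id: str
    # Maps replica name -> current state of that replica.
    replicas: dict = field(default_factory=dict)


def move_replica(tablet, src, dst):
    """Rebalance: add a voter replica dst and tombstone the replica src."""
    tablet.replicas[dst] = ReplicaState.VOTER
    tablet.replicas[src] = ReplicaState.TOMBSTONED


def delete_tablet(tablet):
    """Delete the tablet by sending delete requests to voters only.

    Returns the names of the replicas that still exist afterwards.
    """
    for name, state in list(tablet.replicas.items()):
        if state is ReplicaState.VOTER:
            del tablet.replicas[name]
    return sorted(tablet.replicas)


tablet_t = Tablet(
    "T",
    {
        "A": ReplicaState.VOTER,
        "B": ReplicaState.VOTER,
        "C": ReplicaState.VOTER,
    },
)
move_replica(tablet_t, src="C", dst="D")  # C becomes a tombstone
leftover = delete_tablet(tablet_t)        # only A, B and D receive the request
print(leftover)
```

In this model the tombstoned replica C is the only one left after the tablet is deleted, which matches the situation in the screenshot.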
The data of a tombstone replica is deleted, but its metadata stays in memory;
the largest part of it, SchemaPB, occupies a lot of memory.
In some of our clusters, the number of tombstone replicas on each tserver can
reach 50k ~ 100k, which takes about 10G of memory.
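
As a rough back-of-the-envelope check of these numbers, the snippet below divides the reported total by the reported replica counts. It assumes that "10G" means 10 GB (10^9 bytes per GB) and that all of it is spent on tombstone replica metadata; neither assumption has been measured separately.

```python
def per_replica_kb(total_bytes, replica_count):
    """Average memory per tombstone replica, in KB (1 KB = 1000 bytes)."""
    return total_bytes / replica_count / 1000


TOTAL_BYTES = 10 * 10**9  # "about 10G", read here as 10 GB (assumption)

for count in (50_000, 100_000):
    kb = per_replica_kb(TOTAL_BYTES, count)
    print(f"{count} tombstone replicas -> about {kb:.0f} KB each")
```

Under these assumptions each tombstone replica holds on the order of 100 to 200 KB of metadata.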
My temporary solution is to create a thread in the tserver that deletes
tombstone replicas that have lived too long. I think the ideal solution is for
the master to try to delete all the replicas, including tombstone ones, when
the tablet is deleted.
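
The idea behind the temporary solution can be sketched as follows. This is a minimal illustration in Python, not the attached patch and not Kudu's API: the class TombstoneReaper, its methods, and the max_age_secs and interval_secs parameters are all hypothetical, and "deleting" a replica is modeled as dropping its entry from an in-memory map.

```python
import threading
import time


class TombstoneReaper:
    """Hypothetical periodic cleaner for tombstone replicas (not Kudu code)."""

    def __init__(self, max_age_secs, interval_secs, clock=time.monotonic):
        self._max_age_secs = max_age_secs
        self._interval_secs = interval_secs
        self._clock = clock
        self._lock = threading.Lock()
        self._tombstoned_at = {}  # tablet id -> time it was tombstoned
        self._stop = threading.Event()
        self._thread = None

    def mark_tombstoned(self, tablet_id):
        """Record when a replica was tombstoned (first call wins)."""
        with self._lock:
            self._tombstoned_at.setdefault(tablet_id, self._clock())

    def reap_once(self):
        """Drop every tombstone older than max_age_secs; return their ids."""
        now = self._clock()
        with self._lock:
            expired = [
                tablet_id
                for tablet_id, ts in self._tombstoned_at.items()
                if now - ts >= self._max_age_secs
            ]
            for tablet_id in expired:
                del self._tombstoned_at[tablet_id]
        return sorted(expired)

    def remaining(self):
        """Ids of the tombstone replicas that are still tracked."""
        with self._lock:
            return sorted(self._tombstoned_at)

    def start(self):
        """Run reap_once() every interval_secs in a background thread."""
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval_secs):
            self.reap_once()
```

A usage example with a fake clock, so the behavior can be checked without waiting:

```python
now = [0.0]
reaper = TombstoneReaper(max_age_secs=100, interval_secs=1, clock=lambda: now[0])
reaper.mark_tombstoned("old")
now[0] = 60.0
reaper.mark_tombstoned("new")
now[0] = 120.0
print(reaper.reap_once())   # only "old" has exceeded max_age_secs
print(reaper.remaining())
```

An age threshold like this is only a heuristic, since the tserver cannot tell from age alone whether the tablet itself has been deleted. That is why having the master delete tombstone replicas explicitly looks like the better long-term fix.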
--
This message was sent by Atlassian Jira
(v8.20.10#820010)