[ 
https://issues.apache.org/jira/browse/KUDU-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Song Jiacheng updated KUDU-3486:
--------------------------------
    Description: 
There are two kinds of tablet replica deletion: tombstone and delete. A 
tombstoned tablet replica might never be deleted, since delete-type deletion 
can only occur when the tablet is deleted, and the requests are sent only to 
the voters, not to the tombstoned replicas. 
Here is an example:
Tablet T:
replica A
replica B
replica C
After rebalance:
replica A
replica B
replica C(Tombstone)
replica D
When tablet T is deleted, A, B and D are deleted, and C exists forever.
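
The sequence above can be replayed as a small simulation. This is only an 
illustrative sketch, not Kudu code: the function name, the state strings and 
the idea of modelling the voter list as a plain Python list are all 
assumptions made for the example.

```python
def simulate():
    """Replay the example: rebalance C -> D, then delete tablet T."""
    # Servers the master currently treats as voters for tablet T (assumed model).
    voters = ["A", "B", "C"]
    # State of the replica of T held by each tablet server.
    replica_state = {server: "LIVE" for server in voters}

    # Rebalance: the replica moves from C to D. C keeps a tombstone.
    voters.remove("C")
    replica_state["C"] = "TOMBSTONED"
    voters.append("D")
    replica_state["D"] = "LIVE"

    # Tablet deletion: delete requests go to the current voters only.
    for server in voters:
        replica_state[server] = "DELETED"

    return replica_state


if __name__ == "__main__":
    for server, state in sorted(simulate().items()):
        print(server, state)
```

In this model C is never contacted after the rebalance, so its tombstone 
outlives the tablet.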
As the picture below shows, the tablet had already been deleted at 3:00 am on 
13th Jun, but the tombstoned replica still exists.
!image-2023-07-06-15-59-44-181.png|width=568,height=261! 
The data of a tombstoned replica is deleted, but its metadata is kept in 
memory, and the largest part of it, SchemaPB, occupies a lot of memory.
In some of our clusters, the number of tombstoned replicas on each tserver can 
reach 50k ~ 100k, which takes about 10G.
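
For a rough sense of scale, the figures above can be turned into a 
per-replica estimate. This is a back-of-the-envelope sketch that assumes "10G" 
means 10 GiB of memory and that the same total applies at both ends of the 
50k ~ 100k range; neither assumption is stated in the report.

```python
GIB = 1024 ** 3  # assumption: "10G" is read as 10 GiB


def per_replica_kib(total_bytes, replica_count):
    """Average memory per tombstoned replica, in KiB."""
    return total_bytes / replica_count / 1024


if __name__ == "__main__":
    total = 10 * GIB
    for count in (50_000, 100_000):
        print(f"{count} replicas: about {per_replica_kib(total, count):.0f} KiB each")
```

Under those assumptions each tombstoned replica holds on the order of 100 to 
200 KiB of metadata.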

Adding a vector to each tablet to store the history of tablet servers that 
used to hold a replica of the tablet would take too many resources. So I think 
a periodic heartbeat might be a good way to solve the problem.
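
One possible reading of the heartbeat idea is sketched below: the tablet 
server periodically reports its tombstoned tablet IDs, and any ID that the 
master no longer knows about is dropped. The report does not spell out the 
mechanism, so everything here (the TabletServer class, the heartbeat method, 
the set of live tablet IDs) is hypothetical and is not Kudu's actual API.

```python
def tombstones_to_delete(reported_tombstones, live_tablet_ids):
    """Return the reported tablet IDs whose tablet no longer exists."""
    return sorted(t for t in set(reported_tombstones) if t not in live_tablet_ids)


class TabletServer:
    """Hypothetical tablet server that tracks only its tombstoned replicas."""

    def __init__(self, tombstones):
        self.tombstones = set(tombstones)

    def heartbeat(self, master_live_tablets):
        # The master's reply is modelled as the set of tablet IDs it still
        # knows about; tombstones for any other tablet are removed.
        doomed = tombstones_to_delete(self.tombstones, master_live_tablets)
        self.tombstones.difference_update(doomed)
        return doomed


if __name__ == "__main__":
    ts = TabletServer({"T", "U"})
    # Tablet T has been deleted; tablet U still exists on the master.
    print(ts.heartbeat({"U"}), ts.tombstones)
```

The point of this shape is that the master needs no per-tablet history of 
former replica holders, because the tablet server volunteers that information 
itself.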

  was:
There are two kinds of tablet replica deletion: tombstone and delete. A 
tombstoned tablet replica might never be deleted, since delete-type deletion 
can only occur when the tablet is deleted, and the requests are sent only to 
the voters, not to the tombstoned replicas. 
Here is an example:
Tablet T:
replica A
replica B
replica C
After rebalance:
replica A
replica B
replica C(Tombstone)
replica D
When tablet T is deleted, A, B and D are deleted, and C exists forever.
As the picture below shows, the tablet had already been deleted at 3:00 am on 
13th Jun, but the tombstoned replica still exists.
!image-2023-07-06-15-59-44-181.png|width=568,height=261! 
The data of a tombstoned replica is deleted, but its metadata is kept in 
memory, and the largest part of it, SchemaPB, occupies a lot of memory.
In some of our clusters, the number of tombstoned replicas on each tserver can 
reach 50k ~ 100k, which takes about 10G.
My temporary solution is to create a thread that deletes tombstoned replicas 
that have lived too long on the tserver. I think the ideal solution is for the 
master to try to delete all the replicas, including tombstoned ones, when the 
tablet is deleted.

Posted my temporary solution patch to make it clearer.


> Tserver: Too many tombstone tablet may lead to high memory usage.
> -----------------------------------------------------------------
>
>                 Key: KUDU-3486
>                 URL: https://issues.apache.org/jira/browse/KUDU-3486
>             Project: Kudu
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.14.0
>            Reporter: Song Jiacheng
>            Priority: Minor
>         Attachments: image-2023-07-06-15-59-44-181.png
>
>
> There are two kinds of tablet replica deletion: tombstone and delete. A 
> tombstoned tablet replica might never be deleted, since delete-type 
> deletion can only occur when the tablet is deleted, and the requests are 
> sent only to the voters, not to the tombstoned replicas. 
> Here is an example:
> Tablet T:
> replica A
> replica B
> replica C
> After rebalance:
> replica A
> replica B
> replica C(Tombstone)
> replica D
> When tablet T is deleted, A, B and D are deleted, and C exists forever.
> As the picture below shows, the tablet had already been deleted at 3:00 am 
> on 13th Jun, but the tombstoned replica still exists.
> !image-2023-07-06-15-59-44-181.png|width=568,height=261! 
> The data of a tombstoned replica is deleted, but its metadata is kept in 
> memory, and the largest part of it, SchemaPB, occupies a lot of memory.
> In some of our clusters, the number of tombstoned replicas on each tserver 
> can reach 50k ~ 100k, which takes about 10G.
> Adding a vector to each tablet to store the history of tablet servers that 
> used to hold a replica of the tablet would take too many resources. So I 
> think a periodic heartbeat might be a good way to solve the problem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
