[
https://issues.apache.org/jira/browse/KUDU-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-3531:
--------------------------------
Description:
I came across a case where a tablet server had only about 2K live tablet
replicas, yet it had about 24K files open in its WAL and data directories. The
issue stems from the fact that a tombstoned tablet replica's files are opened
by the FS manager the same way as a live replica's, and they are kept open
even though they are never going to change. It would be prudent to avoid
keeping tombstoned tablet replicas' files open, if possible: maybe just read
the required information (last voted term and opId index?), keep it in runtime
structures, and close the corresponding files right after bootstrapping?
Otherwise, this doesn't seem to scale well.
was:
I came across a case where a tablet server has just about 2K tablet replicas,
but it opened about 24K files in its WAL and data directories. The issue stems
from the fact that tobmstoned tablet replica's files are opened by the FS
manager as well, and those are kept open even if they are never about to change
or receive any Raft updates. It would be prudent to avoid keeping tombstoned
tablet replicas' files open, if possible: maybe, just read the require
information (last voted term and opId index?) and keep it in runtime
structures, but close corresponding files right after bootstrapping?
Otherwise, this doesn't seem to scale well.
> Limit the amount of resources used by tombstoned tablet replicas
> ----------------------------------------------------------------
>
> Key: KUDU-3531
> URL: https://issues.apache.org/jira/browse/KUDU-3531
> Project: Kudu
> Issue Type: Improvement
> Reporter: Alexey Serbin
> Priority: Major
>
> I came across a case where a tablet server had only about 2K live tablet
> replicas, yet it had about 24K files open in its WAL and data directories.
> The issue stems from the fact that a tombstoned tablet replica's files are
> opened by the FS manager the same way as a live replica's, and they are kept
> open even though they are never going to change. It would be prudent to avoid
> keeping tombstoned tablet replicas' files open, if possible: maybe just read
> the required information (last voted term and opId index?), keep it in
> runtime structures, and close the corresponding files right after
> bootstrapping? Otherwise, this doesn't seem to scale well.
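
For illustration only, the "read once, keep in memory, close the file" idea
proposed in the description could look roughly like the sketch below. It is
not Kudu code and does not model Kudu's on-disk format or its FS manager API:
the JSON metadata file, its field names, and the names TombstonedReplicaInfo,
load_tombstoned_info, and bootstrap_tombstoned are all assumptions made up for
this example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TombstonedReplicaInfo:
    """Hypothetical in-memory summary kept for one tombstoned replica."""
    tablet_id: str
    last_voted_term: int
    last_op_index: int


def load_tombstoned_info(meta_path: str) -> TombstonedReplicaInfo:
    """Read the needed fields and release the file descriptor right away."""
    # Assumed format: a small JSON file per replica (not Kudu's real format).
    # The `with` block closes the file as soon as the fields are parsed.
    with open(meta_path, "r", encoding="utf-8") as f:
        raw = json.load(f)
    return TombstonedReplicaInfo(
        tablet_id=raw["tablet_id"],
        last_voted_term=int(raw["last_voted_term"]),
        last_op_index=int(raw["last_op_index"]),
    )


def bootstrap_tombstoned(meta_paths):
    """Build {tablet_id: info}, holding at most one file open at a time."""
    registry = {}
    for path in meta_paths:
        info = load_tombstoned_info(path)
        registry[info.tablet_id] = info
    return registry
```

With this shape, the number of open descriptors no longer grows with the
number of tombstoned replicas; only the small in-memory records do. A replica
whose state has to change again would need its files reopened on demand.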
--
This message was sent by Atlassian Jira
(v8.20.10#820010)