Alexey Serbin created KUDU-3427:
-----------------------------------
Summary: Add a metric for delta of snapshot vs current timestamp
for scan-at-snapshot scans
Key: KUDU-3427
URL: https://issues.apache.org/jira/browse/KUDU-3427
Project: Kudu
Issue Type: Improvement
Components: tserver
Reporter: Alexey Serbin
Currently, {{\-\-tablet_history_max_age_sec}} is set to a rather arbitrary
value of 60 * 60 * 24 * 7 seconds (7 days). Keeping data in UNDO deltas for
longer than necessary means spending IO throughput, CPU cycles, and memory in
various background maintenance jobs to process data which is no longer needed.
However, as of Kudu 1.16.0, there isn't a simple way to tell whether the
current setting of {{\-\-tablet_history_max_age_sec}} is appropriate for the
workload running on a Kudu cluster. An operator interested in optimizing the
amount of tablet history stored has no visibility into what the optimal value
for {{\-\-tablet_history_max_age_sec}} might be, given the workloads run
against the cluster.
It would be great to add a per-tablet metric (a histogram?) that accumulates
stats on the difference between the current timestamp and the snapshot
timestamp used by scan operations in READ_AT_SNAPSHOT and READ_YOUR_WRITES
modes.
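
To illustrate the idea, here is a rough sketch of such a histogram. It is not
Kudu code: the class name, the method names, and the bucket boundaries are all
hypothetical, and plain wall-clock seconds stand in for Kudu timestamps. It
only shows how recording "current time minus snapshot time" per scan could let
an operator see how much history the scans actually reach back into.

```python
import bisect
from dataclasses import dataclass, field

# Hypothetical bucket upper bounds, in seconds: 1 s, 1 min, 1 h, 1 day, 7 days.
# Ages above the last bound land in a final overflow bucket.
BUCKET_BOUNDS_SEC = [1, 60, 3600, 86400, 7 * 86400]


@dataclass
class SnapshotAgeHistogram:
    """Counts scans by how far behind 'now' their snapshot timestamp was."""

    counts: list = field(
        default_factory=lambda: [0] * (len(BUCKET_BOUNDS_SEC) + 1))
    total: int = 0

    def record(self, now_sec: float, snapshot_sec: float) -> None:
        # A snapshot timestamp ahead of 'now' is clamped to an age of zero.
        age = max(0.0, now_sec - snapshot_sec)
        # Index of the first bucket whose upper bound is >= age.
        idx = bisect.bisect_left(BUCKET_BOUNDS_SEC, age)
        self.counts[idx] += 1
        self.total += 1

    def percentile_upper_bound(self, pct: float):
        """Upper bound (seconds) of the bucket holding the given percentile.

        Returns None if nothing was recorded or if the percentile falls into
        the overflow bucket, which has no upper bound.
        """
        if self.total == 0:
            return None
        threshold = pct / 100.0 * self.total
        running = 0
        for idx, count in enumerate(self.counts):
            running += count
            if running >= threshold:
                if idx < len(BUCKET_BOUNDS_SEC):
                    return BUCKET_BOUNDS_SEC[idx]
                return None
        return None


# Example: five scans whose snapshots were 0.5 s, 30 s, 30 s, 30 min, and 2 h
# behind the current time.
hist = SnapshotAgeHistogram()
now = 1_000_000.0
for age in (0.5, 30, 30, 1800, 7200):
    hist.record(now, now - age)
```

With data like this, every recorded scan falls within the one-day bucket, which
would hint that a 7-day history is far more than this particular workload
needs.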