Hello Kudu Jenkins,
I'd like you to reexamine a change. Please look at the new patch set (#3).
Change subject: WIP: KUDU-1567. Decouple hard-minimum WAL segment retention from target
WIP: KUDU-1567. Decouple hard-minimum WAL segment retention from target
This changes the behavior around the "minimum log segments to retain".
Previously, the maintenance manager considered it high priority to flush
any in-memory store which was retaining more than this number of log
segments. With the default log_min_segments_to_retain=2, this caused the
maintenance manager to trigger very small flushes (128MB) regardless of
the size of flush_threshold_mb. The end result here was high write
amplification.
Testing with -log_min_segments_to_retain=50 indicated that write
performance could be improved about 2x and write amplification reduced
by about 1.7x by removing this aggressive flush behavior.
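For illustration, the old heuristic described above can be sketched roughly
as follows. This is a simplified model, not the actual maintenance manager
code: the struct and function names are hypothetical, and the 64MB segment
size is an assumption chosen because it matches the 128MB figure with 2
segments.

```cpp
#include <cstdint>

// Hypothetical model of the old heuristic: an in-memory store becomes a
// high-priority flush candidate as soon as it retains more WAL segments
// than log_min_segments_to_retain.
struct OldPolicy {
  int log_min_segments_to_retain = 2;  // default cited in the commit message
  int64_t segment_size_mb = 64;        // assumption: 2 * 64MB = 128MB
};

// True once the store retains more segments than the configured minimum.
bool OldShouldPrioritizeFlush(const OldPolicy& p, int segments_retained) {
  return segments_retained > p.log_min_segments_to_retain;
}

// Approximate amount of log data that accumulates before a flush is forced,
// independent of flush_threshold_mb.
int64_t OldFlushTriggerMb(const OldPolicy& p) {
  return p.log_min_segments_to_retain * p.segment_size_mb;
}
```

Under these assumptions, raising log_min_segments_to_retain to 50 moves the
trigger to 3200MB, but the same setting is also the retention floor.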
However, setting the 'min segments to retain' also had the unfortunate
side effect of _always_ retaining 50 segments, regardless of whether
those were actually necessary for durability purposes. In a long-running
cluster, most tablets are not being actively loaded at such a high
rate, and retaining 50 segments would mean unnecessary disk usage as
well as longer startup times in the absence of a solution to KUDU-38.
Thus, this patch takes the approach of decoupling the two ideas into two
separate flags:
1) the original 'log_min_segments_to_retain', which can be left very
low, and now is really only useful for things like post-mortem
debugging. A future commit could change this to 1 or possibly even 0.
2) a new 'maintenance_manager_target_log_replay_size_mb' flag, which
indicates the amount of retained log data at which point the MM
should schedule flushes of in-memory stores.
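A minimal sketch of how the two decoupled settings might be consulted
independently. The names RetentionConfig, ShouldScheduleFlush and
SegmentsEligibleForGc are hypothetical and do not come from the patch. The
1024MB default is taken from the "1GB" figure in this message, and the >=
comparison is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical container for the two decoupled settings.
struct RetentionConfig {
  int log_min_segments_to_retain = 2;        // hard minimum kept after GC
  int64_t target_log_replay_size_mb = 1024;  // flush target ("1GB" above)
};

// Flush scheduling looks only at the amount of retained log data.
bool ShouldScheduleFlush(const RetentionConfig& c, int64_t retained_log_mb) {
  return retained_log_mb >= c.target_log_replay_size_mb;
}

// GC looks only at the hard minimum and at what durability still requires.
int SegmentsEligibleForGc(const RetentionConfig& c, int total_segments,
                          int segments_needed_for_durability) {
  const int keep =
      std::max(c.log_min_segments_to_retain, segments_needed_for_durability);
  return std::max(0, total_segments - keep);
}
```

The point of the split is that raising the flush target no longer raises the
number of segments kept once they are not needed for durability.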
With the new defaults, we should have the following behavior:
- an MRS can fill up until the logs reach 1GB. At that point, the MM
will begin flushing.
- after a flush, the logs will be GCed down to 2 segments.
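The expected behavior above can be walked through with a toy simulation.
This is only a sketch under stated assumptions: segments are 64MB each, a
flush makes every segment beyond the minimum unnecessary for durability, and
flush and GC happen instantly. SimResult and SimulateWrites are hypothetical
names.

```cpp
#include <cstdint>

// Outcome of the toy simulation.
struct SimResult {
  int flushes = 0;         // number of flushes triggered
  int segments_after = 0;  // segments still retained at the end
};

// Writes `segments_written` full segments one at a time. Whenever the
// retained log size reaches `target_mb`, counts one flush and GCs the log
// down to `min_segments`.
SimResult SimulateWrites(int segments_written, int64_t segment_mb,
                         int64_t target_mb, int min_segments) {
  SimResult r;
  int retained = 0;
  for (int i = 0; i < segments_written; ++i) {
    ++retained;
    if (retained * segment_mb >= target_mb) {
      ++r.flushes;
      if (retained > min_segments) {
        retained = min_segments;
      }
    }
  }
  r.segments_after = retained;
  return r;
}
```

With 64MB segments, a 1024MB target and a minimum of 2, writing 16 segments
triggers one flush and leaves 2 segments retained.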
WIP for a few reasons:
- should have some more tests for the new behavior
- probably needs some code cleanup
- should be cluster-tested
- maybe the flag should be renamed to have the 'log' prefix despite
being named in the maintenance_manager_* module.
- should we drop log_min_segments_to_retain to 0?
- should test interaction if log_max_segments_to_retain is smaller than
the size configured by ..._replay_size_mb.
12 files changed, 143 insertions(+), 110 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/4470/3
To view, visit http://gerrit.cloudera.org:8080/4470
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>