[kudu-CR] WIP: KUDU-1567. Decouple hard-minimum WAL segment retention from target

Todd Lipcon (Code Review) Mon, 19 Sep 2016 20:35:27 -0700

Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit


    http://gerrit.cloudera.org:8080/4470

to look at the new patch set (#3).

Change subject: WIP: KUDU-1567. Decouple hard-minimum WAL segment retention 
from target
......................................................................

WIP: KUDU-1567. Decouple hard-minimum WAL segment retention from target

This changes the behavior around the "minimum log segments to retain".
Previously, the maintenance manager considered it high priority to flush
any in-memory store that was retaining more than this number of log
segments. With the default log_min_segments_to_retain=2, this caused the
maintenance manager to trigger very small flushes (128MB) regardless of
the value of flush_threshold_mb. The end result was high write
amplification.

Testing with -log_min_segments_to_retain=50 indicated that write
performance could be improved about 2x and write amplification reduced
by about 1.7x by removing this aggressive flush behavior.

However, raising 'min segments to retain' to 50 also had the unfortunate
side effect of _always_ retaining 50 segments, regardless of whether
those were actually necessary for durability purposes. In a long-running
cluster, most tablets are not being actively written to at such a high
rate, and retaining 50 segments would mean unnecessary disk usage as
well as longer startup times in the absence of a solution to KUDU-38.

Thus, this patch takes the approach of decoupling the two ideas into two
separate configurations:

1) the original 'log_min_segments_to_retain', which can be left very
   low, and is now really only useful for things like post-mortem
   debugging. A future commit could change this to 1 or possibly even 0.

2) a new 'maintenance_manager_target_log_replay_size_mb' flag, which
   indicates the amount of retained log data at which the MM should
   begin scheduling flushes of in-memory stores.
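
As a rough illustration of the decoupling, the two settings can be
thought of as driving two independent decisions. The sketch below is not
the code in this patch: RetentionConfig, SegmentsEligibleForGC and
ShouldScheduleFlush are hypothetical names, the 1024MB default is taken
from the "1GB" figure in this message, and the use of >= at the
threshold is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical container for the two settings; the field names mirror the
// flag names in this message, but this is not Kudu's actual code.
struct RetentionConfig {
  int log_min_segments_to_retain = 2;        // hard minimum kept after GC
  int64_t target_log_replay_size_mb = 1024;  // soft target that drives flushes
};

// Number of oldest segments that may be deleted: everything that is not
// needed for durability, without ever dropping below the hard minimum.
inline int SegmentsEligibleForGC(const RetentionConfig& cfg,
                                 int total_segments,
                                 int segments_needed_for_durability) {
  assert(total_segments >= 0 && segments_needed_for_durability >= 0);
  int must_keep = std::max(cfg.log_min_segments_to_retain,
                           segments_needed_for_durability);
  return std::max(0, total_segments - must_keep);
}

// Whether the retained log data has reached the flush target. This check
// never looks at the segment-count minimum.
inline bool ShouldScheduleFlush(const RetentionConfig& cfg,
                                int64_t retained_log_mb) {
  return retained_log_mb >= cfg.target_log_replay_size_mb;
}
```

The point of the split is that lowering log_min_segments_to_retain no
longer changes when flushes are scheduled, and raising the replay size
target no longer forces idle tablets to keep extra segments.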

With the new defaults, we should have the following behavior:
- an MRS can fill up until the logs reach 1GB. At that point, the MM
  will begin flushing.
- after a flush, the logs will be GCed down to 2 segments.
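
The arithmetic behind this cycle can be sketched as a small simulation.
It assumes a 64MB segment size (inferred from "2 segments" corresponding
to "128MB" above, not stated in this message) and that nothing anchors
the log once the flush completes. SimResult and SimulateOneFlushCycle
are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

struct SimResult {
  int segments_at_flush;         // segments retained when the flush triggers
  int64_t retained_mb_at_flush;  // log data retained at that moment
  int segments_after_gc;         // segments left once the flush allows GC
};

// Rolls whole segments until the retained log reaches the target, then
// models a flush followed by GC down to the hard minimum.
inline SimResult SimulateOneFlushCycle(int64_t segment_size_mb,
                                       int64_t target_replay_mb,
                                       int min_segments_to_retain) {
  assert(segment_size_mb > 0 && target_replay_mb > 0);
  assert(min_segments_to_retain >= 0);
  int segments = 0;
  int64_t retained_mb = 0;
  while (retained_mb < target_replay_mb) {
    ++segments;
    retained_mb += segment_size_mb;
  }
  // Assumption: after the flush, no in-memory store anchors old segments,
  // so only the hard minimum is kept.
  int after_gc =
      segments < min_segments_to_retain ? segments : min_segments_to_retain;
  return SimResult{segments, retained_mb, after_gc};
}
```

Under those assumptions, SimulateOneFlushCycle(64, 1024, 2) triggers the
flush at 16 segments (1024MB) and leaves 2 segments after GC. Passing a
128MB target instead models the old behavior, where the flush triggered
after only 2 segments.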

WIP for a few reasons:
- should have some more tests for the new behavior
- probably needs some code cleanup
- should be cluster-tested
- maybe the flag should be renamed to have the 'log' prefix despite
  being defined in the maintenance manager module.
- should we drop log_min_segments_to_retain to 0?
- should test the interaction when log_max_segments_to_retain allows
  less log data than the size configured by ..._replay_size_mb.

Change-Id: I31400e2200f9ce3eeb63f4bc948bc630e8c1115f
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_peer.cc
M src/kudu/tablet/tablet_peer.h
M src/kudu/tablet/tablet_peer_mm_ops.cc
M src/kudu/util/maintenance_manager-test.cc
M src/kudu/util/maintenance_manager.cc
12 files changed, 143 insertions(+), 110 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/4470/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I31400e2200f9ce3eeb63f4bc948bc630e8c1115f
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
