[
https://issues.apache.org/jira/browse/KUDU-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427668#comment-15427668
]
Todd Lipcon commented on KUDU-1567:
-----------------------------------
Increasing the log segment retention to 20 instead of the default of 2
increased the size of flushes substantially (thus requiring less compaction)
and also increased the frequency of compactions running (thus reducing the
blooms-per-op statistic).
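For reference, a minimal sketch of the change tested above. The exact flag name is an assumption here (Kudu's WAL retention is controlled by gflags whose names and defaults have varied across versions; check the flag reference for the version in use):

```shell
# Assumed flag name: raise WAL segment retention from the default of 2 to 20
# when starting the tablet server.
kudu-tserver \
  --log_max_segments_to_retain=20 \
  ... # other tserver flags
```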
The downside of course is that startup time will be longer. However, the most
common case where someone cares about startup time is for rolling restart. We
could provide a "clean shutdown" mode which (optionally) stops accepting
writes, flushes all the memory stores, and then shuts down. This, combined with
a fix for KUDU-38, would allow a planned restart to proceed quickly since there
would be little in the logs left to replay. Then, unplanned restarts would be
comparatively rare and the occasional 5+ minute replay time would be no big
deal.
A super-cheap implementation of the above would be to just use a gflag which
tells the maintenance manager to prioritize flushes above all else. Before a
planned shutdown, we can runtime-set the gflag to 'true', wait until the
in-memory stores are all flushed, then do a normal kill. But, we'd still need
KUDU-38 fixed.
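The planned-shutdown procedure might look something like the following. The flag name is hypothetical (it is the gflag being proposed, not one that exists), and the runtime-set mechanism assumes a CLI that can set flags on a running tserver:

```shell
# Hypothetical flag: tell the maintenance manager to prioritize flushes.
# Runtime-set it on the running tserver (address is a placeholder):
kudu tserver set_flag <tserver-address> prioritize_flushes true

# Wait until the in-memory stores have all been flushed (e.g. by polling
# the tserver's metrics endpoint), then perform a normal kill:
kill <tserver-pid>
```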
> Short default for log retention increases write amplification
> -------------------------------------------------------------
>
> Key: KUDU-1567
> URL: https://issues.apache.org/jira/browse/KUDU-1567
> Project: Kudu
> Issue Type: Improvement
> Components: perf, tserver
> Affects Versions: 0.10.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> Currently the maintenance manager prioritizes flushes over compactions if the
> flush operations are retaining WAL segments. The goal here is to prevent the
> amount of in-memory data from getting so large that restarts would be
> incredibly slow. However, it has a somewhat unintuitive negative effect on
> performance:
> - with the default of retaining just two segments, flushes become highly
> prioritized when the MRS only has ~128MB of data, regardless of the
> "flush_threshold_mb" configuration
> - this creates lots of overlapping rowsets in the case of random-write
> applications
> - because flushes are prioritized over compactions, compactions rarely run
> - the frequent flushes, combined with low priority of compactions, means that
> after a few days of constant inserts, we often end up with average "bloom
> lookups per op" metrics of 50-100, which is quite slow even if the blooms fit
> in cache.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)