[
https://issues.apache.org/jira/browse/KUDU-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15457066#comment-15457066
]
Todd Lipcon commented on KUDU-1567:
-----------------------------------
Another thought: it would be good to change the retention behavior to support
the following:
- on an actively written tablet, don't worry about going up to 10-20 log
segments. If someone restarts in the middle of a heavy write workload, it's
reasonable to expect those tablets to recover slowly.
- when the tablet has flushed due to time reasons and no longer needs all of
those log segments, we should delete them rather than adhering to some
arbitrary "min segments"
In other words, the user configuration should set a target size (a soft upper
bound) for the logs that need to be replayed, not a lower bound that keeps
logs around for no good reason.
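A minimal sketch of that policy (hypothetical names, not Kudu's actual
implementation): segments no longer anchored by un-flushed data are always
eligible for deletion (no "min segments" floor), and the target size only acts
as a soft upper bound feeding flush prioritization.

```python
# Hypothetical sketch of the proposed retention policy. "min_anchored_index"
# stands for the lowest WAL op index still needed to replay un-flushed data;
# segment dicts with "max_index"/"size_bytes" are illustrative, not Kudu's API.

def gc_candidates(segments, min_anchored_index):
    """Segments safe to delete: everything whose highest op index falls
    below the lowest index still anchored by un-flushed data. Note there
    is deliberately no arbitrary 'min segments' floor."""
    return [s for s in segments if s["max_index"] < min_anchored_index]

def flush_urgency(segments, min_anchored_index, target_bytes):
    """Soft upper bound: bytes of WAL that would have to be replayed on
    restart, as a fraction of the target. A value above 1.0 means flushes
    should be prioritized; an actively written tablet may legitimately
    exceed it for a while."""
    retained = sum(s["size_bytes"] for s in segments
                   if s["max_index"] >= min_anchored_index)
    return retained / target_bytes
```

Under this scheme a quiescent tablet that has flushed everything ends up with
an empty retained set, so stale segments are deleted instead of being held to
satisfy a fixed minimum count.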
> Short default for log retention increases write amplification
> -------------------------------------------------------------
>
> Key: KUDU-1567
> URL: https://issues.apache.org/jira/browse/KUDU-1567
> Project: Kudu
> Issue Type: Improvement
> Components: perf, tserver
> Affects Versions: 0.10.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> Currently the maintenance manager prioritizes flushes over compactions if the
> flush operations are retaining WAL segments. The goal here is to prevent the
> amount of in-memory data from getting so large that restarts would be
> incredibly slow. However, it has a somewhat unintuitive negative effect on
> performance:
> - with the default of retaining just two segments, flushes become highly
> prioritized when the MRS only has ~128MB of data, regardless of the
> "flush_threshold_mb" configuration
> - this creates lots of overlapping rowsets in the case of random-write
> applications
> - because flushes are prioritized over compactions, compactions rarely run
> - the frequent flushes, combined with the low priority of compactions, mean
> that after a few days of constant inserts, we often end up with average "bloom
> lookups per op" metrics of 50-100, which is quite slow even if the blooms fit
> in cache.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)