[ https://issues.apache.org/jira/browse/KUDU-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-1567.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0

Fixed in 127438af30356f1afedb862166c907ff754d1c55

> Short default for log retention increases write amplification
> -------------------------------------------------------------
>
>                 Key: KUDU-1567
>                 URL: https://issues.apache.org/jira/browse/KUDU-1567
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf, tserver
>    Affects Versions: 0.10.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 1.1.0
>
>
> Currently, the maintenance manager prioritizes flushes over compactions if the 
> flush operations are retaining WAL segments. The goal is to prevent the 
> amount of in-memory data from growing so large that restarts would be 
> incredibly slow. However, this has a somewhat unintuitive negative effect on 
> performance:
> - With the default of retaining just two segments, flushes become highly 
> prioritized when the MRS has only ~128MB of data, regardless of the 
> "flush_threshold_mb" configuration.
> - This creates lots of overlapping rowsets in the case of random-write 
> applications.
> - Because flushes are prioritized over compactions, compactions rarely run.
> - The frequent flushes, combined with the low priority of compactions, mean that 
> after a few days of constant inserts, we often end up with average "bloom 
> lookups per op" metrics of 50-100, which is quite slow even if the blooms fit 
> in cache.
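
The arithmetic behind the list above can be sketched as a toy model. This is
not Kudu code: the function names are hypothetical, the 64 MB segment size is
only inferred from the "two segments, ~128MB" figure in the description, and
the lookup estimate assumes the worst case where every flushed rowset overlaps
the whole key range and no compaction has run.

```python
# Toy model only. Names and constants are illustrative assumptions, not Kudu APIs.

SEGMENT_MB = 64          # assumed segment size, inferred from "two segments, ~128MB"
SEGMENTS_RETAINED = 2    # the default retention named in the issue


def effective_flush_trigger_mb(flush_threshold_mb: int,
                               segment_mb: int = SEGMENT_MB,
                               segments_retained: int = SEGMENTS_RETAINED) -> int:
    """MRS size at which flushes start to win: whichever limit is reached first."""
    wal_limit_mb = segment_mb * segments_retained
    return min(flush_threshold_mb, wal_limit_mb)


def bloom_lookups_per_op(total_inserted_mb: int, flush_trigger_mb: int) -> int:
    """Worst case for random writes without compaction: each flush adds one
    rowset covering the full key range, so every insert consults one bloom
    filter per flushed rowset."""
    return total_inserted_mb // flush_trigger_mb


# A hypothetical flush_threshold_mb of 1024 is capped by WAL retention.
trigger = effective_flush_trigger_mb(1024)
print(trigger)
# Rowsets accumulated after a hypothetical 8 GB of uncompacted random inserts.
print(bloom_lookups_per_op(8 * 1024, trigger))
```

In this model, raising the number of retained segments raises the effective
trigger, which lowers the number of rowsets produced for the same volume of
inserts. Real workloads will differ, since compactions do run occasionally.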



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
