[jira] [Comment Edited] (KUDU-3180) kudu don't always prefer to flush MRS/DMS that anchor more memory

Andrew Wong (Jira) Fri, 07 Aug 2020 00:23:13 -0700


    [ 
https://issues.apache.org/jira/browse/KUDU-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172948#comment-17172948
 ]


Andrew Wong edited comment on KUDU-3180 at 8/7/20, 7:22 AM:
------------------------------------------------------------

Looking through the code a bit to explain the 0B logs retained, it seems like 
logs retained only accounts for the size of ReadableLogSegments, meaning if a 
WAL segment is still being written to, it will not be accounted for in the 
space retained estimate. See GetReplaySizeMap() in consensus/log.h for more 
details.
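
To illustrate how a 0B estimate can arise, here is a toy model of that 
accounting. It is not Kudu's code: the Segment class, its fields, and 
retained_bytes() are hypothetical names, and the only behavior modeled is 
"sum the sizes of closed (readable) segments and skip the one still being 
written".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a WAL segment (not a Kudu type)."""
    size_bytes: int
    closed: bool  # False while the segment is still being written to


def retained_bytes(segments: List[Segment]) -> int:
    """Sum only closed segments; the active segment is ignored (assumption)."""
    return sum(s.size_bytes for s in segments if s.closed)


# A tablet whose only WAL segment is still open reports 0 bytes retained,
# even though that segment already holds data.
only_active = [Segment(size_bytes=6 * 1024 * 1024, closed=False)]
print(retained_bytes(only_active))
```

Under this model, a mem-store whose anchored entries all live in the active 
segment shows up with nothing retained, regardless of its size or age.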

{quote}It's not always true that older or larger mem-stores anchor more WAL 
bytes as far as I saw on /maintenance-manager page, so maybe we shouldn't 
always use WAL bytes anchored to determine what to flush.{quote}

That's true, but WAL bytes anchored will be somewhat correlated with both the 
size and the age of a mem-store, setting aside the replay size map discrepancy 
described above.

One question about your particular use case though: would tuning the 
{{--memory_pressure_percentage}} gflag help at all? If you reduce it 
significantly, you would guarantee that MRS/DMS flushes are prioritized over 
compactions. Admittedly, the maintenance manager will still use the WAL bytes 
anchored to prioritize ops, but that should still work out to flush larger 
mem-stores in insert-mostly workloads.
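
As a rough sketch of the effect of lowering that threshold, the toy scheduler 
below follows the behavior described in this comment only. It is not the 
maintenance manager's implementation: Op, pick_op(), and all field names are 
hypothetical, and returning None when no op frees memory is an assumption made 
for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical maintenance op (not a Kudu type)."""
    name: str
    ram_anchored: int        # bytes of memory freed by running the op
    wal_bytes_anchored: int  # bytes of WAL released by running the op
    perf_score: float


def pick_op(ops: List[Op], memory_used_pct: float,
            pressure_pct: float) -> Optional[Op]:
    """Pick the next op to run under the simplified rules of this sketch."""
    if memory_used_pct >= pressure_pct:
        # Under memory pressure: only ops that free memory qualify, and
        # they are ranked by WAL bytes anchored.
        flushes = [op for op in ops if op.ram_anchored > 0]
        if not flushes:
            return None  # assumption: nothing can relieve the pressure
        return max(flushes, key=lambda op: op.wal_bytes_anchored)
    # Otherwise rank every op by its perf score.
    return max(ops, key=lambda op: op.perf_score, default=None)


MB = 1024 * 1024
ops = [
    Op("compaction", ram_anchored=0, wal_bytes_anchored=0, perf_score=5.0),
    Op("flush_small_mrs", 1 * MB, 64 * MB, perf_score=0.5),
    Op("flush_big_mrs", 30 * MB, 128 * MB, perf_score=0.2),
]

# Same memory usage, two different thresholds.
print(pick_op(ops, memory_used_pct=50, pressure_pct=60).name)
print(pick_op(ops, memory_used_pct=50, pressure_pct=40).name)
```

With the threshold at 60 the compaction wins on perf score; with it lowered to 
40 the same memory usage counts as pressure, and the flush anchoring the most 
WAL bytes is chosen instead.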


> kudu don't always prefer to flush MRS/DMS that anchor more memory
> -----------------------------------------------------------------
>
>                 Key: KUDU-3180
>                 URL: https://issues.apache.org/jira/browse/KUDU-3180
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: YifanZhang
>            Priority: Major
>         Attachments: image-2020-08-04-20-26-53-749.png, 
> image-2020-08-04-20-28-00-665.png
>
>
> The current time-based flush policy always gives a flush op a high score if 
> we haven't flushed the tablet in a long time, which may lead to starvation 
> of ops that could free more memory.
> We set -flush_threshold_mb=32 and -flush_threshold_secs=1800 in a cluster, 
> and found that some small MRS/DMS flushes have a higher perf score than big 
> MRS/DMS flushes and compactions, which does not seem reasonable.
> !image-2020-08-04-20-26-53-749.png|width=1424,height=317!!image-2020-08-04-20-28-00-665.png|width=1414,height=327!



