[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

Todd Lipcon (JIRA) Fri, 18 Nov 2016 17:04:12 -0800

    [ https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678278#comment-15678278 ]


Todd Lipcon commented on KUDU-1587:
-----------------------------------

I've been pondering this a bit today. Here's a sketch of a possible solution
that shouldn't be difficult to implement:

- Use something like the CoDel (controlled delay) algorithm on the Apply
threadpool. An overview:
-- For each task that comes off the queue, measure its queue time (already done
for the purpose of metrics).
-- If the queue time is above a "target queue time" (e.g. 100ms), then the queue
is in the "overloaded" state. Otherwise it is in the "good" state. Overloaded
implies some kind of standing queue.
-- If overloaded, keep track of how long we have been in the overloaded state.
- When a new operation is about to start, check (before PREPARE) whether the
queue is in the overloaded state. If so, reject the write with some
probability. The probability should be based on (a) how many writes have been
dropped so far since we entered the overloaded state, and (b) how many
operations the write contains.

The hope is that, if the apply queue is overloaded, we'll start shedding load 
more and more aggressively rather than accumulating a longer and longer queue.
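
To make the proposal concrete, here is a rough Python sketch of the state
tracking and the probabilistic rejection described above. It is not Kudu code:
the class name, method names, the base_rate parameter, and above all the
probability formula are assumptions made up for illustration. The comment only
says the probability should depend on the number of drops so far and on the
number of operations in the write; the linear formula below (and the choice
that larger writes are more likely to be rejected) is a placeholder.

```python
import random


class ApplyQueueOverloadTracker:
    """Hypothetical CoDel-like overload tracker for an apply queue."""

    def __init__(self, target_queue_time_s=0.100, base_rate=0.01, rng=None):
        self.target_queue_time_s = target_queue_time_s  # e.g. 100ms
        self.base_rate = base_rate            # assumed tuning knob
        self.overloaded_since = None          # None means "good" state
        self.dropped_since_overload = 0
        self._rng = rng or random.Random()

    def on_task_dequeued(self, queue_time_s, now_s):
        """Call for each task coming off the queue with its queue time."""
        if queue_time_s > self.target_queue_time_s:
            if self.overloaded_since is None:
                # Entering the overloaded state: remember when it started.
                self.overloaded_since = now_s
                self.dropped_since_overload = 0
        else:
            # Queue time is back under the target: return to "good".
            self.overloaded_since = None
            self.dropped_since_overload = 0

    def is_overloaded(self):
        return self.overloaded_since is not None

    def overloaded_duration(self, now_s):
        """How long the queue has been overloaded, or 0.0 if it is not."""
        if self.overloaded_since is None:
            return 0.0
        return now_s - self.overloaded_since

    def reject_probability(self, num_ops):
        """Placeholder formula: grows with drops so far and with write size."""
        if not self.is_overloaded():
            return 0.0
        p = self.base_rate * num_ops * (1 + self.dropped_since_overload)
        return min(1.0, p)

    def should_reject(self, num_ops):
        """Check before PREPARE; True means reject the write."""
        if self._rng.random() < self.reject_probability(num_ops):
            self.dropped_since_overload += 1
            return True
        return False
```

In this sketch, each rejected write raises the probability for the next one, so
shedding becomes more aggressive the longer the queue stays overloaded, and the
counter resets as soon as a dequeued task meets the target queue time. A real
implementation would also need to be thread-safe and would probably use the
time spent in the overloaded state, which is tracked here but not used in the
formula.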

Any thoughts on another potential solution that might work well?

> Memory-based backpressure is insufficient on seek-bound workloads
> -----------------------------------------------------------------
>
>                 Key: KUDU-1587
>                 URL: https://issues.apache.org/jira/browse/KUDU-1587
>             Project: Kudu
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 0.10.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>         Attachments: graph.png, queue-time.png
>
>
> I pushed a uniform random insert workload from a bunch of clients to the 
> point that the vast majority of bloom filters no longer fit in buffer cache, 
> and the compaction had fallen way behind. Thus, every inserted row turns into 
> 40+ seeks (due to non-compact data) and takes 400-500ms. In this kind of 
> workload, the current backpressure (based on memory usage) is insufficient to 
> prevent ridiculously long queues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
