[jira] [Commented] (KUDU-749) Improve performance for zipfian update

Todd Lipcon (JIRA) Wed, 25 May 2016 19:03:27 -0700

    [ 
https://issues.apache.org/jira/browse/KUDU-749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301309#comment-15301309
 ]


Todd Lipcon commented on KUDU-749:
----------------------------------

Just addressed the quadratic-time delta collection in commit 
c7178e97e842f42e9ed9d5e9e2a4f521fbe70b6b.

Another item I'm noting is that the "heavy write" DiskRowSets (DRS) aren't 
getting aggressively major-delta-compacted. The issue is that we only look at 
the total size ratio of the delta files vs the base data, and not a more 
realistic measure of read performance. One quick thought is to capture counters 
at runtime for the number of deltas _applied during reads_ on the DRS vs the 
_number of rows read_. So, for the case where a single hot row takes most of 
the updates (the zipfian pattern), the ratio will be quite high (e.g. 1000:1), 
whereas in the more analytic use case where a single column has been updated 
once across all rows, the ratio will be more like 1:1.
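
To make the idea concrete, here is a minimal sketch of what such per-DRS 
counters might look like. Everything in it is hypothetical: DrsReadStats, 
RecordRead, DeltasPerRowRead, ShouldMajorDeltaCompact, and the threshold of 
10 are illustrative names and values, not existing Kudu code or a tuned 
setting.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical per-DiskRowSet read counters (illustrative, not Kudu's API).
struct DrsReadStats {
  std::atomic<uint64_t> rows_read{0};
  std::atomic<uint64_t> deltas_applied{0};

  // Called by a reader after it finishes a batch on this DRS.
  void RecordRead(uint64_t rows, uint64_t deltas) {
    rows_read.fetch_add(rows, std::memory_order_relaxed);
    deltas_applied.fetch_add(deltas, std::memory_order_relaxed);
  }

  // Deltas applied per row read; 0 if the DRS has not been read at all.
  double DeltasPerRowRead() const {
    const uint64_t rows = rows_read.load(std::memory_order_relaxed);
    if (rows == 0) return 0.0;
    const uint64_t deltas = deltas_applied.load(std::memory_order_relaxed);
    return static_cast<double>(deltas) / static_cast<double>(rows);
  }
};

// Assumed policy: flag the DRS for major delta compaction once reads are
// applying more than `threshold` deltas per row. The default is arbitrary.
inline bool ShouldMajorDeltaCompact(const DrsReadStats& stats,
                                    double threshold = 10.0) {
  return stats.DeltasPerRowRead() > threshold;
}
```

With these assumptions, a hot row read once with 1000 deltas applied gives a 
ratio of 1000 and gets flagged, while 500 rows read with 500 deltas applied 
gives a ratio of 1 and does not. A DRS that has never been read reports 0 and 
is never flagged.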

The potential downside of this read-dependent tracking is that it wouldn't 
work as well on follower replicas, which might not see a heavy read workload, 
so a leader change would cause a big latency spike as readers shift to the 
unoptimized replicas.

> Improve performance for zipfian update
> --------------------------------------
>
>                 Key: KUDU-749
>                 URL: https://issues.apache.org/jira/browse/KUDU-749
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf, tablet
>    Affects Versions: Private Beta
>            Reporter: Todd Lipcon
>         Attachments: screenshot-1.png, screenshot-2.png
>
>
> A zipfian 50/50 update/read workload on YCSB gets slower and slower until 
> it's pretty intolerable (random reads taking 100+ms of CPU). It seems like 
> all the CPU is spent in DMSIterator::PrepareBatch. We're probably doing 
> something dumb here - let's look for some low hanging fruit to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
