[
https://issues.apache.org/jira/browse/CASSANDRA-13038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858568#comment-15858568
]
Jeff Jirsa commented on CASSANDRA-13038:
----------------------------------------
Here are the new microbench results. Code coming shortly. In each test, we create
an array of 10,000,000 integers.
By default, it creates them in the range of 0-86400 (TTLs for every second in a
day).
In the narrow test, they range from 0-14400 (TTLs for every second in 4 hours).
In the sparse test, they range from 0-60 (TTLs for every second in a minute).
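For reference, here's a rough sketch of how that input could be generated for the
three cases; the class and method names are illustrative only, not the actual
benchmark code:
{code}
import java.util.concurrent.ThreadLocalRandom;

// Illustrative only: builds a 10M-element TTL array like the ones driving the benchmarks.
public class TtlTestDataSketch
{
    static int[] makeTtls(int maxTtlSeconds)
    {
        int[] ttls = new int[10_000_000];
        for (int i = 0; i < ttls.length; i++)
            ttls[i] = ThreadLocalRandom.current().nextInt(maxTtlSeconds);
        return ttls;
    }

    public static void main(String[] args)
    {
        int[] dflt   = makeTtls(86400); // default: a TTL for every second in a day
        int[] narrow = makeTtls(14400); // narrow: every second in 4 hours
        int[] sparse = makeTtls(60);    // sparse: every second in a minute
        System.out.println(dflt.length + " / " + narrow.length + " / " + sparse.length);
    }
}
{code}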
The first set of results shows the existing streaming histogram behavior
(~8s/run)
The second set of results is the existing streaming histogram with 60s rounding
(~300ms/run)
The third set buffers updates into a spool 1000x larger than the bins (so 100
bins and 100000 spool), faster than stock by ~2-4x.
The fourth set buffers updates into a spool 1000x larger than the bins (100
bins, 100000 spool), and rounds to 60s: ~170ms/run
The rest can probably be ignored - they were testing other bin/spool sizes.
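To make the spool idea concrete, here's a minimal sketch of the buffering scheme
described above - the class, field names, and merge rule are hypothetical, not the
actual patch: updates land in a large hash-based spool, and only when the spool
fills do we pay for merging values down into the bounded bin set.
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the spool-in-front-of-the-bins approach, not the actual patch.
public class SpooledHistogramSketch
{
    private final int maxBinSize;                  // e.g. 100 bins kept long-term
    private final int maxSpoolSize;                // e.g. 100,000 buffered points
    private final TreeMap<Double, Long> bins = new TreeMap<>();
    private final Map<Double, Long> spool;

    public SpooledHistogramSketch(int maxBinSize, int maxSpoolSize)
    {
        this.maxBinSize = maxBinSize;
        this.maxSpoolSize = maxSpoolSize;
        this.spool = new HashMap<>(maxSpoolSize * 2);
    }

    public void update(double point)
    {
        spool.merge(point, 1L, Long::sum);         // cheap O(1) buffered update
        if (spool.size() >= maxSpoolSize)
            flush();
    }

    private void flush()
    {
        for (Map.Entry<Double, Long> e : spool.entrySet())
            bins.merge(e.getKey(), e.getValue(), Long::sum);
        spool.clear();
        while (bins.size() > maxBinSize)           // classic streaming-histogram reduction:
            mergeClosestBins();                    // repeatedly collapse the two nearest bins
    }

    private void mergeClosestBins()
    {
        Double prev = null, a = null, b = null;
        double smallestGap = Double.MAX_VALUE;
        for (Double key : bins.keySet())           // keys iterate in ascending order
        {
            if (prev != null && key - prev < smallestGap)
            {
                smallestGap = key - prev;
                a = prev;
                b = key;
            }
            prev = key;
        }
        long ca = bins.remove(a), cb = bins.remove(b);
        bins.merge((a * ca + b * cb) / (ca + cb), ca + cb, Long::sum); // weighted midpoint
    }
}
{code}
The point of the spool is that the O(bins) nearest-bin search runs once per spool
flush instead of once per cell.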
Based on the benchmark output below, I think rounding to 60s is worthwhile - I
propose we round to 60-second TTL resolution by default, and then make it tunable
via a system property to either go higher (~3600 if hourly buckets work), or as
low as 1 for the existing behavior.
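As an illustration only, the knob could look something like this - the system
property name and the choice to round up (so data is never considered droppable
early) are assumptions on my part, not part of any patch yet:
{code}
// Hypothetical sketch of the proposed TTL-rounding knob; the system property name
// below is illustrative only, not a committed configuration option.
public class TtlRoundingSketch
{
    // 60 by default; could be raised to ~3600 for hourly buckets, or lowered to 1
    // to restore the existing per-second behavior.
    private static final int TTL_ROUND_SECONDS =
        Integer.getInteger("cassandra.streaming_histogram_round_seconds", 60);

    // Round a tombstone drop time up to the configured resolution before it is fed
    // into the histogram, so nearby drop times share a bin. Rounding up rather than
    // down is an assumption here, chosen so tombstones never look droppable early.
    public static long roundDropTime(long localDeletionTime)
    {
        if (TTL_ROUND_SECONDS <= 1)
            return localDeletionTime;
        long remainder = localDeletionTime % TTL_ROUND_SECONDS;
        return remainder == 0 ? localDeletionTime
                              : localDeletionTime + (TTL_ROUND_SECONDS - remainder);
    }

    public static void main(String[] args)
    {
        System.out.println(roundDropTime(86399)); // -> 86400 with the 60s default
        System.out.println(roundDropTime(86400)); // -> 86400 (already on a boundary)
    }
}
{code}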
{code}
[java] # Run complete. Total time: 00:15:01
[java]
[java] Benchmark  Mode  Samples  Score  Error  Units
[java] o.a.c.t.m.StreamingHistogramBench.exitingSH  avgt  5  8299.160 ± 819.791  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrowexistingSH  avgt  5  8754.663 ± 1026.628  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparseexistingSH  avgt  5  755.780 ± 43.003  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsestreaminghistogram60s  avgt  5  278.455 ± 22.318  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.streaminghistogram60s  avgt  5  280.750 ± 38.739  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrowstreaminghistogram60s  avgt  5  291.652 ± 24.164  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newSH1000x  avgt  5  2909.856 ± 245.517  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewSH1000x  avgt  5  1491.073 ± 96.055  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH1000x  avgt  5  467.546 ± 41.027  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newstreaminghistogram1000x60s  avgt  5  174.485 ± 7.145  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewstreaminghistogram1000x60s  avgt  5  163.713 ± 16.302  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewstreaminghistogram1000x60  avgt  5  162.857 ± 19.290  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newSH10x  avgt  5  8148.194 ± 1053.900  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewSH10x  avgt  5  8843.801 ± 824.742  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH10x  avgt  5  457.407 ± 22.189  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newSH100x  avgt  5  11140.663 ± 1116.051  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewSH100x  avgt  5  7654.534 ± 379.143  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH100x  avgt  5  445.953 ± 5.981  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newSH10000x  avgt  5  3016.663 ± 605.904  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewSH10000x  avgt  5  2773.356 ± 292.641  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH10000x  avgt  5  3090.090 ± 361.765  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newSH50and100x  avgt  5  7305.847 ± 619.186  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewSH50and100x  avgt  5  5015.139 ± 611.160  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH50and100x  avgt  5  477.743 ± 12.814  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.newSH50and1000x  avgt  5  3304.479 ± 342.117  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.narrownewSH50and1000x  avgt  5  1683.944 ± 167.946  ms/op
[java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH50and1000x  avgt  5  461.928 ± 5.799  ms/op
{code}
> 33% of compaction time spent in StreamingHistogram.update()
> -----------------------------------------------------------
>
> Key: CASSANDRA-13038
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13038
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Corentin Chary
> Assignee: Jeff Jirsa
> Attachments: compaction-speedup.patch,
> compaction-streaminghistrogram.png, profiler-snapshot.nps
>
>
> With the following table, which contains a *lot* of cells:
> {code}
> CREATE TABLE biggraphite.datapoints_11520p_60s (
>     metric uuid,
>     time_start_ms bigint,
>     offset smallint,
>     count int,
>     value double,
>     PRIMARY KEY ((metric, time_start_ms), offset)
> ) WITH CLUSTERING ORDER BY (offset DESC)
>   AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
>                     'compaction_window_size': '6', 'compaction_window_unit': 'HOURS',
>                     'max_threshold': '32', 'min_threshold': '6'};
> Keyspace : biggraphite
> Read Count: 1822
> Read Latency: 1.8870054884742042 ms.
> Write Count: 2212271647
> Write Latency: 0.027705127678653473 ms.
> Pending Flushes: 0
> Table: datapoints_11520p_60s
> SSTable count: 47
> Space used (live): 300417555945
> Space used (total): 303147395017
> Space used by snapshots (total): 0
> Off heap memory used (total): 207453042
> SSTable Compression Ratio: 0.4955200053039823
> Number of keys (estimate): 16343723
> Memtable cell count: 220576
> Memtable data size: 17115128
> Memtable off heap memory used: 0
> Memtable switch count: 2872
> Local read count: 0
> Local read latency: NaN ms
> Local write count: 1103167888
> Local write latency: 0.025 ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 105118296
> Bloom filter off heap memory used: 106547192
> Index summary off heap memory used: 27730962
> Compression metadata off heap memory used: 73174888
> Compacted partition minimum bytes: 61
> Compacted partition maximum bytes: 51012
> Compacted partition mean bytes: 7899
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0
> {code}
> It looks like a good chunk of the compaction time is lost in
> StreamingHistogram.update() (which is used to store the estimated tombstone
> drop times).
> This could be caused by a huge number of different deletion times, which would
> make the bins huge, but this histogram should be capped at 100 keys. It's
> more likely caused by the huge number of cells.
> A simple solution could be to only take into account part of the cells; the
> fact that this table uses TWCS also gives us an additional hint that sampling
> deletion times would be fine.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)