[ 
https://issues.apache.org/jira/browse/CASSANDRA-13038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858568#comment-15858568
 ] 

Jeff Jirsa commented on CASSANDRA-13038:
----------------------------------------

Here are the new microbench results; code coming shortly. In each test, we create 
an array of 10,000,000 integers.

By default, they are created in the range 0-86400 (a TTL for every second in a day).
In the narrow test, they range from 0-14400 (a TTL for every second in 4 hours).
In the sparse test, they range from 0-60 (a TTL for every second in a minute).
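
For reference, each run just builds the array once and feeds every value into the 
histogram under test. A minimal sketch of that data generation (class and variable 
names here are illustrative, not the actual StreamingHistogramBench source):

{code}
import java.util.Random;

// Illustrative sketch of the benchmark data described above; not the actual
// o.a.c.t.m.StreamingHistogramBench source.
public class TtlBenchData
{
    static int[] makeTtls(int maxTtlSeconds)
    {
        Random random = new Random(42);              // fixed seed for repeatable runs
        int[] ttls = new int[10_000_000];
        for (int i = 0; i < ttls.length; i++)
            ttls[i] = random.nextInt(maxTtlSeconds); // 86400 default, 14400 narrow, 60 sparse
        return ttls;
    }

    public static void main(String[] args)
    {
        // Each benchmark then feeds one of these arrays, value by value,
        // into the histogram implementation under test.
        System.out.println(makeTtls(86400).length);
    }
}
{code}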

The first set of results shows the existing streaming histogram behavior (~8s/run).
The second set shows the existing streaming histogram with 60s rounding (~300ms/run).
The third set buffers updates into a spool 1000x larger than the bins (100 bins, 
100000-entry spool) and is ~2-4x faster than stock; a rough sketch of the spool 
idea follows this list.
The fourth set uses the same 1000x spool (100 bins, 100000-entry spool) and also 
rounds to 60s: ~170ms/run.

The rest can probably be ignored - they test other bin/spool sizes.
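
The spool in the third and fourth sets is just a larger intermediate map that 
absorbs point updates cheaply and is only merged down into the bounded bin set 
when it fills. A minimal sketch of that idea (class and method names here are 
illustrative, not the actual patch):

{code}
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of spool-buffered histogram updates; the real patch
// differs in details.
public class SpooledHistogramSketch
{
    private final int maxBins;
    private final int maxSpoolSize;                      // e.g. 1000x the bin count
    private final TreeMap<Double, Long> bins = new TreeMap<>();
    private final TreeMap<Double, Long> spool = new TreeMap<>();

    public SpooledHistogramSketch(int maxBins, int spoolFactor)
    {
        this.maxBins = maxBins;
        this.maxSpoolSize = maxBins * spoolFactor;
    }

    public void update(double point)
    {
        // Cheap path: just count the exact point in the spool.
        spool.merge(point, 1L, Long::sum);
        if (spool.size() >= maxSpoolSize)
            flush();
    }

    public void flush()
    {
        // Fold the spool into the bins, then shrink back to maxBins by
        // repeatedly combining the two closest bins into their weighted mean
        // (the usual streaming-histogram merge step).
        for (Map.Entry<Double, Long> e : spool.entrySet())
            bins.merge(e.getKey(), e.getValue(), Long::sum);
        spool.clear();

        while (bins.size() > maxBins)
        {
            Double left = null, bestLeft = null;
            double bestGap = Double.MAX_VALUE;
            for (Double key : bins.keySet())
            {
                if (left != null && key - left < bestGap)
                {
                    bestGap = key - left;
                    bestLeft = left;
                }
                left = key;
            }
            Double right = bins.higherKey(bestLeft);
            long lc = bins.remove(bestLeft);
            long rc = bins.remove(right);
            double merged = (bestLeft * lc + right * rc) / (lc + rc);
            bins.merge(merged, lc + rc, Long::sum);
        }
    }
}
{code}

The win comes from doing the expensive closest-bin merge work once per spool flush 
instead of once per update.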

Based on this output, I think trimming to 60s is worthwhile - I propose we round 
to 60-second TTL resolution by default, and make it tunable via a system property 
to either go higher (~3600 if hourly buckets work) or as low as 1 for the existing 
behavior.
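
Purely as an illustration, here is roughly what the rounding and the tunable could 
look like; the system property name below is hypothetical and not part of any 
attached patch:

{code}
// Illustrative sketch of rounding drop-time values before they reach the
// histogram; the property name is hypothetical, not the final one.
public final class TtlRounding
{
    // 60s by default; can be raised (e.g. 3600 for hourly buckets) or set to 1
    // to keep the existing per-second behavior.
    private static final int ROUND_SECONDS =
        Integer.getInteger("cassandra.streaminghistogram.roundseconds", 60);

    private TtlRounding() {}

    /** Round a drop time (seconds) up to the configured resolution. */
    public static long round(long dropTimeSeconds)
    {
        long r = ROUND_SECONDS;
        // Ceiling keeps the estimate conservative in this sketch: a tombstone
        // is never estimated as droppable earlier than its real drop time.
        return ((dropTimeSeconds + r - 1) / r) * r;
    }

    public static void main(String[] args)
    {
        System.out.println(round(86399));  // -> 86400 with the 60s default
        System.out.println(round(61));     // -> 120
    }
}
{code}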

{code}
     [java] # Run complete. Total time: 00:15:01
     [java]
     [java] Benchmark                                                                Mode  Samples      Score      Error  Units
     [java] o.a.c.t.m.StreamingHistogramBench.exitingSH                              avgt        5   8299.160 ±  819.791  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrowexistingSH                       avgt        5   8754.663 ± 1026.628  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparseexistingSH                       avgt        5    755.780 ±   43.003  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.sparsestreaminghistogram60s            avgt        5    278.455 ±   22.318  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.streaminghistogram60s                  avgt        5    280.750 ±   38.739  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrowstreaminghistogram60s            avgt        5    291.652 ±   24.164  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newSH1000x                             avgt        5   2909.856 ±  245.517  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewSH1000x                       avgt        5   1491.073 ±   96.055  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH1000x                       avgt        5    467.546 ±   41.027  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newstreaminghistogram1000x60s          avgt        5    174.485 ±    7.145  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewstreaminghistogram1000x60s    avgt        5    163.713 ±   16.302  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewstreaminghistogram1000x60     avgt        5    162.857 ±   19.290  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newSH10x                               avgt        5   8148.194 ± 1053.900  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewSH10x                         avgt        5   8843.801 ±  824.742  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH10x                         avgt        5    457.407 ±   22.189  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newSH100x                              avgt        5  11140.663 ± 1116.051  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewSH100x                        avgt        5   7654.534 ±  379.143  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH100x                        avgt        5    445.953 ±    5.981  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newSH10000x                            avgt        5   3016.663 ±  605.904  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewSH10000x                      avgt        5   2773.356 ±  292.641  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH10000x                      avgt        5   3090.090 ±  361.765  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newSH50and100x                         avgt        5   7305.847 ±  619.186  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewSH50and100x                   avgt        5   5015.139 ±  611.160  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH50and100x                   avgt        5    477.743 ±   12.814  ms/op

     [java] o.a.c.t.m.StreamingHistogramBench.newSH50and1000x                        avgt        5   3304.479 ±  342.117  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.narrownewSH50and1000x                  avgt        5   1683.944 ±  167.946  ms/op
     [java] o.a.c.t.m.StreamingHistogramBench.sparsenewSH50and1000x                  avgt        5    461.928 ±    5.799  ms/op
{code}

> 33% of compaction time spent in StreamingHistogram.update()
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13038
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Corentin Chary
>            Assignee: Jeff Jirsa
>         Attachments: compaction-speedup.patch, 
> compaction-streaminghistrogram.png, profiler-snapshot.nps
>
>
> With the following table, which contains a *lot* of cells: 
> {code}
> CREATE TABLE biggraphite.datapoints_11520p_60s (
>     metric uuid,
>     time_start_ms bigint,
>     offset smallint,
>     count int,
>     value double,
>     PRIMARY KEY ((metric, time_start_ms), offset)
> ) WITH CLUSTERING ORDER BY (offset DESC)
>   AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
>                     'compaction_window_size': '6', 'compaction_window_unit': 'HOURS',
>                     'max_threshold': '32', 'min_threshold': '6'};
> Keyspace : biggraphite
>         Read Count: 1822
>         Read Latency: 1.8870054884742042 ms.
>         Write Count: 2212271647
>         Write Latency: 0.027705127678653473 ms.
>         Pending Flushes: 0
>                 Table: datapoints_11520p_60s
>                 SSTable count: 47
>                 Space used (live): 300417555945
>                 Space used (total): 303147395017
>                 Space used by snapshots (total): 0
>                 Off heap memory used (total): 207453042
>                 SSTable Compression Ratio: 0.4955200053039823
>                 Number of keys (estimate): 16343723
>                 Memtable cell count: 220576
>                 Memtable data size: 17115128
>                 Memtable off heap memory used: 0
>                 Memtable switch count: 2872
>                 Local read count: 0
>                 Local read latency: NaN ms
>                 Local write count: 1103167888
>                 Local write latency: 0.025 ms
>                 Pending flushes: 0
>                 Percent repaired: 0.0
>                 Bloom filter false positives: 0
>                 Bloom filter false ratio: 0.00000
>                 Bloom filter space used: 105118296
>                 Bloom filter off heap memory used: 106547192
>                 Index summary off heap memory used: 27730962
>                 Compression metadata off heap memory used: 73174888
>                 Compacted partition minimum bytes: 61
>                 Compacted partition maximum bytes: 51012
>                 Compacted partition mean bytes: 7899
>                 Average live cells per slice (last five minutes): NaN
>                 Maximum live cells per slice (last five minutes): 0
>                 Average tombstones per slice (last five minutes): NaN
>                 Maximum tombstones per slice (last five minutes): 0
>                 Dropped Mutations: 0
> {code}
> It looks like a good chunk of the compaction time is lost in 
> StreamingHistogram.update() (which is used to store the estimated tombstone 
> drop times).
> This could be caused by a huge number of distinct deletion times, which would 
> make the bins huge, but this histogram should be capped at 100 keys. It's more 
> likely caused by the huge number of cells.
> A simple solution could be to only take part of the cells into account; the 
> fact that this table uses TWCS also gives us an additional hint that sampling 
> deletion times would be fine.
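>
> A minimal sketch of that sampling idea, purely as an illustration (the 1-in-N 
> rate, class names, and the weighted-update interface below are assumptions, not 
> part of any attached patch):
> {code}
> import java.util.concurrent.ThreadLocalRandom;
>
> // Illustrative sketch: record only a fraction of the deletion times and weight
> // each recorded sample so the histogram's totals stay roughly unbiased.
> public class SampledDropTimeTracker
> {
>     private static final int SAMPLE_ONE_IN = 16;   // hypothetical sampling rate
>
>     private final WeightedHistogram histogram;
>
>     public SampledDropTimeTracker(WeightedHistogram histogram)
>     {
>         this.histogram = histogram;
>     }
>
>     public void update(long localDeletionTime)
>     {
>         if (ThreadLocalRandom.current().nextInt(SAMPLE_ONE_IN) == 0)
>             histogram.update(localDeletionTime, SAMPLE_ONE_IN);
>     }
>
>     /** Stand-in for a histogram that accepts weighted points. */
>     public interface WeightedHistogram
>     {
>         void update(long point, long weight);
>     }
> }
> {code}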



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
