[
https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999621#comment-13999621
]
Chris Lohfink commented on CASSANDRA-7247:
------------------------------------------
The problem is that StreamSummary is not thread safe. There is a
ConcurrentStreamSummary, but in this implementation I found it to be ~5x slower
than a synchronized block around the offer of the non-thread-safe one.
ConcurrentStreamSummary performed similarly when it was also wrapped in a
synchronized block, which I will show below, but because it loses any benefit
of being a concurrent implementation once access is serialized, I think the
faster implementation is best.
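To illustrate the approach, here is a minimal sketch of serializing access to a non-thread-safe counter with a synchronized block. It is not the attached patch and does not use stream-lib: SpaceSavingCounter is a hypothetical, simplified stand-in for StreamSummary (a plain map with a linear scan for the minimum), and SynchronizedTopK, topK and estimate are names made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical, simplified Space-Saving counter. NOT thread safe.
 * Stand-in for stream-lib's StreamSummary, for illustration only.
 */
class SpaceSavingCounter<T> {
    private final int capacity;
    private final Map<T, Long> counts = new HashMap<>();

    SpaceSavingCounter(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    void offer(T item) {
        Long current = counts.get(item);
        if (current != null) {
            counts.put(item, current + 1);          // already tracked: increment
        } else if (counts.size() < capacity) {
            counts.put(item, 1L);                   // free slot: start tracking
        } else {
            // Full: evict the least frequent item; the newcomer inherits min + 1.
            T minKey = null;
            long minCount = Long.MAX_VALUE;
            for (Map.Entry<T, Long> e : counts.entrySet()) {
                if (e.getValue() < minCount) {
                    minKey = e.getKey();
                    minCount = e.getValue();
                }
            }
            counts.remove(minKey);
            counts.put(item, minCount + 1);
        }
    }

    /** Estimated count for an item, or 0 if it is not currently tracked. */
    long estimate(T item) {
        Long c = counts.get(item);
        return c == null ? 0L : c;
    }

    /** Up to n tracked items, most frequent first. */
    List<T> topK(int n) {
        List<T> keys = new ArrayList<>(counts.keySet());
        keys.sort((a, b) -> Long.compare(counts.get(b), counts.get(a)));
        return new ArrayList<>(keys.subList(0, Math.min(n, keys.size())));
    }
}

/** Serializes every access to the non-thread-safe counter with one lock. */
class SynchronizedTopK<T> {
    private final SpaceSavingCounter<T> delegate;

    SynchronizedTopK(int capacity) {
        this.delegate = new SpaceSavingCounter<>(capacity);
    }

    void offer(T item) {
        synchronized (delegate) { delegate.offer(item); }
    }

    long estimate(T item) {
        synchronized (delegate) { return delegate.estimate(item); }
    }

    List<T> topK(int n) {
        synchronized (delegate) { return delegate.topK(n); }
    }
}
```

Writers call offer from any thread, and readers (for example a metric exposed over JMX) call topK. All callers contend on one lock, which is the trade-off described above: simple and fast per call, but with no concurrency inside the counter.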
Benchmarks were run on a 2013 Retina MacBook Pro with a 500 GB SSD:
{code:title=No Changes}
id,              ops,      op/s,   key/s,  mean, med, .95, .99, .999, max,    time, stderr
4 threadCount,   634450,   21692,  21692,  0.2,  0.2, 0.2, 0.2, 0.4,  740.1,  29.2, 0.01188
8 threadCount,   886600,   29762,  29762,  0.3,  0.2, 0.3, 0.4, 1.3,  1007.3, 29.8, 0.01220
16 threadCount,  912050,   29035,  29035,  0.5,  0.3, 0.9, 2.5, 11.2, 1393.8, 31.4, 0.01162
24 threadCount,  1022250,  32681,  32681,  0.7,  0.5, 1.0, 2.9, 13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550,   30900,  30900,  1.2,  0.8, 1.4, 3.0, 22.5, 1369.2, 30.6, 0.01089
{code}
{code:title=With Patch}
id,              ops,      op/s,   key/s,  mean, med, .95, .99, .999, max,    time, stderr
4 threadCount,   643900,   21700,  21700,  0.2,  0.2, 0.2, 0.2, 0.9,  941.1,  29.7, 0.01079
8 threadCount,   942100,   32300,  32300,  0.2,  0.2, 0.3, 0.3, 1.2,  849.5,  29.2, 0.01519
16 threadCount,  907400,   30650,  30650,  0.5,  0.3, 0.8, 1.9, 10.7, 1124.0, 29.6, 0.01112
24 threadCount,  1026150,  31753,  31753,  0.7,  0.5, 0.9, 3.3, 20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600,   30077,  30077,  1.2,  0.8, 1.3, 2.7, 24.9, 1394.3, 32.6, 0.01747
{code}
> Provide top ten most frequent keys per column family
> ----------------------------------------------------
>
> Key: CASSANDRA-7247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Lohfink
> Priority: Minor
> Attachments: patch.diff
>
>
> Since we already have the nice addthis stream library, we can use it to keep
> track of the most frequent DecoratedKeys that come through the system using
> StreamSummaries ([nice
> explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]).
> Then provide a new metric to access them via JMX.