[ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142375#comment-14142375 ]
Benedict commented on CASSANDRA-7247:
-------------------------------------

It's probably better to construct a lightweight wrapper around the data you're using for equality (key bytes / token), with knowledge of _how_ to turn it into a string, and to do that conversion only when we're asked for the TopK.

It could well be worth enabling this on a per-CF / per-KS basis, though, or configuring the size of the sample in the yaml. If you have large keys (64K), the structure as it stands will take up > 128MB per key space, or > 64MB with the adjustment I've just suggested. Either way that's non-trivial, especially since we have two of them. Admittedly such large keys are not likely to be common.

> Provide top ten most frequent keys per column family
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7247
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Minor
>         Attachments: cassandra-2.1-7247.txt, jconsole.png, patch.txt
>
>
> Since we already have the nice addthis stream library, we can use it to keep
> track of the most frequent DecoratedKeys that come through the system using
> StreamSummaries ([nice
> explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]).
> Then provide a new metric to access them via JMX.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
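
The wrapper idea from the comment (equality on the key bytes, string conversion deferred until the TopK is requested) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the patch's code: `SampledKey`, `KeySampler`, `render` and `hex` are hypothetical names, and a plain `HashMap` counter stands in for the stream library's `StreamSummary` so that the example stays self-contained.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical wrapper: equality uses the raw key bytes, the string is built on demand. */
final class SampledKey {
    private final ByteBuffer key;                      // read-only view, bytes are not copied
    private final Function<ByteBuffer, String> toText; // knows how to render the key

    SampledKey(ByteBuffer key, Function<ByteBuffer, String> toText) {
        this.key = key.asReadOnlyBuffer();
        this.toText = toText;
    }

    @Override public boolean equals(Object o) {
        return o instanceof SampledKey && key.equals(((SampledKey) o).key);
    }

    @Override public int hashCode() {
        return key.hashCode();
    }

    /** Only called when a report is requested; works on a duplicate so the view is untouched. */
    String render() {
        return toText.apply(key.duplicate());
    }

    /** Example renderer (an assumption): lower-case hex of the remaining bytes. */
    static String hex(ByteBuffer b) {
        StringBuilder sb = new StringBuilder();
        while (b.hasRemaining()) {
            sb.append(String.format("%02x", b.get() & 0xff));
        }
        return sb.toString();
    }
}

/** Stand-in for the sampling structure: exact counts in a map, not a StreamSummary. */
final class KeySampler {
    private final Map<SampledKey, Long> counts = new HashMap<>();

    void record(SampledKey k) {
        counts.merge(k, 1L, Long::sum);
    }

    /** Strings are created here, for at most k entries, rather than on every record(). */
    List<String> topK(int k) {
        List<Map.Entry<SampledKey, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        List<String> out = new ArrayList<>();
        for (Map.Entry<SampledKey, Long> e : entries.subList(0, Math.min(k, entries.size()))) {
            out.add(e.getKey().render() + "=" + e.getValue());
        }
        return out;
    }
}
```

The point of the design is that `record` never allocates a string, so the sample holds one copy of each key's bytes rather than the bytes plus a rendered form. Note that the wrapper shares the caller's buffer, so a real implementation would have to decide whether the key bytes need to be copied before they are retained.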