[
https://issues.apache.org/jira/browse/CASSANDRA-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Lohfink updated CASSANDRA-7974:
-
Attachment: cassandra-2.1-7974v2.txt
I attached a version with a few extras:
* Includes sampling of writes
* Exposes the partition type in JMX so that nodetool can serialize the blobs as
strings
* Includes the margin of error from the summary
* Adds defaults for capacity and topK count to make it simpler to use; either
can be overridden with options
** Capacity is not set to the topK count, since the summary becomes very
inaccurate if cardinality vastly exceeds capacity (with capacity=10, a
cardinality of just 100 would be very inaccurate under many workloads)
** Prints the estimated cardinality (using HyperLogLog) so that it's easier
to identify an appropriate capacity if the margin of error is unacceptable
* Makes it so that if sampling is disabled there's no blocking (as opposed to
synchronizing addSample)
** Also makes the case where sampling is enabled non-blocking
* Makes it easy to add additional samplers; I would like to add a column
count or size sampler as well
Output looks like:
{code}
READ Sampler:
Cardinality: ~235 (256 capacity used)
Top 10 partitions:
Partition Count +/-
4BpaP7j05i:true 1 0
jSvq6b62uXwfQb:true 1 0
BvkRbLI1rKO:true 1 0
...
WRITE Sampler:
Cardinality: ~4681 (256 capacity used)
Top 10 partitions:
Partition Count +/-
jXyI4PpocdtXAkvxG8geS1bkY:true4910
bid3tbjRKzDZ4l5Wu:true2912
cWti3ryllghSxOGEuG:true 1918
...
{code}
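For readers unfamiliar with where the Count and +/- columns come from, here is a minimal sketch of a Space-Saving style top-K counter. It is an illustration only, not the stream-lib implementation and not the code in the attached patch; the class name SpaceSavingSketch and its methods are hypothetical. It shows why a key that enters a full summary inherits the evicted key's count as its margin of error, which is also why a capacity far below the cardinality gives inaccurate results.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative Space-Saving top-K counter (hypothetical; not the stream-lib class). */
final class SpaceSavingSketch {
    /** One monitored key: estimated count plus maximum overestimation (the "+/-" column). */
    static final class Counter {
        final String item;
        long count;
        long error;

        Counter(String item, long count, long error) {
            this.item = item;
            this.count = count;
            this.error = error;
        }
    }

    private final int capacity;
    private final Map<String, Counter> counters = new HashMap<>();

    SpaceSavingSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    void offer(String item) {
        Counter existing = counters.get(item);
        if (existing != null) {
            existing.count++;            // already monitored: exact increment
            return;
        }
        if (counters.size() < capacity) {
            counters.put(item, new Counter(item, 1, 0));  // room left: no error
            return;
        }
        // Summary is full: evict the smallest counter. The newcomer may have been
        // seen up to min.count times before, so it inherits that count as error.
        Counter min = null;
        for (Counter c : counters.values()) {
            if (min == null || c.count < min.count) {
                min = c;
            }
        }
        counters.remove(min.item);
        counters.put(item, new Counter(item, min.count + 1, min.count));
    }

    /** Returns up to k counters, highest estimated count first. */
    List<Counter> topK(int k) {
        List<Counter> sorted = new ArrayList<>(counters.values());
        sorted.sort(Comparator.comparingLong((Counter c) -> c.count).reversed());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        SpaceSavingSketch summary = new SpaceSavingSketch(2);
        for (String key : new String[] {"a", "a", "a", "b", "c", "a"}) {
            summary.offer(key);
        }
        for (Counter c : summary.topK(2)) {
            System.out.println(c.item + " " + c.count + " " + c.error);
        }
    }
}
```

With a capacity of 2, the key "c" displaces "b" and is reported with a count of 2 and an error of 1, while "a" keeps an exact count. The linear scan for the minimum keeps the sketch short; a real implementation would use a more efficient structure.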
Enable tooling to detect hot partitions
---
Key: CASSANDRA-7974
URL: https://issues.apache.org/jira/browse/CASSANDRA-7974
Project: Cassandra
Issue Type: Improvement
Reporter: Brandon Williams
Assignee: Brandon Williams
Attachments: 7974.txt, cassandra-2.1-7974v2.txt
Sometimes you know you have a hot partition by the load on a replica set, but
have no way of determining which partition it is. Tracing is inadequate for
this without a lot of post-tracing analysis that might not yield results.
Since we already include stream-lib for HLL in compaction metadata, it
shouldn't be too hard to wire up topK for X seconds via jmx/nodetool and then
return the top partitions hit.
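
As a rough illustration of the "sample for X seconds, then return the top partitions" idea, the sketch below uses a begin/finish pair that a JMX operation or nodetool command could call around a sleep. Everything here is an assumption made for illustration: the class WindowedSampler and its method names are hypothetical, and it uses exact per-key counts in a ConcurrentHashMap instead of a bounded topK summary to keep the example short. The hot path does a single volatile read and returns immediately when no sampling window is open.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical time-windowed sampler; exact counts stand in for a topK summary. */
final class WindowedSampler {
    // null means sampling is disabled; readers never take a lock.
    private volatile ConcurrentHashMap<String, LongAdder> active;

    /** Called on every read or write; a no-op unless a window is open. */
    void addSample(String partitionKey) {
        ConcurrentHashMap<String, LongAdder> counts = active;  // one volatile read
        if (counts == null) {
            return;
        }
        counts.computeIfAbsent(partitionKey, k -> new LongAdder()).increment();
    }

    /** Opens a sampling window, discarding any previous one. */
    void beginSampling() {
        active = new ConcurrentHashMap<>();
    }

    /** Closes the window and returns up to k keys, most frequently hit first. */
    List<Map.Entry<String, Long>> finishSampling(int k) {
        ConcurrentHashMap<String, LongAdder> counts = active;
        active = null;
        if (counts == null) {
            return Collections.emptyList();
        }
        // Samples racing with this call may be missed; acceptable for a sketch.
        List<Map.Entry<String, Long>> result = new ArrayList<>();
        for (Map.Entry<String, LongAdder> e : counts.entrySet()) {
            result.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue().sum()));
        }
        result.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return new ArrayList<>(result.subList(0, Math.min(k, result.size())));
    }

    public static void main(String[] args) {
        WindowedSampler sampler = new WindowedSampler();
        sampler.addSample("ignored");   // no window open yet, so this is dropped
        sampler.beginSampling();
        // A real caller would sleep for the requested duration here while
        // request threads call addSample concurrently.
        sampler.addSample("p1");
        sampler.addSample("p1");
        sampler.addSample("p2");
        for (Map.Entry<String, Long> e : sampler.finishSampling(10)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```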