Stefan Podkowinski created CASSANDRA-14999:
----------------------------------------------
Summary: Incorrect fallback calculation of getApproximateKeyCount
Key: CASSANDRA-14999
URL: https://issues.apache.org/jira/browse/CASSANDRA-14999
Project: Cassandra
Issue Type: Bug
Reporter: Stefan Podkowinski
Creating a key count for a number of sstables depends on a probabilistic
hyperloglog data structure for estimating the cardinality of keys. In case of any
errors, we fall back to [some
code|https://github.com/apache/cassandra/blob/7d138e20ea987d44fffbc47de4674b253b7431ff/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L294]
that does not calculate the cardinality, but simply sums the estimated keys
of all sstables. The sum counts a key once for every sstable that contains it,
so the result overestimates the key count, and the gap grows with the number
of sstables that hold identical keys.
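
To illustrate the size of the gap, here is a minimal sketch, not Cassandra code. The class and method names are hypothetical, each sstable is modeled as a plain set of partition keys, and an exact set union stands in for the hyperloglog estimate.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class KeyCountFallbackSketch {

    // Fallback-style estimate: add up the key count of every sstable.
    static long summedKeyCount(List<Set<String>> sstables) {
        long total = 0;
        for (Set<String> keys : sstables) {
            total += keys.size();
        }
        return total;
    }

    // Exact number of distinct keys across all sstables. This stands in
    // for the cardinality that the hyperloglog path approximates.
    static long distinctKeyCount(List<Set<String>> sstables) {
        Set<String> union = new HashSet<>();
        for (Set<String> keys : sstables) {
            union.addAll(keys);
        }
        return union.size();
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("key-" + i);
        }
        // Ten sstables that all contain the same 1000 partition keys.
        List<Set<String>> sstables = Collections.nCopies(10, keys);

        System.out.println("sum:      " + summedKeyCount(sstables));   // 10000
        System.out.println("distinct: " + distinctKeyCount(sstables)); // 1000
    }
}
```

With fully overlapping keys the summed value is off by a factor equal to the number of sstables. With disjoint keys both values agree.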
We should have a look at the possible implications of that. Do we depend on
this value for sizing bloom filters?
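
This does not answer the question, but it shows why it matters. The sketch below uses the textbook bloom filter sizing formula, m = -n * ln(p) / (ln 2)^2, which is an assumption here and not necessarily how Cassandra sizes its filters. The class and method names are hypothetical.

```java
// Hypothetical illustration only; Cassandra's own sizing logic may differ.
final class BloomSizingSketch {

    // Textbook number of bits for n expected keys and a target
    // false positive probability p: m = -n * ln(p) / (ln 2)^2.
    static long optimalBits(long expectedKeys, double falsePositiveChance) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedKeys * Math.log(falsePositiveChance) / (ln2 * ln2));
    }

    public static void main(String[] args) {
        double p = 0.01;
        // 1000 distinct keys versus the summed value for ten overlapping sstables.
        System.out.println("bits for 1000 keys:  " + optimalBits(1000, p));
        System.out.println("bits for 10000 keys: " + optimalBits(10000, p));
    }
}
```

Under this formula the filter size is linear in the key count, so a tenfold overestimate of keys would allocate roughly ten times the memory needed for the same false positive rate.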