I've decreased bloom_filter_fp_chance from 0.01 to 0.001. The sstableupgrade took 3 days to complete. This is the result:

node1
    Bloom filter false positives: 380965
    Bloom filter false ratio: 0.46560
    Bloom filter space used: 27.1 MiB
    Bloom filter off heap memory used: 27.09 MiB

node2
    Bloom filter false positives: 866636
    Bloom filter false ratio: 0.40865
    Bloom filter space used: 27.78 MiB
    Bloom filter off heap memory used: 27.77 MiB

node3
    Bloom filter false positives: 433296
    Bloom filter false ratio: 0.20359
    Bloom filter space used: 26.15 MiB
    Bloom filter off heap memory used: 26.15 MiB

node4
    Bloom filter false positives: 550721
    Bloom filter false ratio: 0.30233
    Bloom filter space used: 24.7 MiB
    Bloom filter off heap memory used: 24.7 MiB

Martin

On Wed, Apr 17, 2019 at 1:45 PM Stefan Miklosovic <stefan.mikloso...@instaclustr.com> wrote:

> Lastly, I wonder whether that number is the same on every node you
> connect your nodetool to. Do all nodes see a very similar false
> positive ratio / number?
>
> On Wed, 17 Apr 2019 at 21:41, Stefan Miklosovic
> <stefan.mikloso...@instaclustr.com> wrote:
> >
> > One thing comes to my mind, but my reasoning is questionable as I am
> > not an expert in this.
> >
> > If you think about it, the whole concept of a Bloom filter is to check
> > whether some record is in a particular SSTable. A false positive means
> > that the filter thought it was there but in fact it is not, so
> > Cassandra did a lookup unnecessarily. Why does it think it is there
> > in such a number of cases? Either you make a lot of the same requests
> > on the same partition key over time, hence querying the same data over
> > and over again (but would that data not be cached?), or a lot of data
> > was written with the same partition key, so the filter thinks it is
> > there but the clustering column is different. As ts is of type
> > timeuuid, isn't it true that you are doing a lot of queries with some
> > date? It might be that the hash is computed only on partition keys and
> > not on clustering columns, so the filter gives you "yes", Cassandra
> > goes there, checks whether the clustering column equals what you
> > queried, and it's not there. But as I say, I might be wrong ...
> >
> > Moreover, your read_repair_chance is 0.0, so it will never do a
> > repair after a successful read (e.g. you have RF 3 and CL QUORUM, so
> > one node is somehow behind). So if you don't run repairs, maybe it is
> > just somehow unsynchronized, but that is really just my guess.
> >
> > On Wed, 17 Apr 2019 at 21:39, Martin Mačura <m.mac...@gmail.com> wrote:
> > >
> > > We cannot run any repairs on these tables. Whenever we tried it
> > > (incremental or full or partitioner range), it caused a node to run
> > > out of disk space during anticompaction. We'll try again once
> > > Cassandra 4.0 is released.
> > >
> > > On Wed, Apr 17, 2019 at 1:07 PM Stefan Miklosovic <stefan.mikloso...@instaclustr.com> wrote:
> > >>
> > >> If you invoke nodetool, it gets the false positives number from this metric:
> > >>
> > >> https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/metrics/TableMetrics.java#L564-L578
> > >>
> > >> You get high false positives, so this accumulates them:
> > >>
> > >> https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/metrics/TableMetrics.java#L572
> > >>
> > >> If you follow that, that number is computed here:
> > >>
> > >> https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/io/sstable/BloomFilterTracker.java#L44-L55
> > >>
> > >> In order to have that number so high, the difference has to be so big,
> > >> so lastFalsePositiveCount is imho significantly lower.
> > >>
> > >> False positives are only ever incremented in BigTableReader, where it
> > >> gets complicated very quickly, and I am not sure why it is called, to
> > >> be honest.
> > >>
> > >> Is all fine with the db as such? Do you run repairs? Does that number
> > >> increase or decrease over time? Does repair or compaction have some
> > >> effect on it?
> > >>
> > >> On Wed, 17 Apr 2019 at 20:48, Martin Mačura <m.mac...@gmail.com> wrote:
> > >> >
> > >> > Both tables use the default bloom_filter_fp_chance of 0.01 ...
> > >> >
> > >> > CREATE TABLE ... (
> > >> >     a int,
> > >> >     b int,
> > >> >     bucket timestamp,
> > >> >     ts timeuuid,
> > >> >     c int,
> > >> >     ...
> > >> >     PRIMARY KEY ((a, b, bucket), ts, c)
> > >> > ) WITH CLUSTERING ORDER BY (ts DESC, monitor ASC)
> > >> >     AND bloom_filter_fp_chance = 0.01
> > >> >     AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'tombstone_threshold': '0.9', 'unchecked_tombstone_compaction': 'false'}
> > >> >     AND dclocal_read_repair_chance = 0.0
> > >> >     AND default_time_to_live = 63072000
> > >> >     AND gc_grace_seconds = 10800
> > >> >     ...
> > >> >     AND read_repair_chance = 0.0
> > >> >     AND speculative_retry = 'NONE';
> > >> >
> > >> >
> > >> > CREATE TABLE ... (
> > >> >     c int,
> > >> >     b int,
> > >> >     bucket timestamp,
> > >> >     ts timeuuid,
> > >> >     ...
> > >> >     PRIMARY KEY ((c, b, bucket), ts)
> > >> > ) WITH CLUSTERING ORDER BY (ts DESC)
> > >> >     AND bloom_filter_fp_chance = 0.01
> > >> >     AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'tombstone_threshold': '0.9', 'unchecked_tombstone_compaction': 'false'}
> > >> >     AND dclocal_read_repair_chance = 0.0
> > >> >     AND default_time_to_live = 63072000
> > >> >     AND gc_grace_seconds = 10800
> > >> >     ...
> > >> >     AND read_repair_chance = 0.0
> > >> >     AND speculative_retry = 'NONE';
> > >> >
> > >> > On Wed, Apr 17, 2019 at 12:25 PM Stefan Miklosovic <stefan.mikloso...@instaclustr.com> wrote:
> > >> >>
> > >> >> What is your bloom_filter_fp_chance for either table? I guess it is
> > >> >> bigger for the first one; the bigger that number is (between 0 and 1),
> > >> >> the less memory it will use (17 MiB against 54.9 MiB), which means
> > >> >> the more false positives you will get.
> > >> >>
> > >> >> On Wed, 17 Apr 2019 at 19:59, Martin Mačura <m.mac...@gmail.com> wrote:
> > >> >> >
> > >> >> > Hi,
> > >> >> > I have a table with poor bloom filter false ratio:
> > >> >> >     SSTable count: 1223
> > >> >> >     Space used (live): 726.58 GiB
> > >> >> >     Number of partitions (estimate): 8592749
> > >> >> >     Bloom filter false positives: 35796352
> > >> >> >     Bloom filter false ratio: 0.68472
> > >> >> >     Bloom filter space used: 17.82 MiB
> > >> >> >     Compacted partition maximum bytes: 386857368
> > >> >> >
> > >> >> > It's a time series, TWCS compaction, window size 1 day, data
> > >> >> > partitioned in daily buckets, TTL 2 years.
> > >> >> >
> > >> >> > I have another table with a similar schema, but it is not
> > >> >> > affected for some reason:
> > >> >> >     SSTable count: 1114
> > >> >> >     Space used (live): 329.87 GiB
> > >> >> >     Number of partitions (estimate): 25460768
> > >> >> >     Bloom filter false positives: 156942
> > >> >> >     Bloom filter false ratio: 0.00010
> > >> >> >     Bloom filter space used: 54.9 MiB
> > >> >> >     Compacted partition maximum bytes: 20924300
> > >> >> >
> > >> >> > Thanks for any advice,
> > >> >> >
> > >> >> > Martin
> > >> >>
> > >> >> ---------------------------------------------------------------------
> > >> >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > >> >> For additional commands, e-mail: user-h...@cassandra.apache.org
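
For reference, here is a rough sketch of the memory side of the trade-off discussed in this thread (a smaller bloom_filter_fp_chance costs more filter memory). It uses the textbook sizing formula for an optimal Bloom filter, bits per key = -ln(p) / (ln 2)^2. That formula is an assumption for illustration and is not necessarily how Cassandra sizes its filters, and the function names are hypothetical.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Textbook optimal Bloom filter size, in bits per key, for a target
    false-positive chance. Assumption: not Cassandra's actual sizing code."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def growth_factor(old_fp: float, new_fp: float) -> float:
    """How much larger the filter becomes when fp_chance changes from old_fp to new_fp."""
    return bits_per_key(new_fp) / bits_per_key(old_fp)


if __name__ == "__main__":
    for p in (0.01, 0.001):
        print(f"fp_chance={p}: {bits_per_key(p):.2f} bits per key")

    factor = growth_factor(0.01, 0.001)
    print(f"growth factor 0.01 -> 0.001: {factor:.2f}")

    # 17.82 MiB is the "Bloom filter space used" reported at fp_chance = 0.01.
    print(f"estimated size at 0.001: {17.82 * factor:.1f} MiB")
```

Under that assumption the filter should grow by a factor of about 1.5, which is roughly in line with the 24.7 to 27.78 MiB reported per node after the change, compared with 17.82 MiB before it. The false ratio stayed high even though the filters grew.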