According to this: https://issues.apache.org/jira/browse/CASSANDRA-5029
Bloom filter is still on by default for LCS in 1.2.X.

Thanks.
-Wei

________________________________
From: "Hiller, Dean" <dean.hil...@nrel.gov>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Sent: Monday, March 4, 2013 10:42 AM
Subject: Re: Poor read latency

Recommended settings are 8G RAM, and your memory grows with the number of rows through index samples (configured in cassandra.yaml as samples per row something…look for the word index). Also, bloomfilters grow with RAM if using size tiered compaction. We are actually trying to switch to leveled compaction in 1.2.2, as I think the default is no bloomfilters, since LCS does not "really" need them; I think 90% of rows are in the highest tier (but this just works better for certain types of profiles, like very heavy reads vs. the number of writes).

Later,
Dean

From: Tom Martin <tompo...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, March 4, 2013 11:20 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Poor read latency

Yeah, I just checked and the heap size 0.75 warning has been appearing. nodetool info reports:

Heap Memory (MB) : 563.88 / 1014.00
Heap Memory (MB) : 646.01 / 1014.00
Heap Memory (MB) : 639.71 / 1014.00

We have plenty of free memory on each instance. Do we need bigger instances, or should we just configure each node to have a bigger max heap?

On Mon, Mar 4, 2013 at 6:10 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:

What does nodetool info say for your memory? (We hit that one with memory near the max and it slowed down our system big time…still working on resolving it too.) Do any logs have the hit 0.75, running compaction OR worse hit 0.85 running compaction….you typically get that if the above is the case.
Dean

From: Tom Martin <tompo...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, March 4, 2013 10:31 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Poor read latency

Hi all,

We have a small (3 node) cassandra cluster on aws. We have a replication factor of 3, a read level of local_quorum and are using the ephemeral disk. We're getting pretty poor read performance and quite high read latency in cfstats. For example:

Column Family: AgentHotel
SSTable count: 4
Space used (live): 829021175
Space used (total): 829021175
Number of Keys (estimate): 2148352
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 67204
Read Latency: 23.813 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Positives: 50
Bloom Filter False Ratio: 0.00201
Bloom Filter Space Used: 7635472
Compacted row minimum size: 259
Compacted row maximum size: 4768
Compacted row mean size: 873

For comparison we have a similar set up in another cluster for an old project (hosted on rackspace) where we're getting sub 1ms read latencies. We are using multigets on the client (Hector) but are only requesting ~40 rows per request on average. I feel like we should reasonably expect better performance but perhaps I'm mistaken. Is there anything super obvious we should be checking out?
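
As a rough aid for reading the cfstats output in the original message, the sketch below turns a few of the quoted fields into per-key ratios. It is only an illustration under stated assumptions: that "Bloom Filter Space Used" and "Space used (live)" are reported in bytes, and that the key count is the per-node estimate cfstats prints. The helper name parse_cfstats and the sample string are hypothetical and not part of any Cassandra tool; the sample is a subset of the fields copied from the message.

```python
import re

# Subset of the cfstats fields quoted in the original message.
CFSTATS = """\
Column Family: AgentHotel
SSTable count: 4
Space used (live): 829021175
Number of Keys (estimate): 2148352
Read Count: 67204
Read Latency: 23.813 ms.
Bloom Filter Space Used: 7635472
Compacted row mean size: 873
"""


def parse_cfstats(text):
    """Hypothetical helper: map 'name: value' lines to floats.

    Lines whose value does not start with a number (for example the
    column family name) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        match = re.match(r"\s*(-?\d+(?:\.\d+)?)", value)
        if match:
            stats[key.strip()] = float(match.group(1))
    return stats


stats = parse_cfstats(CFSTATS)
keys = stats["Number of Keys (estimate)"]

# Assumption: both space figures are in bytes.
bloom_bits_per_key = stats["Bloom Filter Space Used"] * 8 / keys
live_bytes_per_key = stats["Space used (live)"] / keys

print(f"bloom filter bits per key: {bloom_bits_per_key:.2f}")
print(f"live bytes per key:        {live_bytes_per_key:.2f}")
```

Because the key count is itself an estimate, these ratios are only ballpark figures for comparing nodes or column families; they do not by themselves explain the read latency.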