Hello Team,

Cluster Details:
1. Number of nodes in cluster: 7
2. Number of CPU cores: 48
3. Swap is enabled on all nodes
4. Memory available on all nodes: 120 GB
5. Disk space available: 745 GB
6. Cassandra version: 2.1
7. Active tables are using size-tiered compaction strategy
8. Read throughput: 6000 reads/s on each node (42000 reads/s cluster wide)
9. Read latency 99%: 300 ms
10. Write throughput: 1800 writes/s
11. Write latency 99%: 50 ms
12. Known issues in the cluster: large partitions (up to 560 MB, observed when they get compacted) and tombstones
13. To reduce the impact of tombstones, gc_grace_seconds is set to 0 for the active tables
14. Heap size: 48 GB, G1GC
15. Read timeout: 5000 ms, write timeout: 2000 ms
16. Number of concurrent reads: 64
17. Number of connections from clients on port 9042 stays almost constant (close to 1800)
18. Cassandra thread count also stays almost constant (close to 2000)
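As a rough sanity check on items 8 and 16 above, here is a small Python sketch that applies Little's law (concurrency = throughput x time in service) to the read thread pool. It assumes that all 6000 reads/s on a node are served by that node's 64 concurrent read threads, and it ignores coordinator overhead and replication fan-out, so it is only a back-of-the-envelope estimate. The function names are made up for illustration.

```python
def max_mean_service_time_ms(concurrent_reads: int, reads_per_second: float) -> float:
    """Largest mean per-read service time (ms) the pool can sustain.

    Little's law: L = lambda * W, so W = L / lambda.
    """
    return concurrent_reads / reads_per_second * 1000.0


def pending_growth_per_second(concurrent_reads: int,
                              reads_per_second: float,
                              mean_service_time_ms: float) -> float:
    """Rate at which the pending queue grows once reads become slower.

    The pool completes at most concurrent_reads / service_time reads per
    second; anything arriving beyond that capacity accumulates as pending.
    """
    capacity = concurrent_reads / (mean_service_time_ms / 1000.0)
    return max(0.0, reads_per_second - capacity)


if __name__ == "__main__":
    # Figures from the cluster details: 64 concurrent reads, 6000 reads/s per node.
    print(round(max_mean_service_time_ms(64, 6000), 1), "ms sustainable mean service time")
    # Hypothetical slowdown: mean service time doubles to 20 ms.
    print(round(pending_growth_per_second(64, 6000, 20.0)), "reads/s added to pending")
```

Under these assumptions the pool keeps up only while the mean read takes less than about 10.7 ms on the node, so even a modest number of slow reads could fill all 64 slots and let pending reads build up quickly.
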
Problem Statement:
1. ReadStage often fills up (reaches its maximum size of 64) on 2 to 3 nodes, and pending reads go up to 4000.
2. When this happens, Native-Transport-Stage fills up on neighbouring nodes (1024 max), and pending tasks are also observed there.
3. During this time, the CPU load average rises and user % for the Cassandra process reaches 90%.
4. We see reads being dropped, and read-timeout errors from the org.apache.cassandra.transport package are seen.
5. Read latency 99% reaches 5 seconds, and clients start seeing an impact.
6. No iowait is observed on any of the virtual cores; the sjk ttop command shows the highest us% being used by “Worker Threads”.

I have been trying hard to zero in on the exact issue. What I make of the above observations is that there might be some slow queries that get stuck on a few nodes, followed by a cascading effect in which other queries queue up behind them. I have been unable to identify any such slow queries so far.

As I mentioned, there are large partitions. We are using the size-tiered compaction strategy, so a large partition might be spread across multiple SSTables. Can this lead to slow queries? I also understand that data in SSTables is stored in a serialized format and is deserialized when read into memory. This would lead to a large object in memory, which then needs to be transferred across the wire to the client.

I am not sure what the reason might be. Kindly help me understand what the impact on read performance might be when we have large partitions, and suggest ways to catch these slow queries. Please also point out any other issues you see in the above details.

We are now considering expanding our cluster. Is the cluster undersized? Will adding nodes help resolve the issue?

Thanks,
Rajsekhar Mallick