ReadStage filling up and leading to Read Timeouts

Rajsekhar Mallick Tue, 05 Feb 2019 21:31:07 -0800

Hello Team,

Cluster Details:
1. Number of Nodes in cluster : 7
2. Number of CPU cores: 48
3. Swap is enabled on all nodes
4. Memory available on all nodes : 120GB 
5. Disk space available : 745GB
6. Cassandra version: 2.1
7. Active tables are using size-tiered compaction strategy
8. Read Throughput: 6000 reads/s on each node (42000 reads/s cluster wide)
9. Read latency 99%: 300 ms
10. Write Throughput : 1800 writes/s
11. Write Latency 99%: 50 ms
12. Known issues in the cluster: large partitions (up to 560 MB, observed when 
they get compacted) and tombstones
13. To reduce the impact of tombstones, gc_grace_seconds is set to 0 for the 
active tables
14. Heap size: 48 GB G1GC
15. Read timeout: 5000 ms, write timeout: 2000 ms
16. Number of concurrent reads: 64
17. Number of connections from clients on port 9042 stays almost constant 
(close to 1800)
18. Cassandra thread count also stays almost constant (close to 2000)
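
As a rough sanity check on the numbers above, here is a back-of-the-envelope 
sketch using Little's law (concurrency = throughput x mean service time). It 
assumes every one of the 6000 reads/s per node occupies one ReadStage thread 
on that node, which ignores replication factor and consistency level (not 
stated here). The function names are made up for illustration, and the 300 ms 
figure is the 99th percentile from item 9, used only as a hypothetical mean.

```python
# Little's law sketch: L = lambda * W, with L = busy threads, lambda = reads/s,
# W = mean service time. Helper names are illustrative, not Cassandra APIs.

def service_time_budget_ms(threads: int, reads_per_sec: float) -> float:
    """Mean per-read service time (ms) at which `threads` workers are exactly saturated."""
    return threads / reads_per_sec * 1000.0


def max_reads_per_sec(threads: int, mean_service_time_ms: float) -> float:
    """Highest sustainable read rate for `threads` workers at a given mean service time."""
    return threads / (mean_service_time_ms / 1000.0)


CONCURRENT_READS = 64          # item 16
READS_PER_SEC_PER_NODE = 6000  # item 8

budget_ms = service_time_budget_ms(CONCURRENT_READS, READS_PER_SEC_PER_NODE)
print(f"Mean service time budget per read: {budget_ms:.1f} ms")

# Hypothetical: if reads on a node averaged 300 ms (the p99 in item 9),
# 64 threads could sustain far fewer reads than the node receives.
slow_rate = max_reads_per_sec(CONCURRENT_READS, 300.0)
print(f"Sustainable rate at 300 ms mean: {slow_rate:.0f} reads/s")
```

Under these assumptions, the pool only keeps up while the mean read stays 
around 10 ms, so even a modest share of slow reads on a node could be enough 
to make the pending queue grow.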


Problem Statement:
1. ReadStage often gets full (reaches max size 64) on 2 to 3 nodes and pending 
reads go up to 4000.
2. When the above happens, Native-Transport-Stage gets full on neighbouring 
nodes (1024 max) and pending threads are also observed.
3. During this time, the CPU load average rises and user % for the Cassandra 
process reaches 90%.
4. We see reads getting dropped, and errors from the 
org.apache.cassandra.transport package about reads timing out.
5. 99% read latency reaches 5 seconds and clients start seeing the impact.
6. No iowait is observed on any of the virtual cores; the sjk ttop command 
shows the most us% being used by “Worker Threads”.
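
A minimal sketch of how the saturation described above could be flagged from 
`nodetool tpstats` output, assuming the usual layout of a pool name followed 
by five numeric columns (Active, Pending, Completed, Blocked, All time 
blocked). The sample text is illustrative: Active and Pending for ReadStage 
are taken from item 1, while the other rows and the Completed counts are 
made up. The function name and the threshold are hypothetical.

```python
# Flag thread pools whose pending count exceeds a threshold.
# SAMPLE is illustrative text, not a capture from the cluster.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                        64      4000        1000000         0                 0
MutationStage                     0         0         500000         0                 0
"""


def saturated_pools(tpstats_text, pending_threshold=100):
    """Return (pool, active, pending) for rows with pending >= pending_threshold."""
    flagged = []
    for line in tpstats_text.splitlines():
        parts = line.split()
        # Pool rows have a name plus five integer columns; the header and
        # any other sections do not match this shape and are skipped.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            name, active, pending = parts[0], int(parts[1]), int(parts[2])
            if pending >= pending_threshold:
                flagged.append((name, active, pending))
    return flagged


for name, active, pending in saturated_pools(SAMPLE):
    print(f"{name}: active={active} pending={pending}")
```

Run periodically against each node, something like this would show which 
nodes saturate first and whether it is always the same 2 to 3 nodes.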

I have been trying hard to zero in on the exact issue.
What I make of the above observations is that there might be some slow 
queries which get stuck on a few nodes.
Then there is a cascading effect wherein other queries queue up behind them.
I have been unable to identify any such slow queries so far.
As I mentioned, there are large partitions. We are using the size-tiered 
compaction strategy, hence a large partition might be spread across multiple 
SSTables. Can this lead to slow queries? My understanding is also that data in 
SSTables is stored in serialized format and is deserialized when read into 
memory. This would lead to a large object in memory, which then needs to be 
transferred across the wire to the client.

I am not sure what the reason might be. Kindly help me understand what the 
impact on read performance might be when we have large partitions.
Kindly suggest ways to catch these slow queries.
Also, do add if you see any other issues in the above details.
We are now considering expanding our cluster. Is the cluster under-sized? Will 
adding nodes help resolve the issue?

Thanks,
Rajsekhar Mallick




