Hello Jeff,

Thanks for the reply.
We do have GC logs enabled.
We do observe GC pauses of up to 2 seconds, but quite often we see this issue
even when the GC logs look clean.

JVM Flags related to G1GC:

-Xms48G
-Xmx48G
-XX:MaxGCPauseMillis=200
-XX:ParallelGCThreads=32
-XX:ConcGCThreads=10
-XX:InitiatingHeapOccupancyPercent=50

You mentioned dropping the application page size. Could you please elaborate on
how to change it?
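Is it just a matter of lowering the fetch size on the driver side? Assuming
the application uses the DataStax Java driver (3.x shown below; the contact
point, keyspace and table names are placeholders), something roughly like:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagingExample {
    public static void main(String[] args) {
        // Cluster-wide default page size (the driver default is 5000 rows per page).
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")                        // placeholder contact point
                .withQueryOptions(new QueryOptions().setFetchSize(500))
                .build();
        Session session = cluster.connect("my_keyspace");            // placeholder keyspace

        // Or override the page size only for the heavy wide-partition reads.
        Statement stmt = new SimpleStatement(
                "SELECT col1 FROM my_table WHERE partition_key = ?", "some_key"); // placeholder table
        stmt.setFetchSize(500);
        ResultSet rs = session.execute(stmt);
        for (Row row : rs) {
            // process row; the driver fetches subsequent pages transparently
        }

        cluster.close();
    }
}

Please correct me if the paging you meant is something different.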
We have already tried reducing concurrent reads to 32 and it does help:
the CPU load average stays under the threshold, but read timeouts
keep happening.

We will definitely try increasing the key cache sizes after verifying the
current max heap usage in the cluster.
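For checking the current heap usage on each node, one option besides nodetool
info would be to poll it over JMX. A rough sketch of that (the node address is
a placeholder; 7199 is Cassandra's default JMX port):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class HeapCheck {
    public static void main(String[] args) throws Exception {
        // Connect to one Cassandra node's JMX endpoint.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://10.0.0.1:7199/jmxrmi"); // placeholder node address
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            // Read the JVM's heap usage from the standard java.lang:type=Memory MBean.
            CompositeData heap = (CompositeData) mbsc.getAttribute(
                    new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
            System.out.printf("heap used=%d MB, max=%d MB%n",
                    (Long) heap.get("used") / (1024 * 1024),
                    (Long) heap.get("max") / (1024 * 1024));
        }
    }
}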

Thanks,
Rajsekhar Mallick

On Wed, 6 Feb, 2019, 11:17 AM Jeff Jirsa <jji...@gmail.com> wrote:

> What you're potentially seeing is the GC impact of reading a large
> partition - do you have GC logs or StatusLogger output indicating you're
> pausing? What are the actual JVM flags you're using?
>
> Given your heap size, the easiest mitigation may be significantly
> increasing your key cache size (up to a gigabyte or two, if needed).
>
> Yes, when you read data, it's materialized in memory (iterators from each
> sstable are merged and sent to the client), so reading lots of rows from a
> wide partition can cause GC pressure just from materializing the responses.
> Dropping your application's paging size could help if this is the problem.
>
> You may be able to drop concurrent reads from 64 to something lower
> (potentially 48 or 32, given your core count) to mitigate GC impact from
> lots of objects when you have a lot of concurrent reads, or consider
> upgrading to 3.11.4 (when it's out) to take advantage of CASSANDRA-11206
> (which made reading wide partitions less expensive). STCS especially won't
> help here - a large partition may be larger than you think, if it's
> spanning a lot of sstables.
>
>
>
>
> On Tue, Feb 5, 2019 at 9:30 PM Rajsekhar Mallick <raj.mallic...@gmail.com>
> wrote:
>
>> Hello Team,
>>
>> Cluster Details:
>> 1. Number of Nodes in cluster : 7
>> 2. Number of CPU cores: 48
>> 3. Swap is enabled on all nodes
>> 4. Memory available on all nodes: 120GB
>> 5. Disk space available: 745GB
>> 6. Cassandra version: 2.1
>> 7. Active tables are using size-tiered compaction strategy
>> 8. Read Throughput: 6000 reads/s on each node (42000 reads/s cluster wide)
>> 9. Read latency 99%: 300 ms
>> 10. Write Throughput : 1800 writes/s
>> 11. Write Latency 99%: 50 ms
>> 12. Known issues in the cluster: large partitions (up to 560MB, observed
>> when they get compacted) and tombstones
>> 13. To reduce the impact of tombstones, gc_grace_seconds set to 0 for the
>> active tables
>> 14. Heap size: 48 GB G1GC
>> 15. Read timeout: 5000ms, Write timeout: 2000ms
>> 16. Number of concurrent reads: 64
>> 17. Number of connections from clients on port 9042 stays almost constant
>> (close to 1800)
>> 18. Cassandra thread count also stays almost constant (close to 2000)
>>
>> Problem Statement:
>> 1. ReadStage often gets full (reaches its max size of 64) on 2 to 3 nodes and
>> pending reads go up to 4000.
>> 2. When the above happens, Native-Transport-Stage gets full on
>> neighbouring nodes (1024 max) and pending threads are also observed.
>> 3. During this time, the CPU load average rises and user % for the Cassandra
>> process reaches 90%.
>> 4. We see READ messages getting dropped, and errors from the
>> org.apache.cassandra.transport package about reads timing out.
>> 5. Read latency 99% reaches 5 seconds and clients start seeing the impact.
>> 6. No iowait is observed on any of the virtual cores; the sjk ttop command
>> shows the max us% being used by “Worker Threads”
>>
>> I have been trying hard to zero in on what the exact issue is.
>> What I make of the above observations is that there might be some slow
>> queries which get stuck on a few nodes.
>> Then there is a cascading effect wherein other queries get queued up behind them.
>> I have been unable to identify any such slow queries so far.
>> As I mentioned, there are large partitions. We are using the size-tiered
>> compaction strategy, hence a large partition might be spread across
>> multiple sstables.
>> Can this lead to slow queries? My understanding is also that data
>> in sstables is stored in serialized form and, when read into memory, it is
>> deserialized. This would lead to a large object in memory which then needs
>> to be transferred across the wire to the client.
>>
>> I am not sure what the reason might be. Kindly help me understand
>> what the impact on read performance might be when we have large partitions.
>> Kindly suggest ways to catch these slow queries.
>> Also, please point out any other issues you see in the above details.
>> We are now considering expanding our cluster. Is the cluster under-sized?
>> Will the addition of nodes help resolve the issue?
>>
>> Thanks,
>> Rajsekhar Mallick
>>
