[ https://issues.apache.org/jira/browse/CASSANDRA-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016628#comment-17016628 ]
Thomas Steinmaurer commented on CASSANDRA-15430: ------------------------------------------------ [~benedict], please try to download the JFR files for both 2.1.18 and 3.0.18 here: [https://dynatrace-my.sharepoint.com/:f:/p/thomas_steinmaurer/EoFkdBH-WnlOmuGZ4hL_8PwByBTQLwhtlBGBLW_0y3P9rg?e=uKlr6W] The data model is pretty straightforward originating from Astyanax/Thrift legacy days, moving over to CQL, in a BLOB-centric model, with our client-side "serializer framework". E.g.: {noformat} CREATE TABLE ks."cf" ( k blob, n blob, v blob, PRIMARY KEY (k, n) ) WITH COMPACT STORAGE AND CLUSTERING ORDER BY (n ASC) AND bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '2'} AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.0 AND default_time_to_live = 0 AND gc_grace_seconds = 259200 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = 'NONE'; {noformat} Regarding queries. It is really just about the write path (batch message processing) in Cas 2.1 vs. 3.0 as outlined in the issue description. We have tried single-partition batches vs. multi-partition batches (I know, bad practice), but single-partition batches didn't have a positive impact on the write path in 3.0 either in our tests. Moving from 2.1 to 3.0 would mean for us to add ~ 30-40% more resources to handle the same load sufficiently. Thanks for any help in that area! > Cassandra 3.0.18: BatchMessage.execute - 10x more on-heap allocations > compared to 2.1.18 > ---------------------------------------------------------------------------------------- > > Key: CASSANDRA-15430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15430 > Project: Cassandra > Issue Type: Bug > Reporter: Thomas Steinmaurer > Priority: Normal > Attachments: dashboard.png, jfr_allocations.png, mutation_stage.png > > > In a 6 node loadtest cluster, we have been running with 2.1.18 a certain > production-like workload constantly and sufficiently. After upgrading one > node to 3.0.18 (remaining 5 still on 2.1.18 after we have seen that sort of > regression described below), 3.0.18 is showing increased CPU usage, increase > GC, high mutation stage pending tasks, dropped mutation messages ... > Some spec. All 6 nodes equally sized: > * Bare metal, 32 physical cores, 512G RAM > * Xmx31G, G1, max pause millis = 2000ms > * cassandra.yaml basically unchanged, thus same settings in regard to number > of threads, compaction throttling etc. > Following dashboard shows highlighted areas (CPU, suspension) with metrics > for all 6 nodes and the one outlier being the node upgraded to Cassandra > 3.0.18. > !dashboard.png|width=1280! > Additionally we see a large increase on pending tasks in the mutation stage > after the upgrade: > !mutation_stage.png! > And dropped mutation messages, also confirmed in the Cassandra log: > {noformat} > INFO [ScheduledTasks:1] 2019-11-15 08:24:24,780 MessagingService.java:1022 - > MUTATION messages were dropped in last 5000 ms: 41552 for internal timeout > and 0 for cross node timeout > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,157 StatusLogger.java:52 - Pool > Name Active Pending Completed Blocked All Time > Blocked > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,168 StatusLogger.java:56 - > MutationStage 256 81824 3360532756 0 > 0 > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,168 StatusLogger.java:56 - > ViewMutationStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,168 StatusLogger.java:56 - > ReadStage 0 0 62862266 0 > 0 > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,169 StatusLogger.java:56 - > RequestResponseStage 0 0 2176659856 0 > 0 > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,169 StatusLogger.java:56 - > ReadRepairStage 0 0 0 0 > 0 > INFO [ScheduledTasks:1] 2019-11-15 08:24:25,169 StatusLogger.java:56 - > CounterMutationStage 0 0 0 0 > 0 > ... > {noformat} > Judging from a 15min JFR session for both, 3.0.18 and 2.1.18 on a different > node, high-level, it looks like the code path underneath > {{BatchMessage.execute}} is producing ~ 10x more on-heap allocations in > 3.0.18 compared to 2.1.18. > !jfr_allocations.png! > Left => 3.0.18 > Right => 2.1.18 > JFRs zipped are exceeding the 60MB limit to directly attach to the ticket. I > can upload them, if there is another destination available. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org