[
https://issues.apache.org/jira/browse/CASSANDRA-21141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-21141:
--------------------------------------------
Attachment: CASSANDRA-21141_alloc.html
CASSANDRA-21141_wall.html
CASSANDRA-21141_cpu.html
> Reduce memory allocation during transformation of BatchStatement to Mutation
> ----------------------------------------------------------------------------
>
> Key: CASSANDRA-21141
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21141
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: CQL/Interpreter
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-21141_alloc.html, CASSANDRA-21141_cpu.html,
> CASSANDRA-21141_wall.html, image-2026-01-28-09-39-38-183.png,
> trunk_alloc.html, trunk_cpu.html, trunk_wall.html
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We allocate a lot of objects during the transformation of a BatchStatement to
> a Mutation. In many typical scenarios we can take a fast path and reduce the
> number of allocated objects (as well as make the corresponding logic faster).
> Allocation flame graph:
> !image-2026-01-28-09-39-38-183.png|width=600!
> [^trunk_alloc.html]
> [^trunk_cpu.html]
> [^trunk_wall.html]
> Suggested optimisations:
> * Force inlining of hash3_x64_128 so that the JIT's escape analysis can
> eliminate the long[] heap allocation, i.e. the hash function value (long[2])
> is not allocated on the heap -
> [link|https://github.com/apache/cassandra/pull/4589/changes/02d5ae650c9581ea061fb1255e2078a278697b6d]
> * serializedRowBodySize: avoid allocating a capturing lambda per cell by
> moving the captured arguments to SerializationHelper (the same optimization
> was done in serializeRowBody for flushing some time ago) -
> [link|https://github.com/apache/cassandra/pull/4589/changes/34a3d7126351630eb91be1ba9546a6e3c84d9359]
> * UpdateParameters: allocate DeletionTime on demand (it is not needed for
> inserts/updates) -
> [link|https://github.com/apache/cassandra/pull/4589/changes/f8f57ea14f0c40fabb0f049a79146f403c88a009]
> * Add fast path in valuesAsClustering logic for the typical scenario when we
> specify a single clustering key (a single row) to modify -
> [link|https://github.com/apache/cassandra/pull/4589/changes/e11961cf457a4545951cbfa0d20e2b929d5ae453]
> * Add fast path in nonTokenRestrictionValues logic for the typical scenario
> when we specify a single partition key (a single row) to modify; also
> optimize the case where a partition or clustering key is a single column -
> [link|https://github.com/apache/cassandra/pull/4589/changes/b7fe9cc34c0a6c0c3d20b12fc2ccd8a11f98f460]
> * BatchStatement: check whether many similar rows for the same table are
> written unconditionally; in this case we can avoid merging column info and
> allocating builders -
> [link|https://github.com/apache/cassandra/pull/4589/changes/d011dfa68b88fa2d52c9a661d4945c719febf1d5]
> * Avoid ClusteringIndexSliceFilter allocation if a write does not require a
> read (a plain write), avoid iterator allocation, and use an array instead of
> an ArrayList for perStatementOptions, which does not grow dynamically -
> [link|https://github.com/apache/cassandra/pull/4589/changes/0e4abb36105457d7f5e630d6e8b40b560794ae2e]
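To make the serializedRowBodySize item above more concrete, here is a minimal sketch of the general technique only, not the code from the linked commit. All names (Helper, CellVisitor, ADD_SIZE, toySize) are hypothetical, and the size function is a toy. The idea: a lambda that captures local variables may need a fresh object every time it is evaluated, while a lambda that receives its state as an argument captures nothing and can be kept in a static field.

```java
import java.util.function.LongConsumer;

class NonCapturingLambdaSketch {
    // Hypothetical helper that already travels with the call
    // (a stand-in for a SerializationHelper-like object).
    static final class Helper {
        long baseTimestamp; // value the lambda would otherwise capture
        long total;         // running size
    }

    // Hypothetical callback shape: the state arrives as an argument.
    interface CellVisitor {
        void visit(long cellTimestamp, Helper helper);
    }

    // References no local variables, so it is non-capturing and is created once.
    static final CellVisitor ADD_SIZE =
        (cellTimestamp, helper) -> helper.total += toySize(cellTimestamp - helper.baseTimestamp);

    // Toy size function: bytes needed with 7 payload bits per byte.
    static int toySize(long delta) {
        int size = 1;
        while ((delta >>>= 7) != 0) size++;
        return size;
    }

    // Before: the lambda captures 'total' and 'base', so a new lambda
    // object may be created for every cell.
    static long capturing(long[] cells, long base) {
        long[] total = {0};
        for (long cell : cells) {
            LongConsumer add = ts -> total[0] += toySize(ts - base);
            add.accept(cell);
        }
        return total[0];
    }

    // After: the shared non-capturing lambda reads its inputs from the helper.
    static long nonCapturing(long[] cells, Helper helper) {
        helper.total = 0;
        for (long cell : cells)
            ADD_SIZE.visit(cell, helper);
        return helper.total;
    }
}
```

Both variants compute the same value; the second one simply moves the captured values into an object that the caller already owns. Whether the JIT would have removed the per-cell allocation anyway depends on inlining and escape analysis, so the real effect has to be confirmed with an allocation profile.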
> Forecasted reduction in heap allocations in a batch write test is ~21%:
> {code:java}
> Total GC memory : 347.198 GiB
> vs
> Total GC memory : 272.358 GiB
> {code}
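For the UpdateParameters item in the list above, on-demand allocation can be sketched as follows. This is only an illustration under assumptions: Marker and Params are hypothetical stand-ins, not Cassandra's DeletionTime or UpdateParameters, and the static counter exists only to make the effect visible.

```java
class LazyAllocationSketch {
    // Hypothetical stand-in for a deletion marker object.
    static final class Marker {
        static int created = 0; // demo-only allocation counter
        final long timestamp;
        final long localSeconds;

        Marker(long timestamp, long localSeconds) {
            this.timestamp = timestamp;
            this.localSeconds = localSeconds;
            created++;
        }
    }

    // Hypothetical per-statement parameters object.
    static final class Params {
        private final long timestamp;
        private final long nowInSeconds;
        private Marker deletion; // stays null until a delete actually needs it

        Params(long timestamp, long nowInSeconds) {
            this.timestamp = timestamp;
            this.nowInSeconds = nowInSeconds;
        }

        long timestamp() {
            return timestamp;
        }

        // Allocates the marker on first use and reuses it afterwards.
        Marker deletion() {
            if (deletion == null)
                deletion = new Marker(timestamp, nowInSeconds);
            return deletion;
        }
    }
}
```

A statement that only inserts or updates never calls deletion(), so it never pays for the marker. The sketch is not thread-safe, which is acceptable only if the parameters object is confined to one thread.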
>
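The single-key fast paths in the list above (valuesAsClustering, nonTokenRestrictionValues) follow a common pattern that can be sketched as below. The sketch assumes the general path must combine several candidate values per key column (for example with IN restrictions) and models it as a cartesian product. The class and method names are hypothetical, and the linked commits may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SingleKeyFastPathSketch {
    // Returns every key that can be formed from the candidate values of each column.
    static List<List<String>> keys(List<List<String>> valuesPerColumn) {
        // Fast path: exactly one value per column means exactly one key,
        // so no intermediate lists are needed.
        boolean single = true;
        for (List<String> values : valuesPerColumn) {
            if (values.size() != 1) {
                single = false;
                break;
            }
        }
        if (single) {
            List<String> key = new ArrayList<>(valuesPerColumn.size());
            for (List<String> values : valuesPerColumn)
                key.add(values.get(0));
            return Collections.singletonList(key);
        }
        return cartesian(valuesPerColumn);
    }

    // General path: builds the cartesian product column by column,
    // allocating a new list of partial keys at every step.
    static List<List<String>> cartesian(List<List<String>> valuesPerColumn) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> values : valuesPerColumn) {
            List<List<String>> next = new ArrayList<>(result.size() * values.size());
            for (List<String> prefix : result) {
                for (String value : values) {
                    List<String> key = new ArrayList<>(prefix);
                    key.add(value);
                    next.add(key);
                }
            }
            result = next;
        }
        return result;
    }
}
```

The fast path returns the same result as the general path for single-valued input and allocates one key list plus one singleton wrapper, instead of one intermediate list per column and per partial key.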