[
https://issues.apache.org/jira/browse/CASSANDRA-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18042880#comment-18042880
]
Dmitry Konstantinov commented on CASSANDRA-20166:
-------------------------------------------------
h3. Load
Table: 1 text partition key column, 1 text clustering column, 5 text value
columns. Inserts are done in 10-row batches.
cassandra-stress "user profile=./batch_profile.yaml no-warmup
ops(insert=1,partition-select=0) n=10m" -rate threads=100 -node <IP>
h3. Test environment
1 cassandra server node = m8i.4xlarge (16 vCPU, x86_64, 64 GiB RAM, EBS)
cassandra-stress = c5.9xlarge
h3. Profiling data
The memory allocation profile was collected with async-profiler (-e alloc).
Allocation profile before: [^CASSANDRA-20166_before_alloc.html]
HeapByteBuffer, as the allocated class, accounts for 11.17% of allocations.
!heap_allocations_profile_before.png|width=540!
Allocation profile after: [^CASSANDRA-20166_after_alloc.html]
HeapByteBuffer drops to 3.67% of allocations (we still allocate it for
partition and clustering keys; adjusting their parsing logic to use byte[]
is too complicated).
!image-2025-12-04-19-25-48-624.png|width=540!
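To make the difference between the two profiles concrete, here is a minimal sketch of decoding a length-prefixed value either into a bare byte[] or into a byte[] wrapped in a ByteBuffer. It is not the code from the patch: the class ValueDecodeSketch, its methods, and the use of java.nio.ByteBuffer as the input are all hypothetical and chosen only to keep the example self-contained. The only protocol detail assumed is that a value is a 4-byte length followed by that many bytes, with a negative length meaning null.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class ValueDecodeSketch {

    // Copies the next length-prefixed value into a fresh byte[].
    // One allocation per value: the array itself.
    static byte[] readValueAsArray(ByteBuffer in) {
        int length = in.getInt();
        if (length < 0) {
            return null; // a negative length encodes a null value
        }
        byte[] value = new byte[length];
        in.get(value);
        return value;
    }

    // Same decoding, but the array is wrapped in a ByteBuffer.
    // Two allocations per value: the array plus the HeapByteBuffer wrapper.
    static ByteBuffer readValueAsBuffer(ByteBuffer in) {
        byte[] value = readValueAsArray(in);
        return value == null ? null : ByteBuffer.wrap(value);
    }

    // Test helper: encodes strings as consecutive length-prefixed UTF-8 values.
    static ByteBuffer encode(String... values) {
        byte[][] encoded = new byte[values.length][];
        int size = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += Integer.BYTES + encoded[i].length;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] e : encoded) {
            out.putInt(e.length).put(e);
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer frame = encode("0123456789", "abcdefghij");
        ByteBuffer wrapped = readValueAsBuffer(frame); // byte[] + wrapper object
        byte[] raw = readValueAsArray(frame);          // byte[] only
        System.out.println(wrapped.remaining() + " " + raw.length); // prints: 10 10
    }
}
```

Both methods return the same 10 bytes; the second simply skips the wrapper object, which is the kind of allocation the after profile shows less of for cell values.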
> Avoid ByteBuffer allocation during decoding of prepared CQL write requests
> --------------------------------------------------------------------------
>
> Key: CASSANDRA-20166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20166
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: CQL/Interpreter
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-20166-trunk_ci_summary.htm,
> CASSANDRA-20166-trunk_results_details.tar.xz,
> CASSANDRA-20166_after_alloc.html, CASSANDRA-20166_before_alloc.html,
> async_profiler_alloc.png, heap_allocations_profile_before.png,
> image-2024-12-26-17-33-39-031.jpg, image-2024-12-26-17-33-39-031.png,
> image-2024-12-26-17-35-05-485.png, image-2025-12-04-19-25-48-624.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> A lot of ByteBuffer objects are allocated when we decode CQL queries;
> frequently the space spent on such objects is larger than the actual amount
> of data received.
> A similar optimization (use byte[] directly and wrap it in ArrayCell
> instead of BufferCell) was done some time ago for the place where a
> Mutation object is deserialized during Cassandra cross-node communication
> or when read from disk: CASSANDRA-15393
> While completely replacing ByteBuffer with byte[] in the CQL decoding
> step looks like a very complex task (ByteBuffer is part of too many
> entities involved in CQL parsing), we can optimize 20% of the logic to get
> 80% of the benefit by focusing only on batch and modification statements
> where prepared statements are used and cell values are provided as bind
> variables.
> !image-2024-12-26-17-33-39-031.jpg|width=570!
> For the 10-character String values I used in a test, the wrapping ByteBuffer
> objects are costlier than the inner byte[] holding the data:
>
> !image-2024-12-26-17-35-05-485.png|width=570!
> Async profiler (-e alloc) view:
> !async_profiler_alloc.png|width=570!