[
https://issues.apache.org/jira/browse/CASSANDRA-20176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17915849#comment-17915849
]
Dmitry Konstantinov commented on CASSANDRA-20176:
-------------------------------------------------
{quote}So, if you feel a burning desire to improve its allocation rate feel
free.{quote}
[~benedict] thank you for the feedback.
Cannot say that it is very burning :), it is more like an interesting puzzle to
solve (but I have a lot of other puzzles too). In any case, before going
further I want to check the memory allocation rate of the spin wait logic at
50-70% CPU usage, as you mentioned. If it is less than 5%, I will probably
pause for a while. If it is more than 5%, I will keep thinking in the
background about a possible alternative for the spin wait logic, which
currently uses CLSM as a concurrent heap structure (I am speaking about a
local change of the data structure used to keep spinning threads only, nothing
more; definitely not a major refactoring of SEP).
I would appreciate it if you could recommend somebody to review
concurrency-related stories like this (I have some other ideas) in the future.
P.S. If any help with performance testing/tuning for Accord is needed, I would
be happy to spend some time there.
> Reduce memory allocation in SEP Worker spin wait logic
> ------------------------------------------------------
>
> Key: CASSANDRA-20176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20176
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Local/Other
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Attachments: image-2025-01-01-13-14-02-562.png,
> image-2025-01-01-13-15-16-767.png, ttop_disabled_sep.txt, ttop_enabled_sep.txt
>
>
> There is visible memory allocation within the spin wait logic of the SEP
> Executor (org.apache.cassandra.concurrent.SEPWorker#doWaitSpin) for some
> workloads. For example, it is observed for the write test described in
> CASSANDRA-20165, where ~8.5% of total allocations come from this logic:
> !image-2025-01-01-13-14-02-562.png|width=570!
> !image-2025-01-01-13-15-16-767.png|width=570!
> The idea of this parking is to avoid unpark signalling costs. The logic
> selects a random time period to park a thread via LockSupport.parkNanos and
> puts the thread into a ConcurrentSkipListMap using the wake-up time as the
> key, so the map is used as a concurrent priority queue. Once the parking is
> finished, the thread removes itself from the map. When we need to schedule a
> task, we take the spinning thread with the smallest wake-up time from the map.
> We can try to implement another algorithm for this logic without the memory
> allocation overhead, for example one based on a Timing Wheel data structure.
> Note: it also makes sense to check the granularity of the actual parking time
> (https://hazelcast.com/blog/locksupport-parknanos-under-the-hood-and-the-curious-case-of-parking/)
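To make the quoted description concrete, below is a minimal Java sketch of a ConcurrentSkipListMap used as a concurrent priority queue of spinning threads, in the spirit of what the ticket describes. The class and method names (SpinRegistry, waitSpin, pollSoonest) are illustrative, not the actual SEPWorker code; the point is that each registration allocates a new skip-list node, which is the allocation overhead this ticket targets.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.LockSupport;

// Hypothetical, simplified model of the spinning-thread registry; not SEPWorker itself.
class SpinRegistry {
    // wake-up deadline (nanos) -> spinning thread; the CLSM orders entries by
    // deadline, so it behaves as a concurrent priority queue.
    final ConcurrentSkipListMap<Long, Thread> spinning = new ConcurrentSkipListMap<>();

    // Park the calling thread for a random period, registering it by deadline.
    void waitSpin(long maxParkNanos) {
        long sleep = ThreadLocalRandom.current().nextLong(1, maxParkNanos);
        long deadline = System.nanoTime() + sleep;
        // putIfAbsent allocates a new skip-list node per call; bump the key on collision.
        while (spinning.putIfAbsent(deadline, Thread.currentThread()) != null)
            deadline++;
        LockSupport.parkNanos(sleep);   // park roughly until the deadline
        spinning.remove(deadline);      // de-register once the park ends
    }

    // On task arrival: wake the spinner with the smallest wake-up time.
    Thread pollSoonest() {
        Map.Entry<Long, Thread> e = spinning.pollFirstEntry();
        if (e == null)
            return null;
        LockSupport.unpark(e.getValue());
        return e.getValue();
    }
}
```

The ordering guarantee is what matters here: pollFirstEntry always returns the entry with the numerically smallest key, i.e. the thread due to wake soonest.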
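For the alternative direction mentioned above, here is a rough sketch of a fixed-size timing wheel where a parking thread is CAS'ed directly into a pre-allocated bucket slot, so steady-state registration allocates nothing (unlike CLSM node allocation). This is only an illustration of the data structure under simplifying assumptions, not a proposed patch; collision handling and the expiry sweep are deliberately left out.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical allocation-free timing wheel sketch: a ring of buckets, each
// covering tickNanos; a deadline maps onto the ring modulo its length.
class TimingWheel {
    final AtomicReferenceArray<Thread> buckets; // pre-allocated slots, no per-register allocation
    final long tickNanos;
    final long startNanos;

    TimingWheel(int size, long tickNanos) {
        this.buckets = new AtomicReferenceArray<>(size);
        this.tickNanos = tickNanos;
        this.startNanos = System.nanoTime();
    }

    int bucketFor(long deadlineNanos) {
        long ticks = (deadlineNanos - startNanos) / tickNanos;
        return (int) (ticks % buckets.length()); // wrap around the ring
    }

    // Register a spinner; returns false if the slot is taken (a real implementation
    // would probe a neighbouring slot or fall back to plain parking).
    boolean register(long deadlineNanos, Thread t) {
        return buckets.compareAndSet(bucketFor(deadlineNanos), null, t);
    }

    // Remove and return whichever thread occupied the slot for this deadline.
    Thread clear(long deadlineNanos) {
        return buckets.getAndSet(bucketFor(deadlineNanos), null);
    }
}
```

The trade-off versus the CLSM is the usual one for timing wheels: O(1) allocation-free insert/remove, at the cost of bounded deadline resolution (tickNanos) and a collision policy for slots.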
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]