[
https://issues.apache.org/jira/browse/CASSANDRA-20176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17909159#comment-17909159
]
Dmitry Konstantinov commented on CASSANDRA-20176:
-------------------------------------------------
Hi [~benedict], thank you for taking a look and sharing your thoughts. Yes, I
agree that such a revision is a much better option; I have had this question in
mind for quite a long time too. The main problem is that I am afraid it could be
a very time-consuming story with an unclear finish :). At least that is the
impression I got after analyzing tickets like these some time ago:
https://issues.apache.org/jira/browse/CASSANDRA-10989
https://issues.apache.org/jira/browse/CASSANDRA-4718
https://issues.apache.org/jira/browse/CASSANDRA-16499
https://issues.apache.org/jira/browse/CASSANDRA-1632
So, I can help here, but it looks like we need to define clearer goals and
first steps to start with.
In this direction, I think virtual threads can be a good alternative in the long
term: using virtual threads can help avoid blocking native transport request
threads while awaiting coordination results. The code should also be much
simpler compared to, for example, a reactive approach, which in my experience is
a nightmare from a troubleshooting point of view...
The problems with virtual threads are:
- they are available only in the latest Java versions, while Cassandra is quite
conservative in adopting new versions..
- they are not very friendly with mutable thread locals, which we currently use
quite widely in the Cassandra code
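To illustrate the idea, here is a minimal sketch (assuming JDK 21+; the class and method names are mine, not anything from the Cassandra code base) of how a request handler on a virtual thread can block on a result cheaply, without pinning a platform thread:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadSketch {
    // Runs a blocking task on a virtual thread and reports whether the task
    // actually executed on a virtual thread (it should).
    static boolean runBlockingTask() throws Exception {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<Boolean> f = executor.submit(() -> {
                // Blocking here parks only the cheap virtual thread; the
                // carrier (platform) thread is released to run other tasks.
                Thread.sleep(10); // stand-in for awaiting a coordination result
                return Thread.currentThread().isVirtual();
            });
            return f.get();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("ran on a virtual thread: " + runBlockingTask());
    }
}
```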
Returning to the original story/idea: in production, people usually try not to
saturate their system under normal workload, so I suppose some amount of
spinning still takes place. While it is not an issue from a CPU consumption
point of view, the extra memory allocation may still have an impact on GC and
CPU cache efficiency.
8.5% of allocations does not look big, but the issue is that the overall memory
allocation is distributed across many such places; each one eats a bit, but in
sum such overheads can equal or even exceed the main allocations directly
related to actual data processing.
So, my idea was: if there is a cheap enough way to make a local change and get a
benefit without massive re-design/re-testing, it would be worthwhile. If that is
not an option, then I agree it is not worth it. I will try to think about it a
bit in the background, and if I find a simple way to do it, I will share it for
review.
A short version:
- for the current story, I will try to think a bit in the background to find a
simple local change without spending a lot of effort
- regarding the overall executor revision, I am ready to help with it, but I
think we need a plan for where to start
> Reduce memory allocation in SEP Worker spin wait logic
> ------------------------------------------------------
>
> Key: CASSANDRA-20176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20176
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Local/Other
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Attachments: image-2025-01-01-13-14-02-562.png,
> image-2025-01-01-13-15-16-767.png
>
>
> There is quite massive memory allocation within the spin-waiting logic in the
> SEP Executor (org.apache.cassandra.concurrent.SEPWorker#doWaitSpin) for some
> workloads. For example, it is observed for the write test described in
> CASSANDRA-20165, where ~8.5% of total allocations come from this logic:
> !image-2025-01-01-13-14-02-562.png|width=570!
> !image-2025-01-01-13-15-16-767.png|width=570!
> The idea of this parking is to avoid unpark signalling costs. The logic
> selects a random time period to park a thread via LockSupport.parkNanos and
> puts the thread into a ConcurrentSkipListMap using the wake-up time as the
> key, so the map is used as a concurrent priority queue. Once the parking is
> finished, the thread removes itself from the map. When we need to schedule a
> task, we take the spinning thread with the smallest wake-up time from the map.
> We can try to implement another algorithm for this logic without memory
> allocation overheads, for example one based on a timing wheel data structure.
> Note: it also makes sense to check the granularity of the actual parking time
> (https://hazelcast.com/blog/locksupport-parknanos-under-the-hood-and-the-curious-case-of-parking/)
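The parking pattern described above can be sketched roughly as follows (a simplified illustration, not the actual SEPWorker code; names like SpinningWorkers are hypothetical). Note that each park allocates a boxed Long key plus a skip-list node, which is the per-iteration allocation overhead this ticket targets:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.LockSupport;

public class SpinningWorkers {
    // Wake-up deadline (nanos) -> spinning worker thread. The sorted map acts
    // as a concurrent priority queue keyed by wake-up time.
    static final ConcurrentSkipListMap<Long, Thread> spinning = new ConcurrentSkipListMap<>();

    // Called by an idle worker: register, park for the given period, deregister.
    // Each call allocates a boxed Long key and a skip-list node.
    static void doWaitSpin(long parkNanos) {
        long wakeAt = System.nanoTime() + parkNanos;
        while (spinning.putIfAbsent(wakeAt, Thread.currentThread()) != null)
            wakeAt++; // keys must be unique; nudge the deadline on collision
        LockSupport.parkNanos(parkNanos);
        spinning.remove(wakeAt);
    }

    // Called when scheduling a task: wake the spinner with the smallest
    // wake-up time, i.e. the one that would resume soonest anyway.
    static Thread wakeSoonest() {
        Map.Entry<Long, Thread> e = spinning.pollFirstEntry();
        if (e == null) return null;
        LockSupport.unpark(e.getValue());
        return e.getValue();
    }
}
```

A timing-wheel replacement would bucket deadlines into a fixed ring of slots, so registering a spinner becomes an array write with no per-park node allocation.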
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]