[
https://issues.apache.org/jira/browse/CASSANDRA-20298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925021#comment-17925021
]
Dmitry Konstantinov edited comment on CASSANDRA-20298 at 2/7/25 8:58 PM:
-------------------------------------------------------------------------
And the second part of the puzzle: why our mutation threads are NOT rate
limited by the memory allocator logic, which should block while waiting for
free space and protect us from OOM.
We insert just a few rows and flush our memtable. We use the slab allocator for
our memtables, so we allocate a region, consume a bit of it, and switch to a
new one when we switch to a new memtable during a flush. From the memory
allocator's accounting point of view we have consumed only a little, but in
reality a lot of memory is held by slabs. This is a kind of internal memory
fragmentation: in our pathological case the majority of memory is wasted in
unused recycling slabs.
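To make the accounting gap concrete, here is a minimal toy model in Java. It
is only a sketch and not Cassandra's allocator code: the class and method
names are hypothetical, the 1 MiB region size is an assumption, and it assumes
that regions of earlier memtables are still held while new memtables are
created.

```java
/** Toy model of slab accounting; hypothetical names, not Cassandra's classes. */
public class SlabAccountingSketch {
    // Assumed region (slab) size for illustration: 1 MiB.
    static final int REGION_SIZE = 1 << 20;

    /** Allocator of one memtable: reserves whole regions, accounts only bytes handed out. */
    static final class ToyMemtableAllocator {
        long accountedBytes = 0; // what a memory limiter would see
        long reservedBytes = 0;  // what the regions really occupy
        int freeInRegion = 0;    // bytes left in the current region

        void allocate(int size) {
            if (size <= 0 || size > REGION_SIZE)
                throw new IllegalArgumentException("size must be in (0, REGION_SIZE]");
            if (size > freeInRegion) {
                // Current region cannot serve the request: take a fresh one.
                reservedBytes += REGION_SIZE;
                freeInRegion = REGION_SIZE;
            }
            freeInRegion -= size;
            accountedBytes += size;
        }
    }

    /**
     * Writes a few small rows per memtable and then switches to a new memtable,
     * as a flush would. Returns {accounted, reserved} totals, assuming no region
     * has been released yet.
     */
    static long[] simulate(int memtables, int rowsPerMemtable, int rowSize) {
        long accounted = 0;
        long reserved = 0;
        for (int m = 0; m < memtables; m++) {
            ToyMemtableAllocator allocator = new ToyMemtableAllocator();
            for (int r = 0; r < rowsPerMemtable; r++)
                allocator.allocate(rowSize);
            accounted += allocator.accountedBytes;
            reserved += allocator.reservedBytes;
        }
        return new long[] { accounted, reserved };
    }

    public static void main(String[] args) {
        long[] totals = simulate(1000, 3, 100);
        System.out.println("accounted=" + totals[0] + " reserved=" + totals[1]);
    }
}
```

With 1000 memtables of three 100-byte rows each, the model accounts for
300,000 bytes while the regions occupy 1,048,576,000 bytes, so a limiter that
looks only at the accounted figure would never block the writers.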
In total, we have a combination of a livelock issue in the tracker view update
logic and slab memory fragmentation that is not tracked by the memtable memory
allocator. Both are very interesting and educational, and we can make the logic
safer, but I doubt we would hit them in real workloads often enough to justify
spending time on it (the second one can probably be observed when we have a lot
of tables, but that is a different story)...
> Test failure CommitLogCQLTest.testSwitchMemtable
> ------------------------------------------------
>
> Key: CASSANDRA-20298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20298
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Local/Commit Log
> Reporter: Brandon Williams
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.0.x
>
> Attachments:
> TEST-org.apache.cassandra.db.commitlog.CommitLogCQLTest.log,
> TEST-org.apache.cassandra.db.commitlog.CommitLogCQLTest.xml, steps.log,
> test_log_with_heap_histo_and_thread_dump.txt
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Seen here:
> https://app.circleci.com/pipelines/github/driftx/cassandra/1831/workflows/0de1611d-d409-4d15-8171-dcf7183a8c61/jobs/112290/tests
> {noformat}
> junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please
> note the time in the report does not reflect the time until the VM exit.
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.util.Vector.forEach(Vector.java:1365)
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.util.Vector.forEach(Vector.java:1365)
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> {noformat}