[ 
https://issues.apache.org/jira/browse/CASSANDRA-21390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089623#comment-18089623
 ] 

Dmitry Konstantinov commented on CASSANDRA-21390:
-------------------------------------------------

Interim summary:
 * Based on input from [~samueldlightfoot] (see also: [^findings_mask.md]), 
the issue reproduces on several clusters with the following configuration 
(this does not mean it reproduces only under these conditions):
version: 5.0.6
allocation type: offheap_objects
memtable type: SkipListMemtable
affected tables: cluster (Cassandra Reaper)
 * We have found 3 potential scenarios, but for different reasons none of them 
can be considered to have actually happened on the reported environments:
 ** suggested by [~smiklosovic]: an override of a clustering row with an 
out-of-order timestamp, where the updated row is incorrectly counted towards 
heap usage although it is not actually stored (similar to CASSANDRA-18125), MR: 
[https://github.com/apache/cassandra/pull/4886/changes]
 ** suggested by [~samueldlightfoot]: if an exception is thrown from 
tryUpdateData, the memtable update may be missed while usage is still 
incremented in the finally block (possible in theory, but we have not seen any 
such exceptions in the logs)
 ** suggested by [~dnk]: there is an asymmetry in BTree heap usage measurement. 
When we remove a complex column, we measure the heap size of the inner BTree 
cells using org.apache.cassandra.utils.btree.BTree#sizeOnHeapOf, which includes 
size maps. When we update a BTree by adding elements to it, we do not include 
size maps. Reproducing this requires more than 31 elements in a set and the use 
of +/- set operations, which are not present in the Reaper code
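
To make the third scenario concrete, here is a minimal toy model of how an asymmetric measurement could drive an allocator balance negative. It is only a sketch: the class, method names, and byte counts are all hypothetical and are not taken from the Cassandra code base. The only detail borrowed from the text above is the "more than 31 elements" threshold.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: none of these names or byte counts come from Cassandra.
class AsymmetricAccountingSketch {
    static final long CELL_BYTES = 48;        // hypothetical cost of one cell
    static final long SIZE_MAP_BYTES = 136;   // hypothetical cost of a size map
    static final int SIZE_MAP_THRESHOLD = 31; // "more than 31 elements", per the comment

    /** Minimal stand-in for an allocator ledger that tracks the bytes it owns. */
    static final class Ledger {
        private final AtomicLong owns = new AtomicLong();

        void adjust(long delta) { owns.addAndGet(delta); }

        long owns() { return owns.get(); }

        /** Releases everything owned; a negative balance is treated as a bug. */
        long releaseAll() {
            long released = owns.getAndSet(0);
            if (released < 0)
                throw new AssertionError("Negative released: " + released);
            return released;
        }
    }

    /** Measurement on the add path: cells only, size map ignored. */
    static long measuredOnAdd(int cells) {
        return cells * CELL_BYTES;
    }

    /** Measurement on the remove path: cells plus the size map, if one exists. */
    static long measuredOnRemove(int cells) {
        long sizeMap = cells > SIZE_MAP_THRESHOLD ? SIZE_MAP_BYTES : 0;
        return cells * CELL_BYTES + sizeMap;
    }

    /** Adds a collection of the given size, removes it again, returns the ledger. */
    static Ledger addThenRemove(int cells) {
        Ledger ledger = new Ledger();
        ledger.adjust(measuredOnAdd(cells));
        ledger.adjust(-measuredOnRemove(cells));
        return ledger;
    }

    public static void main(String[] args) {
        System.out.println("31 cells: owns=" + addThenRemove(31).owns());
        Ledger drifted = addThenRemove(40);
        System.out.println("40 cells: owns=" + drifted.owns());
        try {
            drifted.releaseAll();
        } catch (AssertionError e) {
            System.out.println("release failed: " + e.getMessage());
        }
    }
}
```

In this model a 31-element collection balances to zero, while a 40-element one leaves the ledger at -136 bytes, so the final release trips the same kind of "Negative released" check as in the reported stack trace.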

So, the mystery still remains :) 

> TrieMemtable MemtableReclaimMemory AssertionError: Negative released in 
> MemtablePool$SubPool
> --------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21390
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21390
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Memtable
>            Reporter: Praveen Reddy Arra
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>         Attachments: findings_mask.md, image-2026-05-21-09-17-49-716.png, 
> sstabledump
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> We have started seeing this fatal exception in Apache Cassandra 5.0.6 on one 
> of our clusters.
> {code:java}
> [ERROR] [MemtableReclaimMemory:1] cluster_id=xxx ip_address=xxx.xxx.xxx.xxx JVMStabilityInspector.java:70 - Exception in thread Thread[MemtableReclaimMemory:1,5,MemtableReclaimMemory]
> java.lang.AssertionError: Negative released: -4332
> at org.apache.cassandra.utils.memory.MemtablePool$SubPool.released(MemtablePool.java:194)
> at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.releaseAll(MemtableAllocator.java:153)
> at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.setDiscarded(MemtableAllocator.java:144)
> at org.apache.cassandra.utils.memory.MemtableAllocator.setDiscarded(MemtableAllocator.java:95)
> at org.apache.cassandra.utils.memory.NativeAllocator.setDiscarded(NativeAllocator.java:205)
> at org.apache.cassandra.db.memtable.AbstractAllocatorMemtable.discard(AbstractAllocatorMemtable.java:171)
> at org.apache.cassandra.db.memtable.TrieMemtable.discard(TrieMemtable.java:163)
> at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1398)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:26)
> at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:842)
> {code}
> {code:yaml}
> memtable_allocation_type: heap_buffers
> file_cache_enabled: true
> file_cache_size: 2048MiB
> memtable:
>   configurations:
>     skiplist:
>       class_name: SkipListMemtable
>     trie:
>       class_name: TrieMemtable
>     default:
>       inherits: trie
> {code}
> It looks similar to the open MemtableReclaimMemory assertion issue in 
> [CASSANDRA-18159|https://issues.apache.org/jira/browse/CASSANDRA-18159]
> Environment is RHEL 8.10 with OpenJDK 17 and 16GB heap, -ea enabled.


