Branimir Lambov created CASSANDRA-16318:
-------------------------------------------
Summary: Memtable heap size is severely underestimated
Key: CASSANDRA-16318
URL: https://issues.apache.org/jira/browse/CASSANDRA-16318
Project: Cassandra
Issue Type: Bug
Components: Local/Memtable
Reporter: Branimir Lambov
Attachments: image-2020-12-09-10-57-21-994.png,
image-2020-12-09-11-01-31-273.png
We seem to be estimating the size of the on-heap memtable metadata to be around
half of what it actually is. For example, during a [read benchmark which writes
1 million single-long
rows|https://github.com/blambov/cassandra/blob/memtable-heap/test/microbench/org/apache/cassandra/test/microbench/instance/ReadTestSmallPartitions.java]
the memtable reports
{code}
1000000 ops, 58.174MiB serialized bytes, 385.284MiB (19%) on heap, 0.000KiB
(0%) off-heap
{code}
while a heap dump taken at this point:
!image-2020-12-09-10-57-21-994.png!
lists a usage of about 666MB altogether.
Switching to {{offheap_objects}}, the reported numbers are
{code}
1000000 ops, 58.174MiB serialized bytes, 233.650MiB (12%) on heap, 53.406MiB
(3%) off-heap
{code}
while actual heap usage:
!image-2020-12-09-11-01-31-273.png!
is about 442MB.
Looking at the code, we are definitely not counting the
{{AtomicBTreePartition.Holder}}, {{EncodingStats}}, liveness and deletion info
objects associated with each partition, and most probably others as well.
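To put the gap in per-partition terms, the figures above can be turned into a reported fraction and an unaccounted byte count per partition. The following is only a rough sketch: the class and method names are hypothetical, and it assumes the heap dump figures (666MB, 442MB) are decimal megabytes while the memtable figures are MiB as printed.

```java
public class MemtableEstimateGap {
    static final double MIB = 1024.0 * 1024.0;
    // Assumption: the heap dump tool reports decimal megabytes.
    static final double MB = 1000.0 * 1000.0;

    /** Fraction of the actual on-heap usage that the memtable reports. */
    public static double reportedFraction(double reportedMiB, double actualMB) {
        return (reportedMiB * MIB) / (actualMB * MB);
    }

    /** On-heap bytes per partition that the estimate does not account for. */
    public static double unaccountedBytesPerPartition(double reportedMiB,
                                                      double actualMB,
                                                      long partitions) {
        return (actualMB * MB - reportedMiB * MIB) / partitions;
    }

    public static void main(String[] args) {
        long partitions = 1_000_000L; // one single-long row per partition

        // First run: 385.284MiB reported, about 666MB in the heap dump.
        System.out.printf("on-heap run: fraction=%.3f, gap=%.0f bytes/partition%n",
                          reportedFraction(385.284, 666.0),
                          unaccountedBytesPerPartition(385.284, 666.0, partitions));

        // offheap_objects run: 233.650MiB reported, about 442MB in the heap dump.
        System.out.printf("offheap_objects run: fraction=%.3f, gap=%.0f bytes/partition%n",
                          reportedFraction(233.650, 442.0),
                          unaccountedBytesPerPartition(233.650, 442.0, partitions));
    }
}
```

Under these unit assumptions the estimate covers roughly 0.61 and 0.55 of the actual usage, leaving about 262 and 197 bytes per partition unaccounted for. A fixed per-partition gap of that size would be consistent with several small uncounted objects per partition, though the exact breakdown would need to be checked against the heap dump.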
--
This message was sent by Atlassian Jira
(v8.3.4#803005)