Branimir Lambov created CASSANDRA-16318:
-------------------------------------------

             Summary: Memtable heap size is severely underestimated
                 Key: CASSANDRA-16318
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16318
             Project: Cassandra
          Issue Type: Bug
          Components: Local/Memtable
            Reporter: Branimir Lambov
         Attachments: image-2020-12-09-10-57-21-994.png, 
image-2020-12-09-11-01-31-273.png

We seem to be estimating the size of the on-heap memtable metadata at around 
half of what it actually is. For example, during a 
[read benchmark which writes 1 million single-long rows|https://github.com/blambov/cassandra/blob/memtable-heap/test/microbench/org/apache/cassandra/test/microbench/instance/ReadTestSmallPartitions.java] 
the memtable reports
{code}
1000000 ops, 58.174MiB serialized bytes, 385.284MiB (19%) on heap, 0.000KiB (0%) off-heap
{code}
while a heap dump taken at this point:
 !image-2020-12-09-10-57-21-994.png! 
lists a usage of about 666MB altogether.

Switching to {{offheap_objects}}, the reported numbers are
{code}
1000000 ops, 58.174MiB serialized bytes, 233.650MiB (12%) on heap, 53.406MiB (3%) off-heap
{code}
while actual heap usage:
 !image-2020-12-09-11-01-31-273.png! 
is about 442MB.
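
To put the two measurements side by side, here is a rough sketch of the arithmetic. It assumes the heap dump figures (666MB and 442MB) are in the same unit as the reported MiB values, which the screenshots do not confirm, and the class and method names are made up for illustration.

```java
public class MemtableEstimateCheck {
    static final double BYTES_PER_MIB = 1024.0 * 1024.0;

    /** Fraction of the measured heap usage that the memtable reported. */
    static double reportedFraction(double reportedMiB, double measuredMiB) {
        return reportedMiB / measuredMiB;
    }

    /** Heap bytes per write operation that the reported figure does not cover. */
    static double missingBytesPerOp(double reportedMiB, double measuredMiB, long ops) {
        return (measuredMiB - reportedMiB) * BYTES_PER_MIB / ops;
    }

    public static void main(String[] args) {
        long ops = 1_000_000L; // from "1000000 ops" in the reported lines
        String[] labels = { "on-heap memtable", "offheap_objects" };
        // {reported on-heap MiB, heap dump figure (ASSUMED to be MiB)}
        double[][] figures = { { 385.284, 666.0 }, { 233.650, 442.0 } };

        for (int i = 0; i < labels.length; i++) {
            double reported = figures[i][0];
            double measured = figures[i][1];
            System.out.printf("%s: reported %.1f%% of measured heap, about %.0f bytes per op unaccounted%n",
                              labels[i],
                              100.0 * reportedFraction(reported, measured),
                              missingBytesPerOp(reported, measured, ops));
        }
    }
}
```

If the heap dump figures are decimal megabytes instead, the fractions come out somewhat higher, but the shortfall is still in the hundreds of bytes per written row.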

Looking at the code, we are definitely not counting the 
{{AtomicBTreePartition.Holder}}, {{EncodingStats}}, liveness and deletion info 
objects associated with each partition, and most probably others as well.
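
As a purely illustrative sketch of why this matters for small partitions: if an estimate omits a fixed per-partition cost, the error grows with the partition count. The class, the method names and all byte sizes below are hypothetical placeholders, not values measured from Cassandra and not its actual accounting code.

```java
public class PartitionOverheadSketch {
    /** Estimate that leaves out any fixed per-partition objects. */
    static long withoutPartitionOverhead(long partitions, long rowsPerPartition, long bytesPerRow) {
        return partitions * rowsPerPartition * bytesPerRow;
    }

    /** Same estimate plus a fixed cost per partition (holder, stats, liveness/deletion info, ...). */
    static long withPartitionOverhead(long partitions, long rowsPerPartition,
                                      long bytesPerRow, long bytesPerPartitionOverhead) {
        return withoutPartitionOverhead(partitions, rowsPerPartition, bytesPerRow)
               + partitions * bytesPerPartitionOverhead;
    }

    public static void main(String[] args) {
        long partitions = 1_000_000L; // one row per partition, as in the benchmark
        long bytesPerRow = 100L;      // PLACEHOLDER value, not measured
        long overhead = 50L;          // PLACEHOLDER value, not measured

        long low = withoutPartitionOverhead(partitions, 1, bytesPerRow);
        long full = withPartitionOverhead(partitions, 1, bytesPerRow, overhead);
        System.out.printf("without overhead: %d bytes, with overhead: %d bytes%n", low, full);
    }
}
```

With single-row partitions every row pays the fixed cost once, so any uncounted per-partition object shows up directly in the gap between reported and actual usage.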


