Hi,

While working on integrating the new version of Jamm into Cassandra, I realized that the way we create slabs and measure their memory consumption may not be optimal.
For the sake of simplicity I will only talk about read-write heap buffers here; the same principles can be applied to the other types of buffers.

Looking at *org.apache.cassandra.utils.memory.SlabAllocator.Region::allocate*, what we do is:

    data.duplicate().position(newOffset).limit(newOffset + size)

This results in a ByteBuffer with the same capacity as the original one, and with a position and a limit that can still be moved outside the new limit that we defined.

From a measurement perspective, if we want to avoid counting the shared array structure for each buffer, we need to be able to determine whether a given buffer should be considered a slab or not. The current approach is to consider as a slab anything where the underlying array is larger than the number of remaining bytes. This approach seems fragile to me.

If we were using the slice method to create our slabs:

    data.duplicate().position(newOffset).limit(newOffset + size).slice()

the slab ByteBuffer would have a capacity that represents the real size of the slab, and it would prevent us from moving the position or limit to an incorrect value. It would also allow us to reliably identify a slab buffer, as its capacity will always be smaller than its underlying array.

For DirectBuffers, using slice after a duplicate is not a good idea before Java 12 due to a Java bug (https://bugs.openjdk.org/browse/JDK-8208362), which would result in 64 extra bytes being used per direct slab buffer.

Nevertheless, I was wondering if there are other reasons for not using slice when allocating slabs, and whether we should consider using it for heap buffers for the moment, and for all buffers once we support only Java 17 and 21.
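To make the difference concrete, here is a small self-contained sketch (not the actual SlabAllocator code; the region and allocation sizes are made up, and it assumes Java 11+ for the covariant position/limit overloads) showing what each approach yields for a heap buffer:

    import java.nio.ByteBuffer;

    public class SlabSliceSketch
    {
        public static void main(String[] args)
        {
            // A 1 MiB region backed by a single shared heap array.
            ByteBuffer data = ByteBuffer.allocate(1 << 20);
            int newOffset = 128;
            int size = 64;

            // Current approach: duplicate + position/limit.
            ByteBuffer current = data.duplicate().position(newOffset).limit(newOffset + size);
            System.out.println(current.capacity());      // 1048576, same as the region
            System.out.println(current.remaining());     // 64
            // Nothing prevents the limit (or position) from being moved back out of
            // the allocated range:
            current.limit(current.capacity());

            // Slice-based approach: capacity is bounded to the allocation size.
            ByteBuffer sliced = data.duplicate().position(newOffset).limit(newOffset + size).slice();
            System.out.println(sliced.capacity());       // 64, the real size of the slab
            System.out.println(sliced.arrayOffset());    // 128, offset into the shared array
            System.out.println(sliced.array().length);   // 1048576, backing array is still shared

            // Identification becomes a simple, stable check: a slab is a buffer whose
            // capacity is smaller than its backing array, regardless of how the
            // position and limit have been moved since allocation.
            System.out.println(sliced.capacity() < sliced.array().length);   // true
        }
    }

The last check is essentially the slab test that slice would make reliable: capacity strictly smaller than the backing array, independent of where the position and limit currently sit.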