[ 
https://issues.apache.org/jira/browse/CASSANDRA-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Konstantinov updated CASSANDRA-20280:
--------------------------------------------
    Fix Version/s: 5.x

> More compact native memory layout for NativeCell
> ------------------------------------------------
>
>                 Key: CASSANDRA-20280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20280
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Memtable
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: image-2025-02-01-15-10-15-176.png, 
> image-2025-02-01-15-21-22-653.png
>
>
> I am capturing an idea here and will return to it after finishing the 
> current in-progress tickets.
> The current NativeCell has the following native memory layout:
> !image-2025-02-01-15-21-22-653.png|width=400!
> So, when we store an integer value (4 bytes), the total size is 25 bytes.
> For an ASCII string of 10 characters, it is 31 bytes.
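
The two figures above imply the same fixed per-cell overhead: 25 - 4 = 21 and 31 - 10 = 21 bytes. A minimal sketch of that arithmetic follows. The class and method names are hypothetical, and the 21-byte constant is inferred only from the numbers in the ticket, not read from the NativeCell source.

```java
// Sketch only: names are hypothetical; the overhead is inferred from the
// ticket's two examples (25 - 4 and 31 - 10), for a cell without a path.
final class CurrentCellSizeSketch {
    // Fixed bytes per cell implied by the ticket's figures.
    static final int FIXED_OVERHEAD_BYTES = 21;

    // Total native size of a cell holding valueLength bytes of data.
    static int sizeOf(int valueLength) {
        return FIXED_OVERHEAD_BYTES + valueLength;
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(4));   // 4-byte integer value: prints 25
        System.out.println(sizeOf(10));  // 10-character ASCII string: prints 31
    }
}
```

For a 4-byte integer, the overhead is more than five times the size of the value itself, which is what makes a denser layout attractive.
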
> If we make it more compact, more data can be stored in memtables, 
> potentially with better cache locality.
> The idea is to use the first byte to store more flags that distinguish the 
> typical use cases:
>  # A usual cell has no TTL.
>  # A live cell does not need to store localDeletionInfo.
>  # The value is frequently small; if it is not more than 128 bytes, we 
> can use 1 byte to store the length (a varint is an alternative, but it makes 
> it harder to calculate the offset of the data after it).
> So, we can introduce flags in the first byte such as:
>  * has path
>  * has TTL
>  * has delete info
>  * one byte length
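
The flag list above could be sketched as bit masks in a single byte. Everything here is an assumption made for illustration: the bit positions, the class and method names, and the field widths (8-byte timestamp, 4-byte TTL, 4-byte deletion info, 1-byte or 4-byte length). Only the four flag meanings and the 128-byte threshold come from the ticket.

```java
// Sketch only: bit positions, names, and field widths are hypothetical.
final class CompactCellFlagsSketch {
    static final int HAS_PATH        = 1;       // cell has a cell path
    static final int HAS_TTL         = 1 << 1;  // TTL field is present
    static final int HAS_DELETE_INFO = 1 << 2;  // deletion info field is present
    static final int ONE_BYTE_LENGTH = 1 << 3;  // value length fits in one byte

    // Threshold taken from the ticket ("not more than 128 bytes").
    static final int ONE_BYTE_LENGTH_LIMIT = 128;

    // Builds the flags byte for a cell with the given properties.
    static int flagsFor(boolean hasPath, boolean hasTtl, boolean hasDeleteInfo, int valueLength) {
        int flags = 0;
        if (hasPath) flags |= HAS_PATH;
        if (hasTtl) flags |= HAS_TTL;
        if (hasDeleteInfo) flags |= HAS_DELETE_INFO;
        if (valueLength <= ONE_BYTE_LENGTH_LIMIT) flags |= ONE_BYTE_LENGTH;
        return flags;
    }

    // Size of a cell without a path under the assumed field widths.
    static int sizeWithoutPath(boolean hasTtl, boolean hasDeleteInfo, int valueLength) {
        int size = 1 + 8;                                     // flags + timestamp
        if (hasTtl) size += 4;                                // optional TTL
        if (hasDeleteInfo) size += 4;                         // optional deletion info
        size += valueLength <= ONE_BYTE_LENGTH_LIMIT ? 1 : 4; // length field
        return size + valueLength;
    }
}
```

Under these assumed widths, a live 4-byte integer cell without TTL would take 14 bytes and a 10-character ASCII string 20 bytes.
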
> When we read a component of the cell, we read the flags and use them to 
> calculate the offset of that component.
> For example: 
> !image-2025-02-01-15-10-15-176.png|width=400!
> These changes are local and encapsulated in the NativeCell logic.
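
Reading a component by first decoding the flags might look like the sketch below. It uses a heap `ByteBuffer` in place of native memory, omits the cell path, and assumes the layout `[flags:1][timestamp:8][ttl:4, optional][deletion info:4, optional][length:1 or 4][value]`. The layout, bit positions, and all names are hypothetical, not the actual NativeCell code.

```java
import java.nio.ByteBuffer;

// Sketch only: layout, bit positions, and names are assumptions.
final class CompactCellReaderSketch {
    static final int HAS_TTL         = 1 << 1;
    static final int HAS_DELETE_INFO = 1 << 2;
    static final int ONE_BYTE_LENGTH = 1 << 3;
    static final int ONE_BYTE_LENGTH_LIMIT = 128; // threshold from the ticket

    // Offset of the length field, derived from the optional fields before it.
    static int lengthOffset(int flags) {
        int offset = 1 + 8; // flags byte + timestamp
        if ((flags & HAS_TTL) != 0) offset += 4;
        if ((flags & HAS_DELETE_INFO) != 0) offset += 4;
        return offset;
    }

    // Offset of the value, which follows a 1-byte or 4-byte length field.
    static int valueOffset(int flags) {
        return lengthOffset(flags) + ((flags & ONE_BYTE_LENGTH) != 0 ? 1 : 4);
    }

    static int valueLength(ByteBuffer cell) {
        int flags = cell.get(0) & 0xFF;
        int offset = lengthOffset(flags);
        // The one-byte length is read as unsigned so that 128 round-trips.
        return (flags & ONE_BYTE_LENGTH) != 0 ? cell.get(offset) & 0xFF : cell.getInt(offset);
    }

    static byte[] value(ByteBuffer cell) {
        int flags = cell.get(0) & 0xFF;
        byte[] out = new byte[valueLength(cell)];
        ByteBuffer view = cell.duplicate(); // leave the caller's position untouched
        view.position(valueOffset(flags));
        view.get(out);
        return out;
    }

    // Writes a live cell without TTL or path, for demonstration only.
    static ByteBuffer writeLiveCell(long timestamp, byte[] value) {
        int flags = value.length <= ONE_BYTE_LENGTH_LIMIT ? ONE_BYTE_LENGTH : 0;
        ByteBuffer cell = ByteBuffer.allocate(valueOffset(flags) + value.length);
        cell.put((byte) flags).putLong(timestamp);
        if ((flags & ONE_BYTE_LENGTH) != 0) cell.put((byte) value.length);
        else cell.putInt(value.length);
        cell.put(value);
        cell.flip();
        return cell;
    }
}
```

Because every optional field has a fixed width, each offset is a handful of bit tests and additions, which is the advantage over a varint length that the ticket mentions.
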
> Additional but more complicated options to reduce the size:
>  * Use delta encoding for the timestamp (similar to what we have in 
> SSTables). This is more complicated logic: it requires storing a base 
> timestamp somewhere (at the partition or memtable level) and storing the 
> delta (as a short or int) in the cell if the delta is small enough. A 
> timestamp can technically be set by a client to a strange value, so we still 
> have to support the option of a full long.
>  * If timestamp and LocalDelInfo have the same value (taking into account the 
> micro to milliseconds conversion), we can use another flag bit to mark it, 
> store only the timestamp, and calculate LocalDelInfo from it.
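
The delta-encoding option could be sketched as choosing the narrowest width that holds the difference from a base timestamp, with a fallback to the full long for unusual client-supplied values. Where the base is stored and how the chosen width is recorded in the cell are left open here, and all names are hypothetical.

```java
// Sketch only: names are hypothetical; the storage location of the base
// timestamp and the width marker are not decided in the ticket.
final class TimestampDeltaSketch {
    static final int SHORT_DELTA = 2; // delta stored as a short
    static final int INT_DELTA   = 4; // delta stored as an int
    static final int FULL_LONG   = 8; // full timestamp stored, no delta

    // Picks the number of bytes needed to store the timestamp.
    static int encodedWidth(long baseTimestamp, long timestamp) {
        long delta;
        try {
            delta = Math.subtractExact(timestamp, baseTimestamp);
        } catch (ArithmeticException overflow) {
            return FULL_LONG; // the difference does not even fit in a long
        }
        if (delta >= Short.MIN_VALUE && delta <= Short.MAX_VALUE) return SHORT_DELTA;
        if (delta >= Integer.MIN_VALUE && delta <= Integer.MAX_VALUE) return INT_DELTA;
        return FULL_LONG;
    }

    // Value to store in the cell: the delta, or the full timestamp as a fallback.
    static long encode(long baseTimestamp, long timestamp) {
        return encodedWidth(baseTimestamp, timestamp) == FULL_LONG
               ? timestamp
               : timestamp - baseTimestamp;
    }

    // Reverses encode() given the width that was chosen when writing.
    static long decode(long baseTimestamp, long stored, int width) {
        return width == FULL_LONG ? stored : baseTimestamp + stored;
    }
}
```

The full-long fallback keeps the scheme correct for arbitrary client timestamps, at the cost of one more case to branch on when reading.
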



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

