[ 
https://issues.apache.org/jira/browse/CASSANDRA-20280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Konstantinov updated CASSANDRA-20280:
--------------------------------------------
    Status: Open  (was: Triage Needed)

> More compact native memory layout for NativeCell
> ------------------------------------------------
>
>                 Key: CASSANDRA-20280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20280
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Memtable
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: image-2025-02-01-15-10-15-176.png, 
> image-2025-02-01-15-21-22-653.png
>
>
> I am capturing an idea here and will return to it soon after finishing 
> the current in-progress tickets.
> The current NativeCell has the following native memory layout:
> !image-2025-02-01-15-21-22-653.png|width=400!
> So, when we store an integer value (4 bytes), the total size is 25 bytes.
> For an ASCII string of 10 characters it is 31 bytes.
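The two totals above imply a fixed per-cell overhead of 21 bytes. A minimal sketch of that arithmetic follows. The field widths are assumptions, not read from the attached image; they are chosen because they add up to the quoted 25 and 31 bytes. The class and constant names are hypothetical.

```java
// Size arithmetic for the current layout, under assumed field widths.
final class CurrentNativeCellSize {
    // Assumed widths in bytes (hypothetical names, not Cassandra's constants).
    static final int FLAGS = 1;          // "has path" marker
    static final int TIMESTAMP = 8;      // long
    static final int TTL = 4;            // int
    static final int LOCAL_DELETION = 4; // int
    static final int VALUE_LENGTH = 4;   // int

    // Fixed overhead paid by every cell, regardless of the value size.
    static final int HEADER = FLAGS + TIMESTAMP + TTL + LOCAL_DELETION + VALUE_LENGTH;

    // Total native size of a cell without a cell path.
    static int cellSize(int valueBytes) {
        return HEADER + valueBytes;
    }

    public static void main(String[] args) {
        System.out.println(cellSize(4));  // integer value
        System.out.println(cellSize(10)); // 10-character ASCII string
    }
}
```

Under these assumptions a 4-byte integer carries more than five times its own size in metadata, which is the motivation for the proposal that follows.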
> If we make the layout more compact, more data can be stored in memtables, 
> potentially with better CPU cache usage on lookup.
> The idea is to use the first byte to store more flags to differentiate 
> typical use cases:
>  # A usual cell without TTL
>  # Alive: we do not need to store localDeletionInfo
>  # The value is frequently small: if it is no longer than 127 bytes (255 
> when the byte is read as unsigned), we can use 1 byte to store the length (a 
> varint is an alternative, but it makes the offset of the data after it 
> harder to calculate)
> So, we can introduce flags in the first byte such as:
>  * has path (the existing one)
>  * has TTL
>  * has delete info
>  * one byte length
> When we read a component of the cell, we first read the flags and then use 
> them to calculate the offset of that component.
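The flag-driven offset calculation described above could be sketched as follows. This is only an illustration under stated assumptions: the bit positions, the field order (flags, timestamp, optional TTL, optional delete info, length, value) and all names are hypothetical, and the cell path is left out for brevity.

```java
import java.nio.ByteBuffer;

// Sketch of a flag-driven layout; bit values and field order are assumptions.
final class CompactCellLayout {
    static final int HAS_PATH        = 1;      // the existing flag
    static final int HAS_TTL         = 1 << 1; // hypothetical bit
    static final int HAS_DELETE_INFO = 1 << 2; // hypothetical bit
    static final int ONE_BYTE_LENGTH = 1 << 3; // hypothetical bit

    static final int TIMESTAMP_OFFSET = 1;     // right after the flags byte

    // Each offset skips only the optional fields that the flags say are present.
    static int deleteInfoOffset(int flags) {
        return TIMESTAMP_OFFSET + 8 + ((flags & HAS_TTL) != 0 ? 4 : 0);
    }

    static int lengthOffset(int flags) {
        return deleteInfoOffset(flags) + ((flags & HAS_DELETE_INFO) != 0 ? 4 : 0);
    }

    static int valueOffset(int flags) {
        return lengthOffset(flags) + ((flags & ONE_BYTE_LENGTH) != 0 ? 1 : 4);
    }

    static int cellSize(int flags, int valueBytes) {
        return valueOffset(flags) + valueBytes;
    }

    // Reads the value length, stored as one unsigned byte or as a 4-byte int.
    static int valueLength(ByteBuffer cell) {
        int flags = cell.get(0) & 0xFF;
        int at = lengthOffset(flags);
        return (flags & ONE_BYTE_LENGTH) != 0
               ? Byte.toUnsignedInt(cell.get(at))
               : cell.getInt(at);
    }
}
```

With these assumed widths, an alive cell without TTL that holds a 4-byte value and uses the one-byte length takes `cellSize(ONE_BYTE_LENGTH, 4)` = 14 bytes, while a cell with both TTL and delete info and a 4-byte length still takes 25. These numbers follow only from the sketch; the author's own figures are in the attached image.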
> With this approach we can reduce the native memory overhead in typical 
> cases to the following values:
> !image-2025-02-01-15-10-15-176.png|width=400!
> These changes are local and encapsulated in the NativeCell logic. The 
> downside of the approach is that extra calculations are needed to look up a 
> component of a NativeCell (read the flags, and if the component is present, 
> calculate its offset from them and read the component at that offset).
> Additional but more complicated options:
>  * Use delta encoding for the timestamp (similar to what we have in 
> SSTables). The logic is more complicated: a base timestamp has to be stored 
> somewhere (at the partition or memtable level), and the cell stores a delta 
> (as a short or int) when the delta is small enough. A client can technically 
> set the timestamp to any value, so the option with a full long still has to 
> be supported.
>  * If timestamp and LocalDelInfo have the same value (taking into account 
> the micro- to millisecond conversion), we can use another flag bit to mark 
> this, store only the timestamp, and calculate LocalDelInfo from it.
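
The delta-encoding option from the list above could look roughly like the sketch below. It assumes a base timestamp kept outside the cell and an int-sized delta, with a fallback to the full long when the delta does not fit. The class and method names are hypothetical and are not part of Cassandra.

```java
// Sketch of delta-encoded timestamps with a full-long fallback.
final class TimestampDelta {
    // True when (timestamp - base) can be stored as a signed 32-bit int.
    static boolean fitsInInt(long base, long timestamp) {
        try {
            long delta = Math.subtractExact(timestamp, base);
            return delta >= Integer.MIN_VALUE && delta <= Integer.MAX_VALUE;
        } catch (ArithmeticException overflow) {
            // Clients may set arbitrary timestamps, so the difference itself can overflow.
            return false;
        }
    }

    // Call only when fitsInInt(base, timestamp) is true.
    static int encode(long base, long timestamp) {
        return (int) (timestamp - base);
    }

    static long decode(long base, int delta) {
        return base + delta;
    }
}
```

A writer would check `fitsInInt` first and set a flag bit that tells the reader whether an int delta or a full long follows. With microsecond timestamps, a signed int delta covers roughly 35.8 minutes on either side of the base, so the choice of base matters.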



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
