[
https://issues.apache.org/jira/browse/IGNITE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Ozerov reassigned IGNITE-6499:
---------------------------------------
Assignee: (was: Vladimir Ozerov)
> Compact NULL fields binary representation
> -----------------------------------------
>
> Key: IGNITE-6499
> URL: https://issues.apache.org/jira/browse/IGNITE-6499
> Project: Ignite
> Issue Type: Improvement
> Components: binary
> Affects Versions: 2.1
> Reporter: Alexandr Kuramshin
>
> Current compact footer implementation writes offset for the every field in
> schema. Depending on serialized size of an object offset may be 1, 2 or 4
> bytes.
> Imagine an object with some 100 fields are null. It takes from 100 to 400
> bytes overhead. For middle-sized objects (about 260 bytes) it doubles the
> memory usage. For a small-sized objects (about 40 bytes) the memory usage
> increased by factor 3 or 4.
> Proposed two optimizations, the both should be implemented, the most optimal
> implementation should be selected dynamically upon object marshalling.
> 1. Write field index and offset for the only non-null fields in footer.
> Should be used for objects with a few non-null fields.
> 2. Write footer header then field offsets for the only non-null fields as
> follows
> [0] bit mask for first 8 fields, 0 - field is null, 1 - field is non-null
> [1] cumulative sum of "1" bits
> [2] bit mask for the next 8 fields
> [3] cumulative sum of "1" bits
> ... and so on
> [N1...N2] offset of first non-null field
> [N3...N4] offset of next non-null field
> ... and so on
> If we want to read fields from 0 to 7, then we read first footer byte, step
> through bits and find the offset index for non-null field or find that field
> is null.
> If we want to read fields from 8, then we read two footer bytes, take start
> offset index from the first byte, and then step through bits and find the
> offset index for non-null field or find that field is null.
> This supports up to 255 non-null fields per binary object.
> Overhead would be only 24 bytes per 100 null fields instead of 200 bytes for
> the middle-sized object.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)