Alexandr Kuramshin created IGNITE-6499:
------------------------------------------

             Summary: Compact NULL fields binary representation
                 Key: IGNITE-6499
                 URL: https://issues.apache.org/jira/browse/IGNITE-6499
             Project: Ignite
          Issue Type: Improvement
          Components: binary
    Affects Versions: 2.1
            Reporter: Alexandr Kuramshin
            Assignee: Vladimir Ozerov


Current compact footer implementation writes offset for the every field in 
schema. Depending on serialized size of an object offset may be 1, 2 or 4 bytes.

Imagine an object with some 100 fields are null. It takes from 100 to 400 bytes 
overhead. For middle-sized objects (about 260 bytes) it doubles the memory 
usage. For a small-sized objects (about 40 bytes) the memory usage increased by 
factor 3 or 4.

Proposed two optimizations, the both should be implemented, the most optimal 
implementation should be selected dynamically upon object marshalling.

1. Write field ID and offset for the only non-null fields in footer.

2. Write footer header then field offsets for the only non-null fields as 
follows

[0] bit mask for first 8 fields, 0 - field is null, 1 - field is non-null
[1] cumulative sum of "1" bits
[2] bit mask for the next 8 fields
[3] cumulative sum of "1" bits
... and so on
[N1...N2] offset of first non-null field
[N3...N4] offset of next non-null field
... and so on

If we want to read fields from 0 to 7, then we read first footer byte, step 
through bits and find the offset index for non-null field or find that field is 
null.

If we want to read fields from 8, then we read two footer bytes, take start 
offset from the first byte, and then step through bits and find the offset 
index for non-null field or find that field is null.

This supports up to 255 non-null fields per binary object.

Overhead would be only 24 bytes per 100 null fields instead of 200 bytes for 
the middle-sized object.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to