Thanks, guanshi, for starting this discussion.

I saw your suggestion in the document regarding omitting certain
fields. The question is whether we should introduce a compact format.

Indeed, a column may hold many distinct values, and in that case a
compact format makes sense.

Consider:

-- head
version:                        1 byte
row count:                      4 bytes int
non-null value bitmap number:   4 bytes int
has null value:                 1 byte
null value offset:              4 bytes if has null value
value x:                        var bytes for any data type (as bitmap identifier)
offset:                         4 bytes int

-- body
serialized bitmap1
serialized bitmap2
serialized bitmap3
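
To make the layout above concrete, here is a rough Python sketch of
a writer. It is only an illustration: the big-endian byte order, the
length-prefixed UTF-8 encoding of each value, the placement of the
null bitmap first in the body, and offsets counted from the start of
the body are all my assumptions, and write_index is a hypothetical
name. The bitmaps are treated as opaque, already-serialized bytes.

```python
import struct


def write_index(row_count, bitmaps, null_bitmap=None, version=1):
    """Build (head, body) for the proposed layout.

    bitmaps: dict mapping a value (str) to its already-serialized bitmap bytes.
    null_bitmap: serialized bitmap for null rows, or None if there are none.
    Assumptions (not from the proposal): big-endian ints, values stored as
    4-byte length + UTF-8 bytes, offsets relative to the start of the body.
    """
    body = bytearray()
    entries = []
    null_offset = None

    # Assumed ordering: the null bitmap, if any, comes first in the body.
    if null_bitmap is not None:
        null_offset = len(body)
        body += null_bitmap

    # Record where each value's bitmap starts, then append it to the body.
    for value, blob in bitmaps.items():
        entries.append((value, len(body)))
        body += blob

    head = bytearray()
    head += struct.pack(">B", version)                 # version: 1 byte
    head += struct.pack(">i", row_count)               # row count: 4 bytes
    head += struct.pack(">i", len(bitmaps))            # non-null bitmap number
    head += struct.pack(">B", 1 if null_bitmap is not None else 0)
    if null_offset is not None:
        head += struct.pack(">i", null_offset)         # only if has null value

    for value, offset in entries:
        encoded = value.encode("utf-8")
        head += struct.pack(">i", len(encoded)) + encoded  # value x: var bytes
        head += struct.pack(">i", offset)                  # offset: 4 bytes

    return bytes(head), bytes(body)
```

A reader would parse the head once and then seek to the recorded
offset to deserialize only the bitmap it needs.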

Optimization:

The offset can be negative. A negative offset indicates that there
is only one value, and its position is the inverse of the negative
offset.
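
As a small sketch of how the negative offset could be encoded and
decoded: I am assuming here that "inverse" means -1 - position, so
that position 0 maps to -1 and stays distinguishable from a regular
offset of 0. Plain negation would not allow that. The function names
are hypothetical.

```python
def encode_single(position: int) -> int:
    """Encode the position of a lone value as a negative offset.

    Assumption: the mapping is -1 - position, so position 0 becomes -1.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    return -1 - position


def decode_offset(offset: int):
    """Interpret an offset field read from the head.

    Returns ("single", position) for a negative offset, otherwise
    ("bitmap", offset) pointing at a serialized bitmap in the body.
    """
    if offset < 0:
        return ("single", -1 - offset)
    return ("bitmap", offset)
```

With this mapping, decode_offset(encode_single(p)) gives back
("single", p) for any non-negative p, and no bitmap has to be
serialized for such a value.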

Best,
Jingsong

On Thu, Jun 27, 2024 at 3:50 PM guanshi <1649067...@qq.com.invalid> wrote:
>
> Hello, this is the bitmap index format I designed, and I hope to discuss it 
> with everyone:
> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
