Yes, DataOutputStream writes in BIG_ENDIAN.
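
For anyone who wants to check this locally, here is a minimal sketch. The class name BigEndianCheck and the helper writeInt are made up for illustration and are not part of the proposed index format. It writes one int through DataOutputStream and prints the raw bytes, which come out most significant byte first.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper class, only for illustrating the byte order.
public class BigEndianCheck {

    // Serializes a single int with DataOutputStream and returns the raw bytes.
    static byte[] writeInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] written = writeInt(0x01020304);

        // The high-order byte is written first.
        System.out.println(Arrays.toString(written)); // [1, 2, 3, 4]

        // Reading the bytes back as big-endian recovers the original value.
        int readBack = ByteBuffer.wrap(written).order(ByteOrder.BIG_ENDIAN).getInt();
        System.out.println(readBack == 0x01020304); // true
    }
}
```

So a reader in another language only needs to read every multi-byte field as big-endian.
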
> On Jul 1, 2024, at 10:04 AM, Jingsong Li <jingsongl...@gmail.com> wrote:
>
> Hi guanshi,
>
> All types are BIG_ENDIAN, right?
>
> Best,
> Jingsong
>
> On Fri, Jun 28, 2024 at 3:14 PM guanshi <1649067...@qq.com.invalid> wrote:
>>
>> Okay, I have updated the document:
>> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
>>
>>> On Jun 28, 2024, at 2:05 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
>>>
>>> Can you update the google doc?
>>>
>>> On Fri, Jun 28, 2024 at 2:02 PM guanshi <1649067...@qq.com.invalid> wrote:
>>>>
>>>> I have adopted your optimization suggestion. If there is only one value,
>>>> the offset will be negative and its position is the inverse of the negative
>>>> value. In this case the RoaringBitmap does not need to be serialized.
>>>>
>>>>> On Jun 28, 2024, at 12:50 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>>
>>>>> If there is only one value, what is the size of the RoaringBitmap?
>>>>>
>>>>> On Fri, Jun 28, 2024 at 11:25 AM guanshi <1649067...@qq.com.invalid> wrote:
>>>>>
>>>>>> Sorry, I forgot about the case where there is only one null value.
>>>>>> Your design does not need to be changed.
>>>>>>
>>>>>>> On Jun 28, 2024, at 10:53 AM, guanshi <1649067...@qq.com> wrote:
>>>>>>>
>>>>>>> The null value offset is not necessary because the null value bitmap can be
>>>>>>> placed at the beginning of the body.
>>>>>>>
>>>>>>>> On Jun 27, 2024, at 6:46 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Thanks guanshi for starting this discussion.
>>>>>>>>
>>>>>>>> I saw your suggestion in the document regarding omitting certain
>>>>>>>> fields. The question is whether we should introduce a compact format.
>>>>>>>>
>>>>>>>> Indeed, there may be situations where there are many values, and
>>>>>>>> introducing a compact format makes sense.
>>>>>>>>
>>>>>>>> Consider:
>>>>>>>>
>>>>>>>> -- head
>>>>>>>> version: 1 byte
>>>>>>>> row count: 4 bytes int
>>>>>>>> non-null value bitmap number: 4 bytes int
>>>>>>>> has null value: 1 byte
>>>>>>>> null value offset: 4 bytes if has null value
>>>>>>>> value x: var bytes for any data type (as bitmap identifier)
>>>>>>>> offset: 4 bytes int
>>>>>>>>
>>>>>>>> -- body
>>>>>>>> serialized bitmap1
>>>>>>>> serialized bitmap2
>>>>>>>> serialized bitmap3
>>>>>>>>
>>>>>>>> Optimization:
>>>>>>>>
>>>>>>>> The offset can be a negative number. A negative offset means
>>>>>>>> that there is only one value, and its position is the
>>>>>>>> inverse of the negative value.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jingsong
>>>>>>>>
>>>>>>>> On Thu, Jun 27, 2024 at 3:50 PM guanshi <1649067...@qq.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>> Hello, this is the bitmap index format I designed, and I hope to
>>>>>>>>> discuss it with everyone:
>>>>>>>>>
>>>>>>>>> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>
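
To make the negative-offset optimization quoted above concrete, here is a rough sketch. The thread only says the position is "the inverse of the negative value"; the mapping -1 - position used below is my assumption, chosen so that position 0 is still representable as a negative number. The class and method names (OffsetCodec, encodeSinglePosition, and so on) are hypothetical and not taken from the design document.

```java
// Hypothetical codec for the 4-byte "offset" field in the head section.
public class OffsetCodec {

    // Assumption: a single row position p is stored as -1 - p,
    // so position 0 becomes -1 and never collides with a real body offset.
    static int encodeSinglePosition(int position) {
        if (position < 0) {
            throw new IllegalArgumentException("position must be >= 0");
        }
        return -1 - position;
    }

    // A negative offset means the value occurs in exactly one row,
    // so no serialized bitmap exists in the body for it.
    static boolean isSinglePosition(int offset) {
        return offset < 0;
    }

    // Recovers the row position from a negative offset.
    static int decodeSinglePosition(int offset) {
        if (offset >= 0) {
            throw new IllegalArgumentException("offset does not encode a single position");
        }
        return -1 - offset;
    }

    public static void main(String[] args) {
        int encoded = encodeSinglePosition(0);
        System.out.println(encoded);                         // -1
        System.out.println(isSinglePosition(encoded));       // true
        System.out.println(decodeSinglePosition(encoded));   // 0
        System.out.println(decodeSinglePosition(encodeSinglePosition(41))); // 41
    }
}
```

With a plain negation (-position) instead, position 0 would be indistinguishable from a body offset of 0, which is why the sketch shifts by one.
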