Yes, DataOutputStream writes in BIG_ENDIAN.
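
A quick way to check this is the minimal sketch below. The class name BigEndianDemo and its helper writeInt are made up for illustration and are not part of any proposal in this thread; it only relies on java.io.DataOutputStream and java.nio.ByteBuffer from the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the index format discussed here.
class BigEndianDemo {

    // Serializes one int with DataOutputStream and returns the raw bytes.
    static byte[] writeInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] b = writeInt(0x01020304);

        // The most significant byte is written first.
        for (byte x : b) {
            System.out.printf("%02x ", x);
        }
        System.out.println();

        // ByteBuffer defaults to BIG_ENDIAN, so it reads the same value back.
        System.out.println(ByteBuffer.wrap(b).getInt() == 0x01020304);

        // Reading the same bytes as LITTLE_ENDIAN reverses the byte order.
        int little = ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).getInt();
        System.out.println(little == 0x04030201);
    }
}
```

A reader in another language would therefore need to decode every multi-byte field as big-endian.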

> On Jul 1, 2024, at 10:04 AM, Jingsong Li <jingsongl...@gmail.com> wrote:
> 
> Hi guanshi,
> 
> All types are BIG_ENDIAN, right?
> 
> Best,
> Jingsong
> 
> On Fri, Jun 28, 2024 at 3:14 PM guanshi <1649067...@qq.com.invalid> wrote:
>> 
>> Okay, I have updated the document:
>> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
>> 
>>> On Jun 28, 2024, at 2:05 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
>>> 
>>> Can you update the google doc?
>>> 
>>> On Fri, Jun 28, 2024 at 2:02 PM guanshi <1649067...@qq.com.invalid> wrote:
>>>> 
>>>> I have adopted your optimization suggestion: if there is only one value,
>>>> the offset is negative and its position is the inverse of that negative
>>>> value. In this case no RoaringBitmap needs to be serialized.
>>>> 
>>>>> On Jun 28, 2024, at 12:50 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>> 
>>>>> If there is only one value, what is the size of the RoaringBitmap?
>>>>> 
>>>>> On Fri, Jun 28, 2024 at 11:25 AM guanshi <1649067...@qq.com.invalid> wrote:
>>>>> 
>>>>>> Sorry, I forgot about the situation where there is only one null value.
>>>>>> Your design does not need to be changed.
>>>>>> 
>>>>>>> On Jun 28, 2024, at 10:53 AM, guanshi <1649067...@qq.com> wrote:
>>>>>>> 
>>>>>>> The null value offset is not necessary because the null value bitmap
>>>>>>> can be placed at the beginning of the body.
>>>>>>> 
>>>>>>>> On Jun 27, 2024, at 6:46 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Thanks guanshi for starting this discussion.
>>>>>>>> 
>>>>>>>> I saw your suggestion in the document regarding omitting certain
>>>>>>>> fields. The question is whether we should introduce a compact format.
>>>>>>>> 
>>>>>>>> Indeed, there may be situations where there are many values, and
>>>>>>>> introducing a compact format makes sense.
>>>>>>>> 
>>>>>>>> Consider:
>>>>>>>> 
>>>>>>>> -- head
>>>>>>>> version:                       1 byte
>>>>>>>> row count:                     4 bytes int
>>>>>>>> non-null value bitmap number:  4 bytes int
>>>>>>>> has null value:                1 byte
>>>>>>>> null value offset:             4 bytes if has null value
>>>>>>>> value x:                       var bytes for any data type (as bitmap identifier)
>>>>>>>> offset:                        4 bytes int
>>>>>>>> 
>>>>>>>> -- body
>>>>>>>> serialized bitmap1
>>>>>>>> serialized bitmap2
>>>>>>>> serialized bitmap3
>>>>>>>> 
>>>>>>>> Optimization:
>>>>>>>> 
>>>>>>>> Offset can be a negative number, and when it is negative, it
>>>>>>>> represents that there is only one value, and its position is the
>>>>>>>> inverse of the negative value.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Jingsong
>>>>>>>> 
>>>>>>>> On Thu, Jun 27, 2024 at 3:50 PM guanshi <1649067...@qq.com.invalid> wrote:
>>>>>>>>> 
>>>>>>>>> Hello, this is the bitmap index format I designed, and I hope to
>>>>>>>>> discuss it with everyone:
>>>>>>>>> 
>>>>>>>>> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
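
For reference, the negative-offset optimization quoted above could be sketched roughly as follows. This is only an illustration under one assumption: "inverse of the negative value" is read here as -1 - position, so that row position 0 still maps to a negative offset. The thread does not spell out the exact formula, and the class and method names (OffsetSketch, encodeSinglePosition, decodeSinglePosition, isSinglePosition) are hypothetical.

```java
// Hypothetical sketch of the negative-offset encoding, not the agreed format.
class OffsetSketch {

    // Assumption: a single row position p is stored as -1 - p,
    // so position 0 becomes -1 and stays distinguishable from a real offset.
    static int encodeSinglePosition(int position) {
        if (position < 0) {
            throw new IllegalArgumentException("position must be >= 0");
        }
        return -1 - position;
    }

    // A negative offset means no serialized bitmap exists in the body.
    static boolean isSinglePosition(int offset) {
        return offset < 0;
    }

    // Recovers the row position from a negative offset (same assumption).
    static int decodeSinglePosition(int offset) {
        if (offset >= 0) {
            throw new IllegalArgumentException("offset points into the body");
        }
        return -1 - offset;
    }

    public static void main(String[] args) {
        int offset = encodeSinglePosition(41);
        System.out.println(isSinglePosition(offset));
        System.out.println(decodeSinglePosition(offset));
    }
}
```

With this reading, a non-negative offset is an ordinary byte offset into the body, and the whole int range of row positions can be represented without a separate flag.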
