Hi guanshi,

All types are BIG_ENDIAN, right?

Best,
Jingsong

On Fri, Jun 28, 2024 at 3:14 PM guanshi <1649067...@qq.com.invalid> wrote:
>
> Okay, I have updated the document:
> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
>
> > On Jun 28, 2024, at 2:05 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
> >
> > Can you update the google doc?
> >
> > On Fri, Jun 28, 2024 at 2:02 PM guanshi <1649067...@qq.com.invalid> wrote:
> >>
> >> I have adopted your optimization suggestion: if there is only one value,
> >> the offset will be negative, and its position is the inverse of the
> >> negative value. This case does not require serializing a roaringBitmap.
> >>
> >>> On Jun 28, 2024, at 12:50 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
> >>>
> >>> If there is one value, what is the size of the roaringbitmap?
> >>>
> >>> On Fri, Jun 28, 2024 at 11:25 AM guanshi <1649067...@qq.com.invalid> wrote:
> >>>
> >>>> Sorry, I forgot about the situation where there is only one null value;
> >>>> your design does not need to be changed.
> >>>>
> >>>>> On Jun 28, 2024, at 10:53 AM, guanshi <1649067...@qq.com> wrote:
> >>>>>
> >>>>> The null value offset is not necessary because the null value bitmap
> >>>>> can be placed at the beginning of the body.
> >>>>>
> >>>>>> On Jun 27, 2024, at 6:46 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
> >>>>>>
> >>>>>> Thanks guanshi for starting this discussion.
> >>>>>>
> >>>>>> I saw your suggestion in the document regarding omitting certain
> >>>>>> fields. The question is whether we should introduce a compact format.
> >>>>>>
> >>>>>> Indeed, there may be situations where there are many values, and
> >>>>>> introducing a compact format makes sense.
> >>>>>>
> >>>>>> Consider:
> >>>>>>
> >>>>>> -- head
> >>>>>> version: 1 byte
> >>>>>> row count: 4 bytes int
> >>>>>> non-null value bitmap number: 4 bytes int
> >>>>>> has null value: 1 byte
> >>>>>> null value offset: 4 bytes if has null value
> >>>>>> value x: var bytes for any data type (as bitmap identifier)
> >>>>>> offset: 4 bytes int
> >>>>>>
> >>>>>> -- body
> >>>>>> serialized bitmap1
> >>>>>> serialized bitmap2
> >>>>>> serialized bitmap3
> >>>>>>
> >>>>>> Optimization:
> >>>>>>
> >>>>>> Offset can be a negative number, and when it is negative, it
> >>>>>> represents that there is only one value, and its position is the
> >>>>>> inverse of the negative value.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong
> >>>>>>
> >>>>>> On Thu, Jun 27, 2024 at 3:50 PM guanshi <1649067...@qq.com.invalid> wrote:
> >>>>>>>
> >>>>>>> Hello, this is the bitmap index format I designed, and I hope to
> >>>>>>> discuss it with everyone:
> >>>>>>>
> >>>>>>> https://docs.google.com/document/d/1zKp_kqfoYgfmvfZ3DcNVMIXoYdxAsbe73eu98KEe3a8/edit?usp=sharing
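
For reference, here is a rough Python sketch of the head layout quoted above, written with big-endian packing as the question at the top of this mail assumes. Several things are assumptions of the sketch and are not settled in the thread: values are treated as 4-byte ints (the proposal allows variable-length values of any type), a single row position p is stored as the negative offset -1 - p so that row 0 stays representable (the thread only says "the inverse of the negative value"), and the function names are made up for illustration.

```python
import struct


def encode_single(position: int) -> int:
    """Assumed encoding: a lone row position p is stored as -1 - p."""
    return -1 - position


def decode_offset(offset: int):
    """Negative offset: one row position. Otherwise: offset into the body."""
    if offset < 0:
        return ("single", -1 - offset)
    return ("bitmap", offset)


def write_head(row_count, entries, null_offset=None, version=1):
    """Pack the head. entries is a list of (int value, int offset) pairs."""
    out = bytearray()
    out += struct.pack(">b", version)                  # version: 1 byte
    out += struct.pack(">i", row_count)                # row count: 4 bytes int
    out += struct.pack(">i", len(entries))             # non-null value bitmap number
    out += struct.pack(">?", null_offset is not None)  # has null value: 1 byte
    if null_offset is not None:
        out += struct.pack(">i", null_offset)          # null value offset
    for value, offset in entries:
        out += struct.pack(">ii", value, offset)       # value x, then its offset
    return bytes(out)


def read_head(buf):
    """Unpack the head written by write_head."""
    version, row_count, count, has_null = struct.unpack_from(">bii?", buf, 0)
    pos = 10  # 1 + 4 + 4 + 1 bytes read so far
    null_offset = None
    if has_null:
        (null_offset,) = struct.unpack_from(">i", buf, pos)
        pos += 4
    entries = []
    for _ in range(count):
        value, offset = struct.unpack_from(">ii", buf, pos)
        pos += 8
        entries.append((value, decode_offset(offset)))
    return version, row_count, null_offset, entries
```

With this sketch, a value that appears in exactly one row costs only its head entry and needs no serialized bitmap in the body, which is the point of the optimization.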