Hi Zhonghang,

I'm glad to receive your feedback.

> I see that you have adopted another approach in the design of the
> dictionary. I would like to do some specific tests on this part later. If
> the performance is generally better, this dictionary can also be applied to
> the bitmap index.

Yes, the design of the dictionary was inspired by the bitmap index V2, and in theory it can also be used in the bitmap index. I look forward to seeing the dictionary used to improve the performance of the bitmap index.

Also, please feel free to ask me any questions about it. I'm happy to help!

Best,
Tan Jialiang.

zhonghang <1649067...@qq.com.invalid> wrote on Sat, Jul 12, 2025 at 14:44:

> Hi JiaLiang,
>
> Thanks for your contribution. This feature looks very useful: it allows us
> to apply bsi indexes to all comparable types.
>
> I don't have any better suggestions for improvement at the moment. I see
> that you have adopted another approach in the design of the dictionary.
> I would like to do some specific tests on this part later. If the
> performance is generally better, this dictionary can also be applied to
> the bitmap index.
>
> Best,
> Zhonghang
>
> zhonghang
> 1649067...@qq.com
>
> ------------------ Original Message ------------------
> From: "dev" <tanjialiang1...@gmail.com>;
> Sent: Friday, July 11, 2025, 4:42 PM
> To: "dev" <dev@paimon.apache.org>;
> Subject: [DISCUSS] PIP-XXX: Introduce the range-bitmap file index
>
> Hi devs,
>
> I would like to start a discussion about PIP-XXX: Introduce the
> range-bitmap file index [1].
>
> Currently, we support the bitmap and bsi indexes, each of which has its own
> advantages and disadvantages.
>
> In the bitmap index:
> 1. The bitmap v2 index performs very well in EQ predicate evaluation,
> but it supports only this type of evaluation.
> 2. In high-cardinality scenarios, a relatively large number of bitmaps is
> required, which may result in a large index file.
>
> In the bsi index:
> 1. The bsi index supports the EQ and Range predicates, but only for numeric
> data types.
>
> To resolve the shortcomings of both the bitmap index and the bsi index, I
> would like to propose a new type of index: range-bitmap.
>
> It combines all the advantages of both the bitmap and bsi indexes.
> It supports the evaluation of EQ and Range predicates, as well as index
> building for all basic data types, particularly STRING, DOUBLE and FLOAT.
> Compared to bitmap v2 indexes, it reduces the number of bitmaps by a
> log₂ factor.
>
> In addition, the range-bitmap index performs better than the bsi index in
> all cases of evaluation, so I propose that we mark the bsi index as
> deprecated.
>
> See the implementation [2].
>
> Looking forward to your feedback, thanks!
>
> [1] https://docs.google.com/document/d/14YXPtCUmvjwozdLhgWJdPgHrVYOTdv9uiC1N2GlNvG4/edit?usp=sharing
> [2] https://github.com/Tan-JiaLiang/paimon/tree/feature/rangebitmapV2
>
> Best,
> Tan JiaLiang.
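
For readers following the thread, the idea of putting a dictionary in front of bit-sliced bitmaps can be illustrated with a small sketch. This is not the Paimon implementation linked in [2]; it only assumes that a sorted dictionary maps each distinct value to an integer code and that the codes are then stored as bit slices, so roughly log₂(cardinality) bitmaps are needed instead of one per distinct value. All function names here (build_index, eq_rows, lte_rows, rows) are hypothetical, and plain Python integers stand in for compressed bitmaps.

```python
from bisect import bisect_left, bisect_right


def build_index(values):
    """Dictionary-encode the values, then bit-slice the resulting codes."""
    dictionary = sorted(set(values))                  # sorted distinct values
    code_of = {v: i for i, v in enumerate(dictionary)}
    # ceil(log2(cardinality)) slices, with a minimum of one slice.
    n_slices = max(1, (len(dictionary) - 1).bit_length())
    slices = [0] * n_slices                           # slices[b]: rows whose code has bit b set
    for row, v in enumerate(values):
        code = code_of[v]
        for b in range(n_slices):
            if (code >> b) & 1:
                slices[b] |= 1 << row
    return dictionary, slices, len(values)


def _compare(index, code):
    """Return (lt, eq) row bitmaps: rows whose code is below / equal to `code`."""
    _, slices, n_rows = index
    eq = (1 << n_rows) - 1                            # every row starts as a candidate
    lt = 0
    for b in reversed(range(len(slices))):            # most significant bit first
        if (code >> b) & 1:
            lt |= eq & ~slices[b]                     # a 0 where `code` has a 1 means smaller
            eq &= slices[b]
        else:
            eq &= ~slices[b]
    return lt, eq


def eq_rows(index, value):
    """Rows where the column equals `value` (EQ predicate)."""
    dictionary = index[0]
    i = bisect_left(dictionary, value)
    if i == len(dictionary) or dictionary[i] != value:
        return 0                                      # value not in the dictionary
    return _compare(index, i)[1]


def lte_rows(index, value):
    """Rows where the column is <= `value` (a Range predicate)."""
    dictionary = index[0]
    code = bisect_right(dictionary, value) - 1        # largest code whose value is <= value
    if code < 0:
        return 0
    lt, eq = _compare(index, code)
    return lt | eq


def rows(bitmap):
    """Expand a row bitmap into a list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


index = build_index(["pear", "apple", "fig", "apple", "kiwi", "pear"])
print(len(index[1]))                   # 2 slices for 4 distinct values
print(rows(eq_rows(index, "apple")))   # [1, 3]
print(rows(lte_rows(index, "kiwi")))   # [1, 2, 3, 4]
```

Because the dictionary is sorted, comparing codes is equivalent to comparing the original values, which is why a sketch like this works for strings as well as numbers. A one-bitmap-per-value index would need 4 bitmaps for this column, while the sliced form needs 2.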