Thanks, JiaLiang, for starting this discussion. I have copied the proposal to PIP-33 [1].

[1] https://cwiki.apache.org/confluence/display/PAIMON/PIP-33%3A+Introduce+the+range-bitmap+file+index

On Fri, Jul 11, 2025 at 4:43 PM jialiang tan <tanjialiang1...@gmail.com> wrote:
>
> Hi devs,
>
> I would like to start a discussion about PIP-XXX: Introduce the
> range-bitmap file index [1].
>
> Currently, we support the bitmap and bsi indexes, each of which has its own
> advantages and disadvantages.
>
> In the bitmap index:
> 1. The bitmap v2 index performs very well in EQ predicate evaluation,
> but that is the only type of evaluation it supports.
> 2. In high-cardinality scenarios, a relatively large number of bitmaps is
> required, which may result in a large index file.
>
> In the bsi index:
> 1. The bsi index supports the EQ and Range predicates, but only for numeric
> data types.
>
> To resolve the shortcomings of both the bitmap index and the bsi index, I
> would like to propose a new type of index: range-bitmap.
>
> It combines the advantages of both the bitmap and bsi indexes. It supports
> evaluation of the EQ and Range predicates, as well as index building for all
> basic data types, particularly STRING, DOUBLE and FLOAT. Compared to bitmap
> v2 indexes, it reduces the number of bitmaps by a log₂ factor.
>
> In addition, since the range-bitmap index performs better than the bsi index
> in all cases of evaluation, I propose that we mark the bsi index as
> deprecated.
>
> See the implementation [2].
>
> Looking forward to your feedback, thanks!
>
> [1]
> https://docs.google.com/document/d/14YXPtCUmvjwozdLhgWJdPgHrVYOTdv9uiC1N2GlNvG4/edit?usp=sharing
> [2] https://github.com/Tan-JiaLiang/paimon/tree/feature/rangebitmapV2
>
> Best,
> Tan JiaLiang.
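
For readers following the thread, here is a minimal sketch of the general idea behind the "log₂ factor" mentioned in the quoted proposal. It is not the PIP-33 implementation and does not use any Paimon API. It assumes (my assumption, not stated in the thread) that values are mapped to order-preserving dictionary codes and that the codes are stored as bit slices, so that N distinct values need about log₂(N) bitmaps instead of N. Plain Python integers stand in for compressed bitmaps, and all function names are hypothetical.

```python
from bisect import bisect_right


def build_index(values):
    """Return (sorted dictionary, bit slices, row count) for a column."""
    dictionary = sorted(set(values))          # order-preserving codes
    code_of = {v: c for c, v in enumerate(dictionary)}
    # Number of bit slices grows with log2 of the cardinality.
    num_bits = max(1, (len(dictionary) - 1).bit_length())
    slices = [0] * num_bits
    for row, value in enumerate(values):
        code = code_of[value]
        for i in range(num_bits):
            if (code >> i) & 1:
                slices[i] |= 1 << row         # row has bit i set in its code
    return dictionary, slices, len(values)


def rows_less_or_equal(index, value):
    """Bitmap (as an int) of rows whose value is <= the given value."""
    dictionary, slices, num_rows = index
    all_rows = (1 << num_rows) - 1
    bound = bisect_right(dictionary, value) - 1   # largest code <= value
    if bound < 0:
        return 0
    lt, eq = 0, all_rows
    # Scan from the most significant slice to the least significant one.
    for i in reversed(range(len(slices))):
        if (bound >> i) & 1:
            lt |= eq & ~slices[i] & all_rows      # equal so far, now smaller
            eq &= slices[i]
        else:
            eq &= ~slices[i] & all_rows           # rows with a 1 here are larger
    return lt | eq


def to_rows(bitmap, num_rows):
    """Expand a bitmap into a list of row positions."""
    return [r for r in range(num_rows) if (bitmap >> r) & 1]


values = ["pear", "apple", "fig", "apple", "kiwi"]
index = build_index(values)
matched = to_rows(rows_less_or_equal(index, "fig"), len(values))
```

In this toy column there are four distinct strings but only two bit slices, and the range predicate `value <= "fig"` is answered with a handful of bitwise operations. An EQ predicate can be evaluated the same way by keeping only the `eq` bitmap.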