Hi Jialiang,

The benchmarks look exciting to me! I think it is time to create a PR for this.

Best,
Jingsong

On Mon, Jul 14, 2025 at 2:08 PM jialiang tan <tanjialiang1...@gmail.com> wrote:
>
> Hi Zhonghang,
> I'm glad to receive your feedback.
>
> > I see that you have adopted another approach in the design of the
> > dictionary. I would like to do some specific tests on this part later. If
> > the performance is generally better, this dictionary can also be applied to
> > the bitmap index.
>
> Yes, the design of the dictionary was inspired by the bitmap index V2, and
> in theory it can be used in the bitmap index as well.
>
> I look forward to seeing the dictionary used to improve the performance of
> the bitmap index.
>
> Also, please feel free to ask me any questions about it. I'm happy to help!
>
> Best,
> Tan Jialiang.
>
> zhonghang <1649067...@qq.com.invalid> wrote on Sat, Jul 12, 2025 at 14:44:
>
> > Hi JiaLiang,
> >
> > Thanks for your contribution. This feature looks very useful: it allows us
> > to apply bsi indexes to all comparable types.
> >
> > I don't have any better suggestions for improvement at the moment. I see
> > that you have adopted another approach in the design of the dictionary.
> > I would like to do some specific tests on this part later. If the
> > performance is generally better, this dictionary can also be applied to
> > the bitmap index.
> >
> > Best,
> > Zhonghang
> >
> > zhonghang
> > 1649067...@qq.com
> >
> > ------------------ Original Message ------------------
> > From: "dev" <tanjialiang1...@gmail.com>;
> > Sent: Friday, July 11, 2025, 4:42 PM
> > To: "dev" <dev@paimon.apache.org>;
> > Subject: [DISCUSS] PIP-XXX: Introduce the range-bitmap file index
> >
> > Hi devs,
> >
> > I would like to start a discussion about PIP-XXX: Introduce the
> > range-bitmap file index [1].
> >
> > Currently, we support the bitmap and bsi indexes, each of which has its own
> > advantages and disadvantages.
> >
> > In the bitmap index:
> > 1. The bitmap v2 index performs very well in EQ predicate evaluation,
> > but it supports only this type of evaluation.
> > 2. In high-cardinality scenarios, a relatively large number of bitmaps
> > will be required, which may result in a large index file.
> >
> > In the bsi index:
> > 1. The bsi index supports the EQ and Range predicates, but only for numeric
> > data types.
> >
> > To resolve the shortcomings of both the bitmap index and the bsi index, I
> > would like to propose a new type of index: range-bitmap.
> >
> > It combines the advantages of both the bitmap and bsi indexes. It supports
> > EQ and Range predicate evaluation, as well as index building for all
> > basic data types, particularly STRING, DOUBLE and FLOAT. Compared to bitmap
> > v2 indexes, it reduces the number of bitmaps by a log₂ factor.
> >
> > In addition, the range-bitmap index performs better than the bsi index in
> > all cases of evaluation, so I propose that we mark the bsi index as
> > deprecated.
> >
> > See the implementation [2].
> >
> > Looking forward to your feedback, thanks!
> >
> > [1] https://docs.google.com/document/d/14YXPtCUmvjwozdLhgWJdPgHrVYOTdv9uiC1N2GlNvG4/edit?usp=sharing
> > [2] https://github.com/Tan-JiaLiang/paimon/tree/feature/rangebitmapV2
> >
> > Best,
> > Tan JiaLiang.
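
For readers who want a feel for the idea in the proposal (EQ and Range predicates on any comparable type, with roughly log2 as many bitmaps as distinct values), here is a minimal Python sketch. It is not the implementation in [2] and does not reflect Paimon's file format. The class name RangeBitmapSketch and its methods are hypothetical, plain Python ints stand in for compressed bitmaps, and the sorted dictionary plus bit-sliced layout is only one assumed way to get this behaviour.

```python
from bisect import bisect_left, bisect_right


class RangeBitmapSketch:
    """Toy dictionary + bit-sliced index (hypothetical, for illustration only)."""

    def __init__(self, values):
        self.n_rows = len(values)
        self.all_rows = (1 << self.n_rows) - 1
        # Sorted dictionary: a value's code is its rank, so code order == value order.
        self.dictionary = sorted(set(values))
        n_bits = max(1, (len(self.dictionary) - 1).bit_length())
        # slices[i] is a bitmap (a Python int) of the rows whose code has bit i set.
        self.slices = [0] * n_bits
        for row, value in enumerate(values):
            code = bisect_left(self.dictionary, value)
            for i in range(n_bits):
                if (code >> i) & 1:
                    self.slices[i] |= 1 << row

    def eq(self, value):
        """Bitmap of rows where column == value."""
        code = bisect_left(self.dictionary, value)
        if code == len(self.dictionary) or self.dictionary[code] != value:
            return 0  # value is not in the dictionary, so no row matches
        result = self.all_rows
        for i, bitmap in enumerate(self.slices):
            result &= bitmap if (code >> i) & 1 else ~bitmap
        return result

    def le(self, value):
        """Bitmap of rows where column <= value."""
        code = bisect_right(self.dictionary, value) - 1
        if code < 0:
            return 0  # value is below every dictionary entry
        lt, eq = 0, self.all_rows
        # Walk the code from the most significant bit to the least significant bit.
        for i in reversed(range(len(self.slices))):
            if (code >> i) & 1:
                lt |= eq & ~self.slices[i]  # rows that drop below the bound at bit i
                eq &= self.slices[i]
            else:
                eq &= ~self.slices[i]
        return lt | eq

    def rows(self, bitmap):
        """Expand a bitmap into a sorted list of row ids."""
        return [r for r in range(self.n_rows) if (bitmap >> r) & 1]


index = RangeBitmapSketch(
    ["pear", "apple", "fig", "apple", "kiwi", "pear", "date", "fig"]
)
print(len(index.dictionary), "distinct values,", len(index.slices), "bitmaps")
print(index.rows(index.eq("fig")))
print(index.rows(index.le("fig")))
```

In this sketch the five distinct strings need three bitmaps rather than five, and the same index answers both EQ and Range predicates because the dictionary codes preserve the value order.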