Greetings,

This is a discussion on the [email protected] mailing list concerning the addition of bloom filters to commons. Please take a look and comment there.
Thx,
Claude

On Thu, Sep 12, 2019 at 3:44 PM Xiangdong Huang <[email protected]> wrote:

> +1 for bloom filter!
> +1 for implementation (but there seems to be no license file in the repo...)
>
> By the way, there seem to be some new variants of the bloom filter,
> e.g., ones supporting range queries.
> I am not sure whether we need such variants, e.g., to check whether
> a time series set "root.a.b.*.speed" exists.
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
>
> On Thu, Sep 12, 2019 at 12:29 AM, Claude Warren <[email protected]> wrote:
>
>> In my reading of the short message it seems like it would make sense to
>> use a bloom filter to determine if the "gear" is in the file. I have the
>> library that I am proposing to move to commons. It can be found at
>>
>> https://github.com/Claudenw/BloomFilter/tree/MultiFilter/src/main/java/org/xenei/bloomfilter
>>
>> Claude
>>
>> On Tue, Sep 10, 2019 at 3:45 PM Julian Feinauer <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > I like the idea. I'm just adding Claude here as we talked yesterday
>> > about a bloom filter implementation he has already done.
>> >
>> > @[email protected] <[email protected]> what do you think? : )
>> >
>> > Julian
>> > ------------------------------
>> > *From:* Tian Jiang <[email protected]>
>> > *Sent:* Tuesday, September 10, 2019 5:14:33 AM
>> > *To:* [email protected] <[email protected]>
>> > *Subject:* Add bloom filters to TsFile
>> >
>> > Greetings,
>> >
>> > My recent reading reminds me that the bloom filter is standard
>> > equipment in K-V DBs. Although IoTDB is not one of them (at least not
>> > typically), the bloom filter still helps a lot in various situations.
>> > For example, our recent experiments gave us the illusion that the set
>> > of time series in a storage group remains unchanged. However, that is
>> > not the case.
>> >
>> > Naturally, in real situations, the number of time series grows over
>> > time, for reasons like adding new gears. The old files do not contain
>> > such time series. Without the help of bloom filters, we have to check
>> > each old file only to find that there is no such time series. To my
>> > knowledge, this may take a lot of time.
>> >
>> > So, I suggest we add a bloom filter (or something more efficient) to
>> > each TsFile to help skip unwanted files.
>> >
>> > Tian Jiang
>> > [email protected]
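
To make the proposal in the quoted thread concrete, here is a minimal Java sketch of the idea: keep one small bloom filter per file, and consult it before opening the file. It is only an illustration under stated assumptions. The class names (SeriesBloomFilter, FileSkipper), the FNV-1a hashing, and the sizing parameters are all hypothetical; none of this is the API of IoTDB, TsFile, or the library linked above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

/** Hypothetical per-file bloom filter over time series paths. */
class SeriesBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SeriesBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Seeded FNV-1a over the UTF-8 bytes (an assumption; any decent hash works).
    private static int hash(String s, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    // Double hashing: the i-th bit index is derived from two base hashes.
    private int index(String path, int i) {
        int h1 = hash(path, 0);
        int h2 = hash(path, 0x9E3779B9);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Record that this file contains the given series path. */
    void add(String path) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(path, i));
        }
    }

    /** false: definitely absent. true: possibly present (false positives can occur). */
    boolean mightContain(String path) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(path, i))) {
                return false;
            }
        }
        return true;
    }
}

/** Hypothetical helper that picks the files worth opening for a query. */
class FileSkipper {
    static List<String> candidateFiles(Map<String, SeriesBloomFilter> filtersByFile,
                                       String path) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, SeriesBloomFilter> e : filtersByFile.entrySet()) {
            // Skip a file only when its filter rules the path out entirely.
            if (e.getValue().mightContain(path)) {
                candidates.add(e.getKey());
            }
        }
        return candidates;
    }
}
```

A file whose filter returns false can be skipped safely, because a bloom filter never gives false negatives. A file whose filter returns true still has to be checked, since the answer may be a false positive. Note that exact-match filters like this one do not answer wildcard queries such as "root.a.b.*.speed"; that is the case the range-query variants mentioned in the thread are aimed at.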
