+1 for the bloom filter! +1 for the implementation (though there seems to be no license file in the repo...)
By the way, there seem to be some new variants of the bloom filter, e.g., ones that support range queries. I am not sure whether we need such variants, e.g., to check whether any time series matching "root.a.b.*.speed" exists.

Best,
-----------------------------------
Xiangdong Huang (黄向东)
School of Software, Tsinghua University

On Thu, Sep 12, 2019 at 12:29 AM, Claude Warren <[email protected]> wrote:

> In my reading of the short message it seems like it would make sense to use
> a bloom filter to determine if the "gear" is in the file. I have the
> library that I am proposing to move to commons. It can be found at
>
> https://github.com/Claudenw/BloomFilter/tree/MultiFilter/src/main/java/org/xenei/bloomfilter
>
> Claude
>
> On Tue, Sep 10, 2019 at 3:45 PM Julian Feinauer <[email protected]> wrote:
>
> > Hi,
> >
> > I like the idea. I'm just adding Claude here as we talked yesterday about
> > a bloom filter implementation he has already done.
> >
> > @[email protected] <[email protected]> what do you think? : )
> >
> > Julian
> > ------------------------------
> > *From:* Tian Jiang <[email protected]>
> > *Sent:* Tuesday, September 10, 2019 5:14:33 AM
> > *To:* [email protected] <[email protected]>
> > *Subject:* Add bloom filters to TsFile
> >
> > Greetings,
> >
> > The recent readings remind me that the bloom filter is standard equipment
> > in K-VDBs. Although IoTDB is not one of them (at least not typically), the
> > bloom filter still helps a lot in various situations. For example, our
> > recent experiments gave us an illusion that the time series in a storage
> > group remains unchanged. However, that is not the case.
> >
> > Naturally, in real situations, the number of time series grows over time,
> > due to reasons like adding new gears. The old files do not contain such a
> > time series. Without the help of bloom filters, we have to check each old
> > file only to find that there is no such time series. To my knowledge, this
> > may take a lot of time.
> >
> > So, I suggest we add a bloom filter (or some more efficient one) to each
> > TsFile to help skip unwanted files.
> >
> > Tian Jiang
> > [email protected]
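
To make the per-file idea above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `BloomFilter` class, the `files_to_scan` helper, the file names, and the series paths are all hypothetical, and none of this is the API of IoTDB, TsFile, or the library linked above. Each file gets one filter built from the series paths it contains, and a query consults the filters first so that files which definitely lack the series are never opened.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero, odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def files_to_scan(filters_by_file, series_path):
    """Return only the files whose filter cannot rule out series_path."""
    return [name for name, bf in filters_by_file.items()
            if bf.might_contain(series_path)]


# Hypothetical data: two old files, plus one written after a new device appeared.
old_series = ["root.a.b.d1.speed", "root.a.b.d2.speed"]
new_series = old_series + ["root.a.b.d3.speed"]

filters = {}
for name, series in [("old-1.tsfile", old_series),
                     ("old-2.tsfile", old_series),
                     ("new-1.tsfile", new_series)]:
    bf = BloomFilter()
    for path in series:
        bf.add(path)
    filters[name] = bf

# Only files that may contain the new series need to be opened.
candidates = files_to_scan(filters, "root.a.b.d3.speed")
print(candidates)
```

Note that a plain filter like this answers exact-path lookups only. A wildcard such as "root.a.b.*.speed" cannot be checked this way, which is where the variants mentioned above might come in.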
