Re: [IOTDB-726] CheckPoint of MTree

Jialin Qiao Tue, 16 Jun 2020 23:16:44 -0700

Hi,

Using RocksDB for tag management may be fine. However, if we move the MTree into 
RocksDB, the write performance should be measured first, because we need to 
check the MTree for each data point that is inserted.


The serialization is implemented mainly by the class ReadWriteIOUtils.

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Julian Feinauer" <[email protected]>
> Sent: 2020-06-15 19:26:05 (Monday)
> To: "[email protected]" <[email protected]>
> Cc: 
> Subject: Re: [IOTDB-726] CheckPoint of MTree
> 
> Hey,
> 
> I just wonder if it makes sense to rely on something like RocksDB, e.g. for 
> inverted indexes and fast lookups.
> What do you currently use to serialize / deserialize the MTree? 
> There are different methods with big performance differences, I remember.
> 
> Julian
> 
> On 15.06.20, 13:14, "孙泽嵩" <[email protected]> wrote:
> 
>     Hi Julian,
> 
>     Currently I’m just using a plain text file.
> 
>     But I could consider trying RocksDB : )
>     I also noticed that there is an issue related to RocksDB integration [1].
> 
> 
>     [1] https://issues.apache.org/jira/browse/IOTDB-767
> 
> 
>     Best,
>     -----------------------------------
>     Zesong Sun
>     School of Software, Tsinghua University
> 
>     孙泽嵩
>     清华大学 软件学院
> 
>     > On June 15, 2020, at 19:00, Julian Feinauer <[email protected]> wrote:
>     > 
>     > Hi Zesong,
>     > 
>     > this is an excellent idea!
>     > Do you serialize the snapshot as a plain text file?
>     > Or would it make sense to use something like RocksDB for that?
>     > 
>     > Julian
>     > 
>     > On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:
>     > 
>     >    Greetings,
>     > 
>     >    I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1]
>     > 
>     >    When there are a large number of timeseries, restarting IoTDB takes a
>     >    long time, because it reads mlog.txt and executes the commands line by line.
>     >    For example, it takes about 2 minutes to restart with 20M timeseries.
>     > 
>     >    To solve this problem, a “checkpoint” is designed and added to the MTree
>     >    to reduce the time spent reading mlog when IoTDB restarts:
>     >    A snapshot, which contains the serialized MTree, is generated every time
>     >    mlog reaches a certain number of lines.
>     >    When a new snapshot is generated, the old one is deleted. The snapshot
>     >    file and mlog.txt are in the same directory.
>     > 
>     >    Users can configure the threshold number of mlog lines. By default, a
>     >    snapshot is generated every 100k lines.
>     > 
>     >    I’ve already built a demo and shown that the method speeds up the
>     >    restart process.
>     >    For the part that reads mlog.txt and initializes the MTree, it reduces
>     >    the time by 28.3% (16.6s with the original method, 11.9s with the new
>     >    demo, both for 2M timeseries).
>     > 
>     >    I would like to open a PR afterwards. If you have any suggestions about
>     >    the design, feel free to discuss them with me.
>     > 
>     > 
>     >    [1] https://issues.apache.org/jira/browse/IOTDB-726
>     > 
>     > 
>     >    Best,
>     >    -----------------------------------
>     >    Zesong Sun
>     >    School of Software, Tsinghua University
>     > 
>     >    孙泽嵩
>     >    清华大学 软件学院
>     > 
>     > 
> 
>
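
As a rough illustration of the checkpoint scheme proposed at the start of this thread, here is a minimal Python sketch. It is not IoTDB code. The ToyMTree and MetaStore classes, the "<op>,<path>" log format, the JSON snapshot format and the mtree.snapshot file name are all assumptions made up for the example. The thread also does not say how recovery decides which mlog lines to skip, so the sketch assumes the snapshot records the number of log lines it covers.

```python
import json
import os
import tempfile


class ToyMTree:
    """Stand-in for the MTree: just a set of timeseries paths (an assumption)."""

    def __init__(self):
        self.paths = set()

    def apply(self, line):
        # Hypothetical log format "<op>,<path>"; not IoTDB's real mlog format.
        op, path = line.strip().split(",", 1)
        if op == "create":
            self.paths.add(path)
        elif op == "delete":
            self.paths.discard(path)


class MetaStore:
    """Keeps mlog.txt and a single snapshot file in the same directory."""

    def __init__(self, directory, snapshot_threshold=100_000):
        self.log_path = os.path.join(directory, "mlog.txt")
        self.snapshot_path = os.path.join(directory, "mtree.snapshot")
        self.threshold = snapshot_threshold
        self.tree = ToyMTree()
        self.log_lines = 0   # total number of lines in mlog.txt
        self.replayed = 0    # lines actually executed during recovery
        self._recover()

    def _recover(self):
        covered = 0
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                snap = json.load(f)
            self.tree.paths = set(snap["paths"])
            covered = snap["log_lines"]  # assumed field: lines the snapshot covers
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for i, line in enumerate(f):
                    self.log_lines += 1
                    if i >= covered:  # only execute commands newer than the snapshot
                        self.tree.apply(line)
                        self.replayed += 1

    def execute(self, op, path):
        line = f"{op},{path}"
        with open(self.log_path, "a") as f:
            f.write(line + "\n")
        self.tree.apply(line)
        self.log_lines += 1
        if self.log_lines % self.threshold == 0:
            self._write_snapshot()

    def _write_snapshot(self):
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"log_lines": self.log_lines,
                       "paths": sorted(self.tree.paths)}, f)
        os.replace(tmp, self.snapshot_path)  # the new snapshot replaces the old one


def demo():
    with tempfile.TemporaryDirectory() as d:
        store = MetaStore(d, snapshot_threshold=4)
        for i in range(10):
            store.execute("create", f"root.sg.d1.s{i}")
        restarted = MetaStore(d, snapshot_threshold=4)  # simulate a restart
        return restarted.replayed, len(restarted.tree.paths)
```

In this toy run with a threshold of 4, the last snapshot covers 8 log lines, so the simulated restart executes only the remaining 2 of the 10 commands and still ends up with all 10 timeseries. The sketch still scans the skipped lines to count them; a real implementation would likely seek past them or truncate the log.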
