Greetings,

I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1].

When there are a large number of timeseries, restarting IoTDB takes a long 
time, because it reads mlog.txt and executes the commands line by line.
For example, it takes about 2 minutes to restart with 20M timeseries.

To solve this problem, a “checkpoint” is added to MTree to reduce the time 
spent reading mlog when IoTDB restarts: 
a snapshot, which contains the serialized MTree, is generated every time mlog 
grows by a certain number of lines.
When a new snapshot is generated, the old one is deleted. The snapshot file and 
mlog.txt are in the same directory.

Users can configure the threshold number of mlog lines. By default, a 
snapshot is generated every 100k lines.
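
To make the idea concrete, here is a minimal sketch in Java. It is only an 
illustration, not the demo code: the class name, the snapshot file name 
"mtree.snapshot", and the first snapshot line that records how many mlog lines 
the snapshot covers are all assumptions, and a plain list of strings stands in 
for the real MTree.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not IoTDB code: a list of strings stands in for MTree.
class MLogCheckpointSketch {
  private final Path mlog;      // operation log, one command per line
  private final Path snapshot;  // latest snapshot, same directory as mlog.txt
  private final int threshold;  // mlog lines between snapshots (assumed config value)
  private final List<String> tree = new ArrayList<>();
  private int totalLines = 0;          // lines written to mlog so far
  private int linesSinceSnapshot = 0;  // lines written since the last snapshot

  MLogCheckpointSketch(Path dir, int threshold) {
    this.mlog = dir.resolve("mlog.txt");
    this.snapshot = dir.resolve("mtree.snapshot"); // assumed file name
    this.threshold = threshold;
  }

  // Apply a command, append it to mlog, and snapshot once the threshold is hit.
  void execute(String command) throws IOException {
    tree.add(command);
    Files.write(mlog, (command + "\n").getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    totalLines++;
    if (++linesSinceSnapshot >= threshold) {
      writeSnapshot();
    }
  }

  // Write the new snapshot to a temp file, then replace the old snapshot with it.
  private void writeSnapshot() throws IOException {
    List<String> out = new ArrayList<>();
    out.add(Integer.toString(totalLines)); // assumption: header = mlog lines covered
    out.addAll(tree);
    Path tmp = snapshot.resolveSibling("mtree.snapshot.tmp");
    Files.write(tmp, out, StandardCharsets.UTF_8);
    Files.move(tmp, snapshot, StandardCopyOption.REPLACE_EXISTING);
    linesSinceSnapshot = 0;
  }

  // Restart: load the snapshot if present, then replay only the uncovered mlog lines.
  static MLogCheckpointSketch recover(Path dir, int threshold) throws IOException {
    MLogCheckpointSketch s = new MLogCheckpointSketch(dir, threshold);
    int covered = 0;
    if (Files.exists(s.snapshot)) {
      List<String> lines = Files.readAllLines(s.snapshot, StandardCharsets.UTF_8);
      covered = Integer.parseInt(lines.get(0));
      s.tree.addAll(lines.subList(1, lines.size()));
    }
    if (Files.exists(s.mlog)) {
      List<String> log = Files.readAllLines(s.mlog, StandardCharsets.UTF_8);
      s.tree.addAll(log.subList(covered, log.size()));
      s.totalLines = log.size();
      s.linesSinceSnapshot = log.size() - covered;
    }
    return s;
  }

  int size() {
    return tree.size();
  }
}
```

In this sketch, a restart only re-executes the mlog lines written after the 
latest snapshot, which is where the time saving would come from. Whether the 
covered lines are skipped (as here) or removed from mlog is a design choice 
that this sketch does not settle.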

I’ve already built a demo, which shows that the method speeds up the restart 
process.
For the part that reads mlog.txt and initializes MTree, it reduces the time by 
28.3% (16.6s with the original method, 11.9s with the demo, both for 2M 
timeseries).

I would like to open a PR afterwards. If you have any suggestions about the 
design, feel free to discuss them with me.


[1] https://issues.apache.org/jira/browse/IOTDB-726


Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院
