Hi Zesong,

This is an excellent idea! Do you serialize the snapshot as a plain text file? Or would it make sense to use something like RocksDB for this?
Julian

On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:

Greetings,

I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1].

When there is a large number of timeseries, restarting IoTDB takes a long time, because it reads mlog.txt and executes the commands line by line. For example, it takes about 2 minutes to restart with 20M timeseries.

To solve this problem, a “checkpoint” is designed and added to MTree to reduce the time spent reading mlog when IoTDB restarts:

- Generate a snapshot, which contains the serialized MTree, every time mlog reaches a certain number of lines.
- When a new snapshot is generated, the old one is deleted.
- The snapshot file and mlog.txt are in the same directory.
- Users can configure the threshold number of mlog lines. By default, a snapshot is generated every 100k lines.

I’ve already made a demo and shown that the method speeds up the restart process. For the part that reads mlog.txt and initializes MTree, it reduces the time by 28.3% (16.6 s with the original method, 11.9 s with the new demo, both for 2M timeseries).

I would like to make a PR afterwards. If you have any suggestions about the design, feel free to discuss them with me.

[1] https://issues.apache.org/jira/browse/IOTDB-726

Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院
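
For illustration only, the checkpoint scheme described above could be sketched roughly as follows. This is not the IoTDB implementation (IoTDB is written in Java). The class `MTreeSketch`, the file name `mtree.snapshot`, the `op,path` log line format, the JSON snapshot format, and the idea of storing the number of covered mlog lines inside the snapshot are all assumptions made for the sketch. The tree itself is reduced to a plain set of timeseries paths.

```python
import json
import os
import tempfile


class MTreeSketch:
    """Toy stand-in for MTree: a set of timeseries paths (hypothetical)."""

    def __init__(self, directory, threshold=100_000):
        # Snapshot and mlog.txt live in the same directory, as in the proposal.
        self.mlog_path = os.path.join(directory, "mlog.txt")
        self.snapshot_path = os.path.join(directory, "mtree.snapshot")  # assumed name
        self.threshold = threshold   # snapshot every `threshold` mlog lines
        self.paths = set()
        self.logged_lines = 0        # total lines currently in mlog.txt
        self.replayed_lines = 0      # lines re-executed during recovery
        self._recover()

    def _apply(self, line):
        # Assumed log format: "<op>,<path>"
        op, path = line.split(",", 1)
        if op == "create":
            self.paths.add(path)
        elif op == "delete":
            self.paths.discard(path)

    def _recover(self):
        covered = 0
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                snap = json.load(f)
            self.paths = set(snap["paths"])
            covered = snap["covered_lines"]
        if os.path.exists(self.mlog_path):
            with open(self.mlog_path, encoding="utf-8") as f:
                for i, line in enumerate(f):
                    # Only lines written after the snapshot need replaying.
                    if i >= covered:
                        self._apply(line.rstrip("\n"))
                        self.replayed_lines += 1
                    self.logged_lines += 1

    def execute(self, op, path):
        line = f"{op},{path}"
        with open(self.mlog_path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
        self._apply(line)
        self.logged_lines += 1
        if self.logged_lines % self.threshold == 0:
            self._write_snapshot()

    def _write_snapshot(self):
        # Write to a temporary file, then replace the old snapshot in one step.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump({"covered_lines": self.logged_lines,
                       "paths": sorted(self.paths)}, f)
        os.replace(tmp, self.snapshot_path)


def demo():
    with tempfile.TemporaryDirectory() as d:
        tree = MTreeSketch(d, threshold=3)
        for i in range(7):
            tree.execute("create", f"root.sg.d{i}.s")
        restarted = MTreeSketch(d, threshold=3)  # simulated restart
        return restarted.paths == tree.paths, restarted.replayed_lines
```

In `demo()`, seven commands are logged with a threshold of three, so the latest snapshot covers six lines and the simulated restart only has to replay the last one. Whether a plain-text, binary, or key-value (e.g. RocksDB) format suits the real snapshot is left open here.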
