Hi Zesong,

This is an excellent idea!
Do you serialize the snapshot as a plain text file?
Or would it make sense to use something like RocksDB for it?

Julian

On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:

    Greetings,

    I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1].

    When there are a large number of timeseries, restarting IoTDB takes a long
    time, because it reads mlog.txt and executes the commands line by line.
    For example, it takes about 2 minutes to restart with 20M timeseries.

    To solve this problem, a “checkpoint” is added to MTree to reduce the time
    spent reading mlog when IoTDB restarts:
    A snapshot containing the serialized MTree is generated every time mlog
    grows by a certain number of lines.
    When a new snapshot is generated, the old one is deleted. The snapshot file
    and mlog.txt are in the same directory.

    Users can configure the threshold number of mlog lines. By default, a
    snapshot is generated every 100k lines.
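
The checkpoint scheme described above can be sketched roughly as follows. This is only an illustration in Java under stated assumptions, not the demo's actual code: the class CheckpointSketch, the file name mtree.snapshot, the "create,<path>" / "delete,<path>" command format, and the sorted set standing in for MTree are all hypothetical. It also assumes the snapshot records how many mlog lines it covers, so that a restart replays only the lines written after it; the original message does not say how this is tracked.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of snapshot-plus-log recovery; not IoTDB's real MTree code. */
public class CheckpointSketch {
    private final Path log;       // stands in for mlog.txt
    private final Path snapshot;  // assumed snapshot file name, same directory
    private final int threshold;  // number of log lines between snapshots
    private final TreeSet<String> tree = new TreeSet<>(); // stand-in for MTree
    private int linesInLog = 0;
    private int replayedOnStart = 0;

    public CheckpointSketch(Path dir, int threshold) throws IOException {
        this.log = dir.resolve("mlog.txt");
        this.snapshot = dir.resolve("mtree.snapshot");
        this.threshold = threshold;
        recover();
    }

    /** Load the snapshot if present, then replay only the log lines after it. */
    private void recover() throws IOException {
        int covered = 0;
        if (Files.exists(snapshot)) {
            List<String> lines = Files.readAllLines(snapshot, StandardCharsets.UTF_8);
            covered = Integer.parseInt(lines.get(0)); // first line: log lines covered
            tree.addAll(lines.subList(1, lines.size()));
        }
        if (Files.exists(log)) {
            List<String> cmds = Files.readAllLines(log, StandardCharsets.UTF_8);
            linesInLog = cmds.size();
            for (int i = covered; i < cmds.size(); i++) {
                apply(cmds.get(i));
                replayedOnStart++;
            }
        }
    }

    /** Apply one command of the assumed form "create,<path>" or "delete,<path>". */
    private void apply(String cmd) {
        String[] parts = cmd.split(",", 2);
        if (parts.length < 2) {
            return; // ignore malformed lines in this sketch
        }
        if (parts[0].equals("create")) {
            tree.add(parts[1]);
        } else if (parts[0].equals("delete")) {
            tree.remove(parts[1]);
        }
    }

    /** Append the command to the log, apply it, and snapshot at each threshold. */
    public void execute(String cmd) throws IOException {
        Files.write(log, (cmd + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        apply(cmd);
        linesInLog++;
        if (linesInLog % threshold == 0) {
            writeSnapshot();
        }
    }

    /** Write the new snapshot to a temp file, then replace the old snapshot. */
    private void writeSnapshot() throws IOException {
        List<String> out = new ArrayList<>();
        out.add(Integer.toString(linesInLog));
        out.addAll(tree);
        Path tmp = snapshot.resolveSibling("mtree.snapshot.tmp");
        Files.write(tmp, out, StandardCharsets.UTF_8);
        Files.move(tmp, snapshot, StandardCopyOption.REPLACE_EXISTING);
    }

    public int size() { return tree.size(); }
    public boolean contains(String path) { return tree.contains(path); }
    public int replayedOnStart() { return replayedOnStart; }
}
```

Writing the new snapshot to a temporary file before replacing the old one means a crash during serialization leaves the previous snapshot usable. Whether the real implementation needs this, or truncates mlog after a snapshot, is a design choice this sketch does not settle.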

    I’ve already built a demo showing that this method speeds up the restart.
    For the part that reads mlog.txt and initializes MTree, it reduces the time
    by 28.3% (16.6s with the original method, 11.9s with the new demo, both for
    2M timeseries).

    I would like to open a PR afterwards. If you have any suggestions about the
    design, feel free to discuss them with me.


    [1] https://issues.apache.org/jira/browse/IOTDB-726


    Best,
    -----------------------------------
    Zesong Sun
    School of Software, Tsinghua University

    孙泽嵩
    清华大学 软件学院

