Hi,

Currently, a snapshot is triggered every xxx lines in mlog.txt. With 20M timeseries, the default threshold of 10k lines triggers too many snapshots, which blocks timeseries creation. However, if we raise the threshold to 1M, the last 1M lines take about 6s to recover, about 160K lines per second.
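As a rough check of these figures, here is a small sketch. It assumes one mlog line per created timeseries, which is my assumption and not something measured in this thread; the function names are only for illustration.

```python
def snapshot_count(total_lines: int, threshold: int) -> int:
    """Number of snapshots triggered while writing total_lines mlog lines."""
    return total_lines // threshold


def replay_rate(lines: int, seconds: float) -> float:
    """Average number of mlog lines replayed per second."""
    return lines / seconds


# Assumption: creating 20M timeseries writes 20M lines to mlog.txt.
TOTAL_LINES = 20_000_000

for threshold in (10_000, 1_000_000):
    print(f"threshold={threshold}: {snapshot_count(TOTAL_LINES, threshold)} snapshots")

# Replaying the last 1M lines in about 6s, as reported above.
print(f"replay rate: {replay_rate(1_000_000, 6.0):.0f} lines/s")
```

Under that assumption, a 10k threshold means 2000 snapshots while creating 20M timeseries, versus 20 with a 1M threshold, and 1M lines in 6s works out to roughly 167K lines per second.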
So, my concerns are: how long does it take to create a snapshot for 1M/10M timeseries, and how long does it take to recover from a 1M-timeseries snapshot?

Besides, how about adding a "create snapshot for schema" SQL statement to let users trigger this manually?

Thanks,
--
Jialin Qiao (乔嘉林)
School of Software, Tsinghua University

> -----Original Message-----
> From: "孙泽嵩" <[email protected]>
> Sent: 2020-06-15 19:14:08 (Monday)
> To: [email protected]
> Cc:
> Subject: Re: [IOTDB-726] CheckPoint of MTree
>
> Hi Julian,
>
> Currently I’m just using a plain text file.
>
> But I could consider trying RocksDB : )
> I also noticed that there is an issue related to RocksDB integration [1].
>
> [1] https://issues.apache.org/jira/browse/IOTDB-767
>
> Best,
> -----------------------------------
> Zesong Sun (孙泽嵩)
> School of Software, Tsinghua University
>
> > On June 15, 2020, at 19:00, Julian Feinauer <[email protected]> wrote:
> >
> > Hi Zesong,
> >
> > this is an excellent idea!
> > Do you serialize the snapshot as a plain text file?
> > Or would it make sense to use something like RocksDB for that?
> >
> > Julian
> >
> > On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:
> >
> >     Greetings,
> >
> >     I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1].
> >
> >     When there are a large number of timeseries, it takes a long time
> >     to restart IoTDB, because mlog.txt is read and its commands are
> >     executed line by line.
> >     For example, it takes about 2 minutes to restart with 20M timeseries.
> >
> >     To solve this problem, a “checkpoint” is designed and added to MTree
> >     to reduce the time spent reading mlog when IoTDB restarts:
> >     Generate a snapshot, which includes the serialization of MTree, every
> >     time mlog reaches a certain number of lines.
> >     When a new snapshot is generated, the old one is deleted. The snapshot
> >     file and mlog.txt are in the same directory.
> >
> >     Users can configure the threshold number of mlog lines. By default,
> >     a snapshot is generated every 100k lines.
> >
> >     I’ve already made a demo and shown that the method speeds up the
> >     restart process.
> >     For the part that reads mlog.txt and initializes MTree, it reduces the
> >     time by 28.3% (16.6s with the original method, 11.9s with the new
> >     demo, both for 2M timeseries).
> >
> >     I would like to make a PR afterwards. If you have any suggestions
> >     about the design, feel free to discuss them with me.
> >
> >     [1] https://issues.apache.org/jira/browse/IOTDB-726
> >
> >     Best,
> >     -----------------------------------
> >     Zesong Sun (孙泽嵩)
> >     School of Software, Tsinghua University
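For reference, the checkpoint scheme described in the quoted proposal (snapshot every N mlog lines, replace the old snapshot, replay only the remaining lines on restart) could be sketched roughly as below. This is a toy model under stated assumptions, not IoTDB code: the class `MetadataLog`, the file name `mtree.snapshot`, the JSON snapshot format, and the `create,<path>` log line format are all hypothetical, and a plain set stands in for MTree.

```python
import json
import os
import tempfile


class MetadataLog:
    """Toy stand-in for MTree plus mlog.txt; every name here is hypothetical."""

    def __init__(self, directory, snapshot_threshold):
        self.mlog_path = os.path.join(directory, "mlog.txt")
        self.snapshot_path = os.path.join(directory, "mtree.snapshot")
        self.snapshot_threshold = snapshot_threshold
        self.tree = set()      # stands in for the MTree
        self.lines_logged = 0  # total lines written to mlog so far
        self.replayed = 0      # lines replayed during the last recovery
        self.recover()

    def create_timeseries(self, path):
        # Log the command first, then apply it to the in-memory tree.
        with open(self.mlog_path, "a", encoding="utf-8") as log:
            log.write(f"create,{path}\n")
        self.tree.add(path)
        self.lines_logged += 1
        if self.lines_logged % self.snapshot_threshold == 0:
            self.create_snapshot()

    def create_snapshot(self):
        # Write to a temporary file, then rename it over the old snapshot,
        # so only one snapshot file exists next to mlog.txt.
        tmp_path = self.snapshot_path + ".tmp"
        state = {"lines": self.lines_logged, "paths": sorted(self.tree)}
        with open(tmp_path, "w", encoding="utf-8") as out:
            json.dump(state, out)
        os.replace(tmp_path, self.snapshot_path)

    def recover(self):
        covered = 0  # number of mlog lines already reflected in the snapshot
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as src:
                state = json.load(src)
            self.tree = set(state["paths"])
            covered = state["lines"]
        self.lines_logged = covered
        self.replayed = 0
        if os.path.exists(self.mlog_path):
            with open(self.mlog_path, encoding="utf-8") as log:
                for index, line in enumerate(log):
                    if index < covered:
                        continue  # already contained in the snapshot
                    _, path = line.rstrip("\n").split(",", 1)
                    self.tree.add(path)
                    self.lines_logged += 1
                    self.replayed += 1


with tempfile.TemporaryDirectory() as directory:
    store = MetadataLog(directory, snapshot_threshold=100)
    for i in range(250):
        store.create_timeseries(f"root.sg.d{i}.s0")
    # Simulate a restart: load the snapshot, then replay the mlog tail.
    restarted = MetadataLog(directory, snapshot_threshold=100)
    print(len(restarted.tree), restarted.replayed)
```

In this run the last snapshot covers 200 lines, so the simulated restart replays only the 50 lines logged after it. The sketch still scans the covered lines to skip them; storing a byte offset in the snapshot, or truncating mlog after each snapshot, would avoid that, and a manual trigger as suggested above would simply call `create_snapshot()` directly.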
