Re: [IOTDB-726] CheckPoint of MTree

孙泽嵩 Thu, 18 Jun 2020 01:20:05 -0700

Hi,

Good points!


> how about adding a "create snapshot for schema" sql to let users trigger this 
> manually

I’ll add this SQL statement in a new PR.

>  how long it takes to recover from a 1M timeseries snapshot.

Based on my previous experiment, it takes about 6s as you said.

> how long it takes to create a snapshot for 1M/10M timeseries?

I haven’t timed this yet … I’ll run an experiment after addressing the 
suggested changes in the current PR [1].


[1] https://github.com/apache/incubator-iotdb/pull/1384


Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

> On June 18, 2020, at 14:39, Jialin Qiao <[email protected]> wrote:
> 
> Hi,
> 
> Currently, a snapshot is triggered every xxx lines in mlog.txt. 
> With 20M timeseries, the default of 10k lines will cause too many 
> snapshots, which will block timeseries creation.
> However, if we raise the threshold to 1M, the last 1M lines will take about 
> 6s to recover, at about 160K lines per second.
> 
> So, my concern is how long it takes to create a snapshot for 1M/10M 
> timeseries? And how long it takes to recover from a 1M timeseries snapshot.
> 
> Besides, how about adding a "create snapshot for schema" sql to let users 
> trigger this manually?
> 
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
> 
>> -----Original Message-----
>> From: "孙泽嵩" <[email protected]>
>> Sent: 2020-06-15 19:14:08 (Monday)
>> To: [email protected]
>> Cc: 
>> Subject: Re: [IOTDB-726] CheckPoint of MTree
>> 
>> Hi Julian,
>> 
>> Currently I’m just using a plain text file.
>> 
>> But I could consider trying RocksDB : )
>> I also noticed that there is an issue related to RocksDB integration [1].
>> 
>> 
>> [1] https://issues.apache.org/jira/browse/IOTDB-767
>> 
>> 
>> Best,
>> -----------------------------------
>> Zesong Sun
>> School of Software, Tsinghua University
>> 
>> 孙泽嵩
>> 清华大学 软件学院
>> 
>>> On June 15, 2020, at 19:00, Julian Feinauer <[email protected]> wrote:
>>> 
>>> Hi Zesong,
>>> 
>>> This is an excellent idea!
>>> Do you serialize the snapshot as a plain text file?
>>> Or would it make sense to use something like RocksDB for that?
>>> 
>>> Julian
>>> 
>>> On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:
>>> 
>>>   Greetings,
>>> 
>>>   I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1]
>>> 
>>>   When there are a large number of timeseries, restarting IoTDB takes a 
>>> long time because it reads mlog.txt and executes the commands line by line.
>>>   For example, it takes about 2 minutes to restart with 20M timeseries.
>>> 
>>>   To solve this problem, a “checkpoint” is designed and added to MTree to 
>>> reduce the time spent reading mlog when IoTDB restarts: 
>>>   A snapshot, which contains the serialized MTree, is generated every 
>>> time mlog reaches a certain number of lines.
>>>   When a new snapshot is generated, the old one is deleted. The snapshot 
>>> file and mlog.txt are in the same directory.
>>> 
>>>   Users can configure the threshold number of mlog lines. By default, 
>>> a snapshot is generated every 100k lines.
>>> 
>>>   I’ve already built a demo, which shows that this method speeds up the 
>>> restart process.
>>>   For the part that reads mlog.txt and initializes MTree, it reduces the 
>>> time by 28.3% (16.6s with the original method, 11.9s with the new demo, 
>>> both for 2M timeseries).
>>> 
>>>   I would like to make a PR afterwards. If you have any suggestions about 
>>> the design, feel free to discuss them with me.
>>> 
>>> 
>>>   [1] https://issues.apache.org/jira/browse/IOTDB-726
>>> 
>>> 
>>>   Best,
>>>   -----------------------------------
>>>   Zesong Sun
>>>   School of Software, Tsinghua University
>>> 
>>>   孙泽嵩
>>>   清华大学 软件学院
>>> 
>>> 
>>
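
For readers following the thread, the checkpoint idea described in the original message (write a snapshot every N mlog lines, replace the old snapshot, and on restart load the snapshot and replay only the remaining log lines) can be sketched roughly as below. This is a minimal Python illustration, not the code from the PR: the class SchemaStore, the file name mtree.snapshot, the JSON snapshot format, and the "create"/"delete" command syntax are all assumptions made for this example. Calling create_snapshot() directly corresponds to the manually triggered snapshot suggested in the thread.

```python
import json
import os
import tempfile


class SchemaStore:
    """Toy stand-in for MTree: a set of timeseries paths backed by a text log."""

    def __init__(self, directory, snapshot_threshold=100_000):
        self.log_path = os.path.join(directory, "mlog.txt")
        # Hypothetical snapshot file name; it lives next to mlog.txt.
        self.snapshot_path = os.path.join(directory, "mtree.snapshot")
        self.snapshot_threshold = snapshot_threshold
        self.paths = set()
        self.log_lines = 0             # total lines currently in mlog.txt
        self.lines_since_snapshot = 0  # lines not yet covered by a snapshot
        self.replayed = 0              # lines re-executed during recovery
        self._recover()

    def _apply(self, command):
        # Assumed command syntax: "create <path>" or "delete <path>".
        op, path = command.split(" ", 1)
        if op == "create":
            self.paths.add(path)
        elif op == "delete":
            self.paths.discard(path)

    def _recover(self):
        covered = 0
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                snap = json.load(f)
            self.paths = set(snap["paths"])
            covered = snap["log_lines"]  # log lines already in the snapshot
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for index, line in enumerate(f):
                    self.log_lines += 1
                    if index >= covered:  # replay only the uncovered tail
                        self._apply(line.rstrip("\n"))
                        self.replayed += 1
        self.lines_since_snapshot = self.replayed

    def execute(self, command):
        with open(self.log_path, "a") as f:
            f.write(command + "\n")
        self._apply(command)
        self.log_lines += 1
        self.lines_since_snapshot += 1
        if self.lines_since_snapshot >= self.snapshot_threshold:
            self.create_snapshot()

    def create_snapshot(self):
        # Write to a temporary file, then swap it in so that the old
        # snapshot is replaced only once the new one is complete.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"log_lines": self.log_lines,
                       "paths": sorted(self.paths)}, f)
        os.replace(tmp, self.snapshot_path)
        self.lines_since_snapshot = 0


with tempfile.TemporaryDirectory() as d:
    store = SchemaStore(d, snapshot_threshold=4)
    for i in range(10):
        store.execute(f"create root.sg.d{i}.s")
    store.execute("delete root.sg.d0.s")

    # Simulate a restart: load the snapshot, then replay the log tail.
    restarted = SchemaStore(d, snapshot_threshold=4)
    same_schema = restarted.paths == store.paths
    replayed_after_restart = restarted.replayed
```

With a threshold of 4 and 11 logged commands, snapshots are taken after lines 4 and 8, so the simulated restart replays only the last 3 lines. The trade-off raised in the thread shows up directly: a smaller threshold means more frequent snapshot writes, while a larger one leaves a longer log tail to replay on restart.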
