Hi Jialin,

I ran an experiment with 1M timeseries, and the serialization process took 
971 ms.

Maybe we could consider creating a snapshot when the MTree has not changed for 
a long time (for example, one hour).

This way, the client will not be blocked, and users may not even notice the 
snapshot.
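
A minimal sketch of such an idle-based trigger, in Python for brevity. This is 
not IoTDB code: the class name, method names, and the `dirty` flag are all 
hypothetical, and the one-hour value is only the example figure mentioned 
above. It just illustrates the rule "snapshot only when there are unsaved 
changes and no change has happened for the idle period".

```python
import time


class IdleSnapshotTrigger:
    """Hypothetical helper: signals a snapshot once the schema has been idle."""

    def __init__(self, idle_seconds=3600.0, clock=time.monotonic):
        self.idle_seconds = idle_seconds  # assumed idle period (one hour)
        self.clock = clock                # injectable clock, useful for testing
        self.last_change = clock()        # time of the most recent schema change
        self.dirty = False                # True if changes are not yet in a snapshot

    def on_schema_change(self):
        """Call whenever the in-memory schema tree is modified."""
        self.last_change = self.clock()
        self.dirty = True

    def should_snapshot(self):
        """True only if there are unsaved changes and the idle period has elapsed."""
        idle_for = self.clock() - self.last_change
        return self.dirty and idle_for >= self.idle_seconds

    def on_snapshot_done(self):
        """Call after a snapshot has been written successfully."""
        self.dirty = False
```

A background thread could poll `should_snapshot()` periodically and write the 
snapshot when it returns true, so the work happens while no client is 
creating timeseries.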


Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

> On June 18, 2020, at 16:19, 孙泽嵩 <[email protected]> wrote:
> 
> Hi,
> 
> Good points!
> 
>> how about adding a "create snapshot for schema" sql to let users trigger 
>> this manually
> 
> I’ll add this SQL statement in a new PR.
> 
>> how long it takes to recover from a 1M timeseries snapshot.
> 
> Based on my previous experiment, it takes about 6s as you said.
> 
>> how long it takes to create a snapshot for 1M/10M timeseries?
> 
> I didn’t time this … I’ll run an experiment after addressing the suggested 
> changes in the current PR [1].
> 
> 
> [1] https://github.com/apache/incubator-iotdb/pull/1384
> 
> 
> Best,
> -----------------------------------
> Zesong Sun
> School of Software, Tsinghua University
> 
> 孙泽嵩
> 清华大学 软件学院
> 
>> On June 18, 2020, at 14:39, Jialin Qiao <[email protected]> wrote:
>> 
>> Hi,
>> 
>> Currently, a snapshot is triggered every xxx lines in mlog.txt. 
>> With 20M timeseries, the default of 10k lines will cause too many 
>> snapshots, which will block timeseries creation.
>> However, if we raise the threshold to 1M lines, the last 1M lines will take 
>> about 6s to recover, at about 160K lines per second.
>> 
>> So my question is: how long does it take to create a snapshot for 1M/10M 
>> timeseries? And how long does it take to recover from a 1M-timeseries 
>> snapshot?
>> 
>> Besides, how about adding a "create snapshot for schema" SQL statement to 
>> let users trigger this manually?
>> 
>> Thanks,
>> --
>> Jialin Qiao
>> School of Software, Tsinghua University
>> 
>> 乔嘉林
>> 清华大学 软件学院
>> 
>>> -----Original Message-----
>>> From: "孙泽嵩" <[email protected]>
>>> Sent: 2020-06-15 19:14:08 (Monday)
>>> To: [email protected]
>>> Cc: 
>>> Subject: Re: [IOTDB-726] CheckPoint of MTree
>>> 
>>> Hi Julian,
>>> 
>>> Currently I’m just using a plain text file.
>>> 
>>> But I could consider trying RocksDB : )
>>> I also noticed that there is an issue related to RocksDB integration [1].
>>> 
>>> 
>>> [1] https://issues.apache.org/jira/browse/IOTDB-767
>>> 
>>> 
>>> Best,
>>> -----------------------------------
>>> Zesong Sun
>>> School of Software, Tsinghua University
>>> 
>>> 孙泽嵩
>>> 清华大学 软件学院
>>> 
>>>> On June 15, 2020, at 19:00, Julian Feinauer <[email protected]> wrote:
>>>> 
>>>> Hi Zesong,
>>>> 
>>>> this is an excellent idea!
>>>> Do you serialize the snapshot as a plain text file?
>>>> Or would it make sense to use something like RocksDB for that?
>>>> 
>>>> Julian
>>>> 
>>>> On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:
>>>> 
>>>>  Greetings,
>>>> 
>>>>  I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1]
>>>> 
>>>>  When there is a large number of timeseries, restarting IoTDB takes a 
>>>> long time, because it reads mlog.txt and executes the commands line by 
>>>> line.
>>>>  For example, it takes about 2 minutes to restart with 20M timeseries.
>>>> 
>>>>  To solve this problem, a “checkpoint” is designed and added to MTree to 
>>>> reduce the time spent reading mlog when IoTDB restarts: 
>>>>  A snapshot, which contains the serialized MTree, is generated every 
>>>> time mlog reaches a certain number of lines.
>>>>  When a new snapshot is generated, the old one is deleted. The snapshot 
>>>> file and mlog.txt are in the same directory.
>>>> 
>>>>  Users can configure the threshold number of mlog lines. By default, 
>>>> a snapshot is generated every 100k lines.
>>>> 
>>>>  I’ve already built a demo and confirmed that the method speeds up the 
>>>> restart process.
>>>>  For the part that reads mlog.txt and initializes the MTree, it reduces 
>>>> the time by 28.3% (16.6s with the original method, 11.9s with the new 
>>>> demo, both for 2M timeseries).
>>>> 
>>>>  I would like to open a PR afterwards. If you have any suggestions about 
>>>> the design, feel free to discuss them with me.
>>>> 
>>>> 
>>>>  [1] https://issues.apache.org/jira/browse/IOTDB-726
>>>> 
>>>> 
>>>>  Best,
>>>>  -----------------------------------
>>>>  Zesong Sun
>>>>  School of Software, Tsinghua University
>>>> 
>>>>  孙泽嵩
>>>>  清华大学 软件学院
>>>> 
>>>> 
>>> 
> 
