Re: [IOTDB-726] CheckPoint of MTree

Jialin Qiao Fri, 19 Jun 2020 01:44:08 -0700

Hi,

The goal of the MTree checkpoint is to speed up deserialization on
restart.

So it is fine to simply find an idle period of the MTree and take the
snapshot asynchronously.
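
As a rough illustration only, the idle-time check could look like the Java sketch below. The class and method names are hypothetical and not taken from the IoTDB code base, and the snapshot task is just a placeholder for "serialize the MTree to disk".

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; no class or method here is taken from the IoTDB code base.
class IdleSnapshotSketch {
  private final long idleMillis;        // how long the tree must stay unchanged
  private final Runnable snapshotTask;  // placeholder for "serialize MTree to disk"
  private final AtomicLong version = new AtomicLong();            // bumped on every schema change
  private final AtomicLong snapshottedVersion = new AtomicLong(); // version covered by the last snapshot
  private final AtomicLong lastChangeMillis = new AtomicLong();

  IdleSnapshotSketch(long idleMillis, Runnable snapshotTask) {
    this.idleMillis = idleMillis;
    this.snapshotTask = snapshotTask;
  }

  // Called by the write path whenever the schema changes.
  void onSchemaChange(long nowMillis) {
    lastChangeMillis.set(nowMillis);
    version.incrementAndGet();
  }

  // Takes a snapshot only if there are unsaved changes and the tree has been idle long enough.
  boolean maybeSnapshot(long nowMillis) {
    long seen = version.get();
    boolean dirty = seen != snapshottedVersion.get();
    boolean idle = nowMillis - lastChangeMillis.get() >= idleMillis;
    if (!dirty || !idle) {
      return false;
    }
    snapshotTask.run();
    // Record only the version observed before the snapshot started, so that
    // changes arriving during the snapshot still count as unsaved.
    snapshottedVersion.set(seen);
    return true;
  }

  // Runs the idle check on a background thread instead of on client threads.
  ScheduledExecutorService startBackgroundCheck(long periodMillis) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleAtFixedRate(
        () -> maybeSnapshot(System.currentTimeMillis()),
        periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    return pool;
  }

  public static void main(String[] args) {
    IdleSnapshotSketch s =
        new IdleSnapshotSketch(1000, () -> System.out.println("snapshot taken"));
    s.onSchemaChange(0);
    System.out.println(s.maybeSnapshot(500));   // changed too recently
    System.out.println(s.maybeSnapshot(1500));  // idle with unsaved changes
    System.out.println(s.maybeSnapshot(3000));  // nothing new since the last snapshot
  }
}
```

The sketch leaves out how the snapshot task gets a consistent view of the tree while writes continue; that part would still need to be decided.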

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Xiangdong Huang" <[email protected]>
> Sent: 2020-06-19 16:36:21 (Friday)
> To: dev <[email protected]>
> Cc: 
> Subject: Re: [IOTDB-726] CheckPoint of MTree
> 
> > I did an experiment for 1M timeseries, and the serialization process
> > costs 971ms.
> 
> 971 ms to serialize 1M timeseries, but 6 seconds to deserialize?
> 
> > I didn’t time this … I’ll do an experiment after fixing the suggested
> > changes in current PR [1]
> 
> The problem with the current PR is that the snapshot grows larger and
> larger as the system keeps running.
> Any idea how to handle this case?
> 
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
> 
>  黄向东
> 清华大学 软件学院
> 
> 
> 孙泽嵩 <[email protected]> wrote on Fri, Jun 19, 2020 at 2:20 PM:
> 
> > Wow, thanks, Julian!
> >
> > Let me try and do experiments to get the best result : )
> >
> > Best,
> > -----------------------------------
> > Zesong Sun
> > School of Software, Tsinghua University
> >
> > 孙泽嵩
> > 清华大学 软件学院
> >
> > > On Jun 19, 2020, at 14:14, Julian Feinauer <[email protected]> wrote:
> > >
> > > Oh, and another note: by using a serialization library that is faster
> > > than the Java default, we could ideally speed up the process by up to 10x.
> > >
> > > See e.g. https://github.com/RuedigerMoeller/fast-serialization
> > >
> > > Julian
> > >
> > > ________________________________
> > > From: Julian Feinauer <[email protected]>
> > > Sent: Friday, June 19, 2020 8:11:56 AM
> > > To: [email protected] <[email protected]>
> > > Subject: Re: [IOTDB-726] CheckPoint of MTree
> > >
> > > What about using some kind of cache that spills to disk? That way we
> > > would be up in no time and could just lazy-load devices when needed.
> > >
> > > I remember that Ehcache has such features
> > > (https://www.baeldung.com/ehcache), but there are other implementations
> > > as well.
> > >
> > > Julian
> > >
> > > ________________________________
> > > From: 孙泽嵩 <[email protected]>
> > > Sent: Friday, June 19, 2020 7:57:51 AM
> > > To: [email protected] <[email protected]>
> > > Subject: Re: [IOTDB-726] CheckPoint of MTree
> > >
> > > Hi Jialin,
> > >
> > > I did an experiment for 1M timeseries, and the serialization process
> > > costs 971ms.
> > >
> > > Maybe we could create a snapshot when the MTree has not changed for a
> > > long time (for example, one hour).
> > >
> > > This way, clients will not be blocked, and users may not even notice
> > > it.
> > >
> > >
> > > Best,
> > > -----------------------------------
> > > Zesong Sun
> > > School of Software, Tsinghua University
> > >
> > > 孙泽嵩
> > > 清华大学 软件学院
> > >
> > >> On Jun 18, 2020, at 16:19, 孙泽嵩 <[email protected]> wrote:
> > >>
> > >> Hi,
> > >>
> > >> Good points!
> > >>
> > >>> how about adding a "create snapshot for schema" sql to let users
> > >>> trigger this manually
> > >>
> > >> I’ll add this SQL statement in a new PR.
> > >>
> > >>> how long it takes to recover from a 1M timeseries snapshot.
> > >>
> > >> Based on my previous experiment, it takes about 6s, as you said.
> > >>
> > >>> how long it takes to create a snapshot for 1M/10M timeseries?
> > >>
> > >> I didn’t time this … I’ll do an experiment after fixing the suggested
> > >> changes in current PR [1]
> > >>
> > >>
> > >> [1] https://github.com/apache/incubator-iotdb/pull/1384
> > >>
> > >>
> > >> Best,
> > >> -----------------------------------
> > >> Zesong Sun
> > >> School of Software, Tsinghua University
> > >>
> > >> 孙泽嵩
> > >> 清华大学 软件学院
> > >>
> > >>> On Jun 18, 2020, at 14:39, Jialin Qiao <[email protected]> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> Currently, the snapshot is triggered every xxx lines in mlog.txt.
> > >>> With 20M timeseries, the default of 10k lines causes too many
> > >>> snapshots, which blocks timeseries creation.
> > >>> However, if we raise the threshold to 1M lines, the last 1M lines
> > >>> will take about 6s to recover (about 160K per second).
> > >>>
> > >>> So, my concern is how long it takes to create a snapshot for 1M/10M
> > >>> timeseries? And how long it takes to recover from a 1M timeseries snapshot.
> > >>>
> > >>> Besides, how about adding a "create snapshot for schema" sql to let
> > >>> users trigger this manually?
> > >>>
> > >>> Thanks,
> > >>> --
> > >>> Jialin Qiao
> > >>> School of Software, Tsinghua University
> > >>>
> > >>> 乔嘉林
> > >>> 清华大学 软件学院
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: "孙泽嵩" <[email protected]>
> > >>>> Sent: 2020-06-15 19:14:08 (Monday)
> > >>>> To: [email protected]
> > >>>> Cc:
> > >>>> Subject: Re: [IOTDB-726] CheckPoint of MTree
> > >>>>
> > >>>> Hi Julian,
> > >>>>
> > >>>> Currently I’m just using a plain text file.
> > >>>>
> > >>>> But I could consider trying RocksDB : )
> > >>>> I also noticed that there is an issue related to RocksDB
> > >>>> integration [1].
> > >>>>
> > >>>>
> > >>>> [1] https://issues.apache.org/jira/browse/IOTDB-767
> > >>>>
> > >>>>
> > >>>> Best,
> > >>>> -----------------------------------
> > >>>> Zesong Sun
> > >>>> School of Software, Tsinghua University
> > >>>>
> > >>>> 孙泽嵩
> > >>>> 清华大学 软件学院
> > >>>>
> > >>>>> On Jun 15, 2020, at 19:00, Julian Feinauer <[email protected]> wrote:
> > >>>>>
> > >>>>> Hi Zesong,
> > >>>>>
> > >>>>> This is an excellent idea!
> > >>>>> Do you serialize the snapshot as a plain text file?
> > >>>>> Or would it make sense to use something like RocksDB for that?
> > >>>>>
> > >>>>> Julian
> > >>>>>
> > >>>>> On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:
> > >>>>>
> > >>>>> Greetings,
> > >>>>>
> > >>>>> I’m currently working on issue [IOTDB-726] CheckPoint of MTree [1].
> > >>>>>
> > >>>>> When there is a large number of timeseries, restarting IoTDB takes
> > >>>>> a long time because mlog.txt is read and its commands are executed
> > >>>>> line by line.
> > >>>>> For example, it takes about 2 minutes to restart with 20M timeseries.
> > >>>>>
> > >>>>> To solve this problem, a “checkpoint” is added to MTree to reduce
> > >>>>> the time spent reading mlog when IoTDB restarts:
> > >>>>> A snapshot, which contains the serialized MTree, is generated
> > >>>>> every time mlog reaches a certain number of lines.
> > >>>>> When a new snapshot is generated, the old one is deleted. The
> > >>>>> snapshot file and mlog.txt are in the same directory.
> > >>>>>
> > >>>>> Users can configure the threshold number of mlog lines. By
> > >>>>> default, a snapshot is generated every 100k lines.
> > >>>>>
> > >>>>> I’ve already built a demo and confirmed that the method speeds up
> > >>>>> the restart process.
> > >>>>> For the part that reads mlog.txt and initializes the MTree, it
> > >>>>> reduces the time by 28.3% (16.6s with the original method, 11.9s
> > >>>>> with the new demo, both for 2M timeseries).
> > >>>>>
> > >>>>> I would like to make a PR afterwards. If you have any suggestions
> > >>>>> about the design, feel free to discuss them with me.
> > >>>>>
> > >>>>>
> > >>>>> [1] https://issues.apache.org/jira/browse/IOTDB-726
> > >>>>>
> > >>>>>
> > >>>>> Best,
> > >>>>> -----------------------------------
> > >>>>> Zesong Sun
> > >>>>> School of Software, Tsinghua University
> > >>>>>
> > >>>>> 孙泽嵩
> > >>>>> 清华大学 软件学院
> > >>>>>
> > >>>>>
> > >>>>
> > >>
> > >
> >
> >
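
For reference, the checkpoint scheme from the original proposal quoted above (snapshot every N mlog lines, the new snapshot replaces the old one) could be sketched in memory roughly as follows. This is only an illustration under assumptions: all names and the `create,<path>` / `delete,<path>` line format are invented, and the sketch assumes the snapshot records how many mlog lines it covers so that a restart replays only the lines written after it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory sketch: plain collections stand in for the MTree,
// mlog.txt and the snapshot file. Every name here is invented for illustration.
class CheckpointSketch {
  final List<String> mlog = new ArrayList<>();   // stands in for mlog.txt, one command per line
  final TreeSet<String> tree = new TreeSet<>();  // stands in for the MTree (set of timeseries paths)
  TreeSet<String> snapshot = new TreeSet<>();    // stands in for the snapshot file
  int snapshotLines = 0;                         // assumption: number of mlog lines the snapshot covers
  final int threshold;                           // snapshot every `threshold` new mlog lines

  CheckpointSketch(int threshold) {
    this.threshold = threshold;
  }

  // Applies one log line in the invented format "create,<path>" or "delete,<path>".
  static void apply(TreeSet<String> target, String line) {
    String[] parts = line.split(",", 2);
    if (parts[0].equals("create")) {
      target.add(parts[1]);
    } else if (parts[0].equals("delete")) {
      target.remove(parts[1]);
    }
  }

  // Normal operation: log the command, apply it, and checkpoint when the threshold is reached.
  void execute(String line) {
    mlog.add(line);
    apply(tree, line);
    if (mlog.size() - snapshotLines >= threshold) {
      snapshot = new TreeSet<>(tree);  // the new snapshot replaces the old one
      snapshotLines = mlog.size();
    }
  }

  // Restart: load the snapshot, then replay only the lines written after it.
  TreeSet<String> recover() {
    TreeSet<String> recovered = new TreeSet<>(snapshot);
    for (String line : mlog.subList(snapshotLines, mlog.size())) {
      apply(recovered, line);
    }
    return recovered;
  }

  public static void main(String[] args) {
    CheckpointSketch c = new CheckpointSketch(3);
    c.execute("create,root.sg.d1.s1");
    c.execute("create,root.sg.d1.s2");
    c.execute("delete,root.sg.d1.s1");  // third line triggers a snapshot
    c.execute("create,root.sg.d1.s3");  // only this line is replayed on restart
    System.out.println("recovered = " + c.recover());
    System.out.println("matches live tree = " + c.recover().equals(c.tree));
  }
}
```

In this sketch the replay cost on restart is bounded by the threshold, while a smaller threshold means more frequent snapshots, which is the trade-off discussed in the thread.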
