Yes, we could then also use all cores for deserialization (if that's the 
bottleneck) when reloading all of them.
Or, more generally, store only some K entries per file and open the files in 
parallel, so we could again take advantage of parallelism.

J

On 19.06.20, 11:35, "Xiangdong Huang" <[email protected]> wrote:

    > Another thing we could consider is to chunk them according to their
    namespaces in folders / files or any other struct.

    according to the Storage group names, for example.
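A minimal sketch of that partitioning: one snapshot file per storage group, so lazy loading only has to open the file for the group that is actually queried. Treating the first two path levels as the storage group is a simplification for illustration; the real lookup would use IoTDB's own metadata, and the file-naming scheme here is made up.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: partition timeseries by storage group name so each group gets its
// own snapshot file, and only the needed file is opened later (lazy loading).
public class StorageGroupPartitioner {

    // Simplification: assume the storage group is the first two path levels.
    static String storageGroupOf(String seriesPath) {
        String[] levels = seriesPath.split("\\.");
        return levels[0] + "." + levels[1];           // e.g. "root.sg1"
    }

    // Group all series paths by their storage group.
    static Map<String, List<String>> partition(List<String> seriesPaths) {
        return seriesPaths.stream()
                .collect(Collectors.groupingBy(StorageGroupPartitioner::storageGroupOf));
    }

    // Hypothetical naming scheme: one snapshot file per storage group.
    static String snapshotFileFor(String storageGroup) {
        return "snapshot-" + storageGroup + ".bin";
    }
}
```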

    -----------------------------------
    Xiangdong Huang
    School of Software, Tsinghua University

     黄向东
    清华大学 软件学院


    Julian Feinauer <[email protected]> 于2020年6月19日周五 下午4:54写道:

    > Another thing we could consider is to chunk them according to their
    > namespaces in folders / files or any other struct. Then we could
    > efficiently do lazy loading and only pick what we really need.
    >
    > WDYT?
    >
    > On 19.06.20, 10:36, "Xiangdong Huang" <[email protected]> wrote:
    >
    >     > I did an experiment for 1M timeseries, and the serialization process
    >     costs 971ms.
    >
    >     971ms for serializing 1M timeseries, but 6 seconds for deserializing?
    >
    >     > I didn’t time this … I’ll do an experiment after fixing the
    >     > suggested changes in current PR [1]
    >
    >     The problem with the current PR is that the snapshot grows larger and
    >     larger as the system keeps running.
    >     Any idea about this case?
    >
    >     Best,
    >     -----------------------------------
    >     Xiangdong Huang
    >     School of Software, Tsinghua University
    >
    >      黄向东
    >     清华大学 软件学院
    >
    >
    >     孙泽嵩 <[email protected]> wrote on Fri, Jun 19, 2020 at 2:20 PM:
    >
    >     > Wow, thanks, Julian!
    >     >
    >     > Let me try and do experiments to get the best result : )
    >     >
    >     > Best,
    >     > -----------------------------------
    >     > Zesong Sun
    >     > School of Software, Tsinghua University
    >     >
    >     > 孙泽嵩
    >     > 清华大学 软件学院
    >     >
    >     > > On Jun 19, 2020, at 14:14, Julian Feinauer <[email protected]> wrote:
    >     > >
    >     > > Oh, and another note: by using a faster serialization library than
    >     > > the Java default, we could ideally speed the process up by as much as 10x.
    >     > >
    >     > > See e.g. https://github.com/RuedigerMoeller/fast-serialization
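One way to keep that swap cheap is to hide the serializer behind a small interface. The sketch below is hypothetical (not code from the PR) and shows only the Java-default baseline, with a comment marking where fast-serialization could plug in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: a pluggable serializer interface so the Java default
// can later be swapped for a faster library without touching callers.
public class PluggableSerializer {

    interface Serializer {
        byte[] toBytes(Serializable obj) throws IOException;
        Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException;
    }

    // Baseline: plain java.io object serialization.
    static class JavaDefaultSerializer implements Serializer {
        public byte[] toBytes(Serializable obj) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(obj);
            }
            return buf.toByteArray();
        }

        public Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return in.readObject();
            }
        }
    }

    // A fast-serialization-backed implementation would delegate to
    // FSTConfiguration.asByteArray(obj) / asObject(bytes) here instead.
}
```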
    >     > >
    >     > > Julian
    >     > >
    >     > > Get Outlook for Android<https://aka.ms/ghei36>
    >     > >
    >     > > ________________________________
    >     > > From: Julian Feinauer <[email protected]>
    >     > > Sent: Friday, June 19, 2020 8:11:56 AM
    >     > > To: [email protected] <[email protected]>
    >     > > Subject: Re: [IOTDB-726] CheckPoint of MTree
    >     > >
    >     > > What about using some kind of cache that spills to disk? That way
    >     > > we would be up in no time and just lazy-load devices when needed.
    >     > >
    >     > > I remember that Ehcache has such features
    >     > > (https://www.baeldung.com/ehcache), but there are other
    >     > > implementations as well.
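The spill-to-disk idea can be sketched without any library. This toy cache (not Ehcache's actual API) bounds the in-memory map and evicts the least-recently-used entry to a file; keys double as file names purely for simplicity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch of a spill-to-disk cache: a bounded in-memory LRU map that
// writes evicted entries to files and reads them back lazily on a miss.
public class SpillToDiskCache {
    private final int maxInMemory;
    private final Path spillDir;
    private final LinkedHashMap<String, String> memory;

    SpillToDiskCache(int maxInMemory, Path spillDir) {
        this.maxInMemory = maxInMemory;
        this.spillDir = spillDir;
        // accessOrder = true makes this an LRU map.
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > SpillToDiskCache.this.maxInMemory) {
                    spill(eldest.getKey(), eldest.getValue());
                    return true;                  // evict from memory after spilling
                }
                return false;
            }
        };
    }

    private void spill(String key, String value) {
        try {
            Files.writeString(spillDir.resolve(key), value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void put(String key, String value) { memory.put(key, value); }

    // Check memory first, then fall back to the spilled file on disk.
    String get(String key) {
        String v = memory.get(key);
        if (v != null) return v;
        try {
            Path onDisk = spillDir.resolve(key);
            return Files.exists(onDisk) ? Files.readString(onDisk) : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```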
    >     > >
    >     > > Julian
    >     > >
    >     > >
    >     > > ________________________________
    >     > > From: 孙泽嵩 <[email protected]>
    >     > > Sent: Friday, June 19, 2020 7:57:51 AM
    >     > > To: [email protected] <[email protected]>
    >     > > Subject: Re: [IOTDB-726] CheckPoint of MTree
    >     > >
    >     > > Hi Jialin,
    >     > >
    >     > > I did an experiment for 1M timeseries, and the serialization
    > process
    >     > costs 971ms.
    >     > >
    >     > > Maybe we could consider creating a snapshot when the MTree is not
    >     > changed for a long time (for example, one hour).
    >     > >
    >     > > In this way, the client will not be stuck and users may not even
    >     > > notice it.
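A sketch of that idle-triggered snapshot using only java.util.concurrent; the class and method names are made up for illustration.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the "snapshot only when the MTree has been idle for a while"
// idea: record the last modification time, and let a periodic task take a
// snapshot once the idle window (one hour in the proposal) has elapsed.
public class IdleSnapshotTrigger {
    private final AtomicLong lastModified = new AtomicLong(System.nanoTime());
    private final long idleWindowNanos;           // e.g. one hour
    private final Runnable snapshot;

    IdleSnapshotTrigger(long idleWindowNanos, Runnable snapshot) {
        this.idleWindowNanos = idleWindowNanos;
        this.snapshot = snapshot;
    }

    // Call on every MTree change (e.g. create timeseries).
    void onMTreeChanged() {
        lastModified.set(System.nanoTime());
    }

    // Snapshot only if the tree has been idle for the whole window.
    boolean maybeSnapshot() {
        if (System.nanoTime() - lastModified.get() >= idleWindowNanos) {
            snapshot.run();
            return true;
        }
        return false;
    }

    // Check periodically, e.g. once a minute in a real system.
    ScheduledFuture<?> start(ScheduledExecutorService pool, long periodMillis) {
        return pool.scheduleAtFixedRate(this::maybeSnapshot, periodMillis,
                periodMillis, TimeUnit.MILLISECONDS);
    }
}
```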
    >     > >
    >     > >
    >     > > Best,
    >     > > -----------------------------------
    >     > > Zesong Sun
    >     > > School of Software, Tsinghua University
    >     > >
    >     > > 孙泽嵩
    >     > > 清华大学 软件学院
    >     > >
    >     > >> On Jun 18, 2020, at 16:19, 孙泽嵩 <[email protected]> wrote:
    >     > >>
    >     > >> Hi,
    >     > >>
    >     > >> Good opinions!
    >     > >>
    >     > >>> how about adding a "create snapshot for schema" sql to let users
    >     > trigger this manually
    >     > >>
    >     > >> I’ll add this sql in a new PR.
    >     > >>
    >     > >>> how long it takes to recover from a 1M timeseries snapshot.
    >     > >>
    >     > >> Based on my previous experiment, it takes about 6s as you said.
    >     > >>
    >     > >>> how long it takes to create a snapshot for 1M/10M timeseries?
    >     > >>
    >     > >> I didn’t time this … I’ll do an experiment after fixing the
    > suggested
    >     > changes in current PR [1]
    >     > >>
    >     > >>
    >     > >> [1] https://github.com/apache/incubator-iotdb/pull/1384
    >     > >>
    >     > >>
    >     > >> Best,
    >     > >> -----------------------------------
    >     > >> Zesong Sun
    >     > >> School of Software, Tsinghua University
    >     > >>
    >     > >> 孙泽嵩
    >     > >> 清华大学 软件学院
    >     > >>
    >     > >>> On Jun 18, 2020, at 14:39, Jialin Qiao <[email protected]> wrote:
    >     > >>>
    >     > >>> Hi,
    >     > >>>
    >     > >>> Currently, the snapshot is triggered every xxx lines in mlog.txt.
    >     > >>> With 20M timeseries, the default 10k lines will cause too many
    >     > >>> snapshots, which will block timeseries creation.
    >     > >>> However, if we enlarge the threshold to 1M, the last 1M will take
    >     > >>> about 6s to recover, i.e. roughly 160K timeseries per second.
    >     > >>>
    >     > >>> So, my concern is how long it takes to create a snapshot for
    > 1M/10M
    >     > timeseries? And how long it takes to recover from a 1M timeseries
    > snapshot.
    >     > >>>
    >     > >>> Besides, how about adding a "create snapshot for schema" sql to
    > let
    >     > users trigger this manually?
    >     > >>>
    >     > >>> Thanks,
    >     > >>> --
    >     > >>> Jialin Qiao
    >     > >>> School of Software, Tsinghua University
    >     > >>>
    >     > >>> 乔嘉林
    >     > >>> 清华大学 软件学院
    >     > >>>
    >     > >>>> -----Original Message-----
    >     > >>>> From: "孙泽嵩" <[email protected]>
    >     > >>>> Sent: 2020-06-15 19:14:08 (Monday)
    >     > >>>> To: [email protected]
    >     > >>>> Cc:
    >     > >>>> Subject: Re: [IOTDB-726] CheckPoint of MTree
    >     > >>>>
    >     > >>>> Hi Julian,
    >     > >>>>
    >     > >>>> Currently I’m just using a plain text file.
    >     > >>>>
    >     > >>>> But I could consider and try with RocksDB : )
    >     > >>>> I also noticed that there is an issue related to RocksDB
    > integration
    >     > [1].
    >     > >>>>
    >     > >>>>
    >     > >>>> [1] https://issues.apache.org/jira/browse/IOTDB-767
    >     > >>>>
    >     > >>>>
    >     > >>>> Best,
    >     > >>>> -----------------------------------
    >     > >>>> Zesong Sun
    >     > >>>> School of Software, Tsinghua University
    >     > >>>>
    >     > >>>> 孙泽嵩
    >     > >>>> 清华大学 软件学院
    >     > >>>>
    >     > >>>>> On Jun 15, 2020, at 19:00, Julian Feinauer <[email protected]> wrote:
    >     > >>>>>
    >     > >>>>> Hi Zesong,
    >     > >>>>>
    >     > >>>>> this is an excellent idea!
    >     > >>>>> Do you serialize the snapshot as plain text file?
    >     > >>>>> Or would it make sense to use something like RocksDB for
    > something
    >     > like that?
    >     > >>>>>
    >     > >>>>> Julian
    >     > >>>>>
    >     > >>>>> On 15.06.20, 12:12, "孙泽嵩" <[email protected]> wrote:
    >     > >>>>>
    >     > >>>>> Greetings,
    >     > >>>>>
    >     > >>>>> I’m currently working on issue [IOTDB-726] CheckPoint of MTree
    > [1]
    >     > >>>>>
    >     > >>>>> When a large number of timeseries exist, restarting IoTDB takes
    >     > >>>>> a long time, since mlog.txt is read and its commands executed
    >     > >>>>> line by line.
    >     > >>>>> For example, it takes about 2 minutes to restart with 20M
    >     > >>>>> timeseries.
    >     > >>>>>
    >     > >>>>> To solve this problem, a “checkpoint” is designed and added to
    >     > >>>>> the MTree to reduce the time spent reading mlog when IoTDB
    >     > >>>>> restarts:
    >     > >>>>> Generate a snapshot, which includes the serialization of the
    >     > >>>>> MTree, every time mlog reaches a certain number of lines.
    >     > >>>>> When a new snapshot is generated, the old one is deleted. The
    >     > >>>>> snapshot file and mlog.txt are in the same directory.
    >     > >>>>>
    >     > >>>>> Users can configure the threshold number of mlog lines. By
    >     > >>>>> default, a snapshot is generated every 100k lines.
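The trigger described above could look roughly like this (illustrative names, not the PR's actual code); writing to a temp file and atomically renaming it over the old snapshot keeps exactly one valid snapshot on disk at all times.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the mlog line-count trigger: after every `threshold` lines, write
// the new snapshot to a temp file and atomically replace the old one, so a
// single valid snapshot file sits next to mlog.txt at any time.
public class LineCountSnapshotTrigger {
    private final Path snapshotFile;
    private final long threshold;                 // e.g. 100_000 by default
    private long linesSinceSnapshot = 0;

    LineCountSnapshotTrigger(Path dir, long threshold) {
        this.snapshotFile = dir.resolve("mtree.snapshot");
        this.threshold = threshold;
    }

    // Call once per mlog line; serializedMTree stands in for the real
    // snapshot bytes. Returns true when a snapshot was written.
    boolean onMlogLine(byte[] serializedMTree) throws IOException {
        if (++linesSinceSnapshot < threshold) {
            return false;
        }
        Path tmp = snapshotFile.resolveSibling("mtree.snapshot.tmp");
        Files.write(tmp, serializedMTree);
        // Atomic rename: readers never observe a half-written snapshot.
        Files.move(tmp, snapshotFile, StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
        linesSinceSnapshot = 0;
        return true;
    }
}
```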
    >     > >>>>>
    >     > >>>>> I’ve already made a demo and verified that the method speeds up
    >     > >>>>> the restart process.
    >     > >>>>> For the part that reads mlog.txt and initializes the MTree, it
    >     > >>>>> reduces time by 28.3% (16.6s with the original method, 11.9s with
    >     > >>>>> the new demo, both for 2M timeseries).
    >     > >>>>>
    >     > >>>>> I would like to make a PR afterwards. If you have any
    >     > >>>>> suggestions about the design, feel free to discuss them with me.
    >     > >>>>>
    >     > >>>>>
    >     > >>>>> [1] https://issues.apache.org/jira/browse/IOTDB-726
    >     > >>>>>
    >     > >>>>>
    >     > >>>>> Best,
    >     > >>>>> -----------------------------------
    >     > >>>>> Zesong Sun
    >     > >>>>> School of Software, Tsinghua University
    >     > >>>>>
    >     > >>>>> 孙泽嵩
    >     > >>>>> 清华大学 软件学院
    >     > >>>>>
    >     > >>>>>
    >     > >>>>
    >     > >>
    >     > >
    >     >
    >     >
    >
    >
