Re: Re: Re: Avoid long-tail insertion

Jialin Qiao Mon, 24 Jun 2019 05:25:21 -0700

Yes, there are many changes. The branch I am working on is 
feature_async_close_tsfile. 
Anyone interested is welcome to join and discuss.


Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Xiangdong Huang" <[email protected]>
> Sent: 2019-06-23 10:59:29 (Sunday)
> To: [email protected]
> Cc:
> Subject: Re: Re: Avoid long-tail insertion
> 
> Hi,
> 
> Once your work branch is almost ready, let me know so I can help review it.
> I think it will be a HUGE PR...
> 
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
> 
>  黄向东
> 清华大学 软件学院
> 
> 
> Jialin Qiao <[email protected]> wrote on Sat, Jun 22, 2019 at 9:57 PM:
> 
> > Hi Xiangdong,
> >
> > I will merge this patch. Letting "Directories" manage the folders of both
> > sequence and unSequence files is a good idea.
> >
> > However, the name "Directories" is not clear. It would be better to
> > rename it to "DirectoryManager".
> >
> > Best,
> > --
> > Jialin Qiao
> > School of Software, Tsinghua University
> >
> > 乔嘉林
> > 清华大学 软件学院
> >
> > > -----Original Message-----
> > > From: "Xiangdong Huang" <[email protected]>
> > > Sent: 2019-06-22 16:35:29 (Saturday)
> > > To: [email protected]
> > > Cc:
> > > Subject: Re: Avoid long-tail insertion
> > >
> > > Hi jialin,
> > >
> > > I have submitted some modifications that:
> > >
> > > * add the overflow data folder location setting to
> > > iotdb-engine.properties;
> > > * let Directories.java manage the above folder.
> > >
> > > If you need to refactor the overflow while solving the long-tail issue,
> > > you can apply the patch from [1] first to simplify your work.
> > >
> > > [1] https://issues.apache.org/jira/secure/attachment/12972547/overflow-folder.patch
> > >
> > > Best,
> > > -----------------------------------
> > > Xiangdong Huang
> > > School of Software, Tsinghua University
> > >
> > >  黄向东
> > > 清华大学 软件学院
> > >
> > >
> > > Xiangdong Huang <[email protected]> wrote on Sat, Jun 22, 2019 at 3:19 PM:
> > >
> > > > If you change the process like this, i.e., there is more than one
> > > > unsealed TsFile for each storage group, then you have to modify the WAL
> > > > module, because the current WAL module only recognizes the last unsealed
> > > > TsFile.
> > > >
> > > > By the way, I think "sealed" is better than "closed". A sealed file
> > > > is a file that has the magic string at both the head and the tail.
> > > >
> > > > Best,
> > > > -----------------------------------
> > > > Xiangdong Huang
> > > > School of Software, Tsinghua University
> > > >
> > > >  黄向东
> > > > 清华大学 软件学院
> > > >
> > > >
> > > > Jialin Qiao <[email protected]> wrote on Sat, Jun 22, 2019 at 2:54 PM:
> > > >
> > > >>
> > > > >> Hi, I am working on the long-tail latency problem.
> > > > >>
> > > > >> There are some cases (blocking points) that block insertion. For a
> > > > >> better understanding of this problem, I will first introduce the write
> > > > >> process of IoTDB:
> > > > >>
> > > > >> IoTDB maintains several independent engines (storage groups) that
> > > > >> support reads and writes. In the following, we focus on one engine. An
> > > > >> engine maintains several closed data files and one unclosed data file
> > > > >> that receives appended data. In memory, there is only one working
> > > > >> memtable (m1) that receives writes. There is also another memtable (m2)
> > > > >> that takes the place of m1 when m1 is full and being flushed.
> > > >>
> > > >> When a data item is inserted:
> > > >>
> > > > >> (1) We insert it into the working memtable.
> > > > >> (2) We check the size of the memtable. If it reaches a threshold, we
> > > > >> submit a flush task "after the previous flush task is finished" and
> > > > >> switch the two memtables.
> > > > >> (3) We check the size of the unclosed file. If it reaches a threshold,
> > > > >> we close it "after the previous flush task is finished".
> > > >>
> > > > >> In the above steps, every "after the previous flush task is finished"
> > > > >> blocks the insertion process. One solution is to make all flush and
> > > > >> close tasks asynchronous. Some questions need to be carefully considered:
> > > > >>
> > > > >> (1) Many memtables may be flushed concurrently to an unclosed file. How
> > > > >> do we guarantee the order of serialization?
> > > > >> (2) Once a close task is submitted, a new unclosed file will be created
> > > > >> and will receive appended data. So there will exist many unclosed files.
> > > > >> How will the query and compaction processes be impacted?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Jialin Qiao
> > > >> School of Software, Tsinghua University
> > > >>
> > > >> 乔嘉林
> > > >> 清华大学 软件学院
> > > >>
> > > > >> > -----Original Message-----
> > > > >> > From: "Xiangdong Huang" <[email protected]>
> > > > >> > Sent: 2019-06-04 23:08:34 (Tuesday)
> > > > >> > To: [email protected], "江天" <[email protected]>
> > > > >> > Cc:
> > > > >> > Subject: Re: [jira] [Created] (IOTDB-112) Avoid long tail insertion which is caused by synchronized close-bufferwrite
> > > >> >
> > > > >> > I attached the histogram of the latency to the JIRA issue.
> > > > >> >
> > > > >> > The x-axis is the latency and the y-axis is the cumulative
> > > > >> > distribution. We can see that about 30% of insertions finish within
> > > > >> > 20 ms, and 60% finish within 40 ms, even though the IoTDB instance is
> > > > >> > serving a heavy workload. So, eliminating the long-tail insertions
> > > > >> > can make the average latency far better.
> > > > >> >
> > > > >> > If someone is working on refactor_overflow or refactor_bufferwrite,
> > > > >> > please pay attention to the code branch for this issue.
> > > >> >
> > > >> > Best,
> > > >> >
> > > >> > -----------------------------------
> > > >> > Xiangdong Huang
> > > >> > School of Software, Tsinghua University
> > > >> >
> > > >> >  黄向东
> > > >> > 清华大学 软件学院
> > > >> >
> > > >> >
> > > > >> > xiangdong Huang (JIRA) <[email protected]> wrote on Tue, Jun 4, 2019 at 11:00 PM:
> > > >> >
> > > >> > > xiangdong Huang created IOTDB-112:
> > > >> > > -------------------------------------
> > > >> > >
> > > >> > >              Summary: Avoid long tail insertion which is caused by
> > > >> > > synchronized close-bufferwrite
> > > >> > >                  Key: IOTDB-112
> > > > >> > >                  URL: https://issues.apache.org/jira/browse/IOTDB-112
> > > >> > >              Project: Apache IoTDB
> > > >> > >           Issue Type: Improvement
> > > >> > >             Reporter: xiangdong Huang
> > > >> > >
> > > >> > >
> > > > >> > > In our test, IoTDB has good insertion performance, and the average
> > > > >> > > latency can be ~200 ms for a given workload and hardware.
> > > > >> > >
> > > > >> > > However, when we draw the histogram of the latency, we find that
> > > > >> > > 97.5% of latencies are less than 200 ms, while 2.7% are greater. The
> > > > >> > > result shows that there is some long-tail latency.
> > > > >> > >
> > > > >> > > Then we find that some insertion latencies are about 30 seconds
> > > > >> > > (but the ratio is less than 0.5%). Indeed, for each connection, a
> > > > >> > > long-tail insertion appears every 1 or 2 minutes.
> > > >> > >
> > > > >> > > From reading the source code, I think the cause is in the insertion
> > > > >> > > function,
> > > > >> > >
> > > > >> > > `private void insertBufferWrite(FileNodeProcessor fileNodeProcessor,
> > > > >> > >  long timestamp, boolean isMonitor, TSRecord tsRecord, String deviceId)`:
> > > > >> > >
> > > > >> > > if the corresponding TsFile is too large, the function is blocked
> > > > >> > > until the memtable is flushed to disk and the TsFile is sealed (we
> > > > >> > > call this closing a TsFile). The latencies of the long-tail
> > > > >> > > insertions are very close to the time cost of flushing and sealing a
> > > > >> > > TsFile.
> > > >> > >
> > > > >> > > So, if we run the closing function in async mode, we can avoid the
> > > > >> > > long-tail insertions.
> > > > >> > >
> > > > >> > > However, there are some side effects we have to fix:
> > > > >> > >  # If a new insertion arrives while the close is in progress, a new
> > > > >> > > memtable should be assigned and a new unsealed TsFile created;
> > > > >> > >  # That means there can be more than one unsealed TsFile if the
> > > > >> > > system crashes before the closing function finishes. So, we have to
> > > > >> > > modify the startup process to recover these files.
> > > >> > >
> > > >> > > Is there any other side effect that I have to pay attention to?
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > --
> > > >> > > This message was sent by Atlassian JIRA
> > > >> > > (v7.6.3#76005)
> > > >> > >
> > > >>
> > > >
> >
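
The asynchronous flush idea in the quoted message of 22 June (switch the full memtable out, flush it in the background, keep inserting) can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the feature_async_close_tsfile branch: the class AsyncFlushSketch, its methods, and the in-memory list standing in for the unsealed file are all hypothetical. It answers question (1) in one possible way, by sending every flush through a single-threaded executor so that memtables are serialized in the order they were switched out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names are IoTDB classes. */
class AsyncFlushSketch {
    private final int memtableThreshold;
    // A single flush thread: queued tasks run one at a time, in submission order.
    private final ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
    // Stand-in for the unsealed file (assumption: a real engine writes to disk here).
    private final List<String> unsealedFile = new ArrayList<>();
    private List<String> workingMemtable = new ArrayList<>();

    AsyncFlushSketch(int memtableThreshold) {
        this.memtableThreshold = memtableThreshold;
    }

    /** Inserts a record; never waits for a flush to finish. */
    synchronized void insert(String record) {
        workingMemtable.add(record);
        if (workingMemtable.size() >= memtableThreshold) {
            switchAndSubmitFlush();
        }
    }

    // Caller must hold the lock on this object.
    private void switchAndSubmitFlush() {
        final List<String> full = workingMemtable;
        workingMemtable = new ArrayList<>(); // new working memtable takes over at once
        flushExecutor.execute(() -> {
            synchronized (unsealedFile) {
                unsealedFile.addAll(full); // "flush": append the memtable to the file
            }
        });
    }

    /** Flushes what is left, waits for pending flushes, returns the file contents. */
    List<String> closeAndGetFileContents() {
        synchronized (this) {
            if (!workingMemtable.isEmpty()) {
                switchAndSubmitFlush();
            }
        }
        flushExecutor.shutdown();
        try {
            if (!flushExecutor.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("flush tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for flush", e);
        }
        synchronized (unsealedFile) {
            return new ArrayList<>(unsealedFile);
        }
    }

    public static void main(String[] args) {
        AsyncFlushSketch engine = new AsyncFlushSketch(3);
        for (int i = 0; i < 7; i++) {
            engine.insert("record-" + i);
        }
        System.out.println(engine.closeAndGetFileContents());
    }
}
```

The single flush thread trades flush parallelism for a simple ordering guarantee. The sketch does not cover the other points raised in the thread: multiple unsealed files after an asynchronous close, the WAL changes, or recovery at startup. Inserting after closeAndGetFileContents() has been called is also not handled.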
