Re: add vm(hot compaction) in tsfile processor

Xiangdong Huang Mon, 20 Jul 2020 01:04:03 -0700

Hi Lingzhe,

I suggest you give up your email client, or at least stop using rich-text
formatting on the mailing list.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


445073309 <[email protected]> wrote on Mon, Jul 20, 2020 at 3:47 PM:

> Hi,
>
> I converted the figures to text diagrams:
>
> > * when we flush 5 (max_vm_num) times, the current level will do compaction
> > to the next level
> 1 1 1 1 1
> | / / / /
> 5
> > * when we flush all 11 times, the compaction procedure is
> 1 1 1 1 1   1 1 1 1 1   1
> | / / / /   | / / / /
> 5           5
> > * if we close the tsfile, the whole compaction procedure will be
> 1 1 1 1 1   1 1 1 1 1   1
> | / / / /   | / / / /   /
> 5           5          /
> |          /          /
> 11
>
>
> > Suppose the parameter is 5. Then in level 2, will you merge 4 new VM files
> > to the bigger one, or merge 5 VM files?
> I will merge 5 VM files into a bigger one in level 2.
>
>
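To make the diagrams above concrete, here is a minimal Python sketch of the same level-by-level merging. It is only an illustration under assumptions, not the IoTDB implementation: the names simulate_flushes and close_tsfile are hypothetical, each vm file is modelled as a plain integer (the number of flushes merged into it), and max_vm_num is assumed to be at least 2 because the thread does not say what a value of 1 would do.

```python
def simulate_flushes(num_flushes, max_vm_num):
    """Return the vm files on each level after num_flushes flushes.

    Each vm file is modelled as an int: the number of flushes it contains.
    levels[0] holds freshly flushed files, levels[1] the first merge level, etc.
    """
    if max_vm_num < 2:
        # Assumption of this sketch: the thread does not define max_vm_num=1.
        raise ValueError("this sketch assumes max_vm_num >= 2")

    levels = [[]]
    for _ in range(num_flushes):
        levels[0].append(1)
        level = 0
        # Cascade: once a level holds max_vm_num files, merge them into
        # a single file on the next level and check that level in turn.
        while len(levels[level]) == max_vm_num:
            merged = sum(levels[level])
            levels[level] = []
            if level + 1 == len(levels):
                levels.append([])
            levels[level + 1].append(merged)
            level += 1
    return levels


def close_tsfile(levels):
    """Merge every remaining vm file into one, as when the tsfile is closed."""
    return sum(sum(files) for files in levels)


if __name__ == "__main__":
    levels = simulate_flushes(11, 5)
    print(levels)                # [[1], [5, 5]]
    print(close_tsfile(levels))  # 11
```

With max_vm_num=5 and 11 flushes, the sketch leaves one unmerged file on the lowest level and two merged files of 5 on the next, and closing the tsfile merges all 11 flushes into one file, matching the three diagrams. Because a level is merged only when it holds max_vm_num files, level 2 also merges 5 files, as in the answer above.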
> -----------------------------------
> Lingzhe Zhang
> School of Software, Tsinghua University
>
> 张凌哲
> 清华大学 软件学院
> ------------------ Original Message ------------------
> From: "dev" <[email protected]>
> Sent: Monday, July 20, 2020, 3:39 PM
> To: "dev" <[email protected]>
>
> Subject: Re: add vm(hot compaction) in tsfile processor
>
>
>
> Hi,
>
> Did you attach some figures? The mailing list does not allow figures.
>
> Suppose the parameter is 5. Then in level 2, will you merge 4 new VM files
> to the bigger one, or merge 5 VM files?
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
>  黄向东
> 清华大学 软件学院
>
>
> 445073309 <[email protected]> wrote on Mon, Jul 20, 2020 at 3:31 PM:
>
> > Hi,
> >
> > max_vm_num is the maximum number of vm files at each level of a tsfile.
> > For example, if we set max_vm_num=5 and flush 11 times, the compaction
> > procedure can be described as below:
> > * when we flush 5 (max_vm_num) times, the current level will do compaction
> > to the next level
> > * when we flush all 11 times, the compaction procedure is
> > * if we close the tsfile, the whole compaction procedure will be
> >
> >
> > The default in the current version is max_vm_num=5. If users do not know
> > which value is suitable, the default is enough to make chunks larger.
> > Best,
> > -----------------------------------
> > Lingzhe Zhang
> > School of Software, Tsinghua University
> >
> > 张凌哲
> > 清华大学 软件学院
> >
> >
> > ------------------ Original Message ------------------
> > From: "dev" <[email protected]>
> > Sent: Monday, July 20, 2020, 3:10 PM
> > To: "dev" <[email protected]>
> > Subject: Re: add vm(hot compaction) in tsfile processor
> &gt;
> > Hi Lingzhe,
> >
> > > max_vm_num: indicates that a TsFileProcessor has at most the number of
> > > virtual memory files
> >
> > What does this mean? And how do I know what value is suitable? (For
> > example, if I set it to 1, is there any impact?)
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> >  黄向东
> > 清华大学 软件学院
> >
> >
> > 445073309 <[email protected]> wrote on Mon, Jul 20, 2020 at 12:42 PM:
> >
> > > Hi,
> > >
> > >
> > > I ran into a problem: IoTDB writes small chunks when the number of
> > > memtables is insufficient, which makes queries on hot data slower.
> > >
> > >
> > > So I created a new type of file, the vm file, and use it to do hot
> > > compaction in the flush processor. With this, we can flexibly control the
> > > size of each chunk. The configuration and usage changes are as follows:
> > > * add a new parameter enable_vm in iotdb-engine.properties: indicates
> > > whether to use virtual memory
> > > * use parameter avg_series_point_number_threshold in
> > > iotdb-engine.properties: indicates the minimum average number of chunk
> > > data points after hot compaction
> > > * add a new parameter max_vm_num in iotdb-engine.properties: indicates
> > > that a TsFileProcessor has at most the number of virtual memory files
> > > * add a new parameter max_merge_chunk_num_in_tsfile in
> > > iotdb-engine.properties: indicates the maximum number of times vm files
> > > are merged
> > > * the suffix of the vm file is '.vm', and the naming convention is
> > > {tsfile_name}-{level}-{timestamp}.vm
> > >
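As an illustration of the naming convention {tsfile_name}-{level}-{timestamp}.vm listed above, here is a small Python sketch that builds and parses such names. It is a sketch under assumptions rather than IoTDB code: the helper names vm_file_name and parse_vm_file_name are hypothetical, and the level and timestamp are assumed to be plain integers, which the thread does not specify.

```python
VM_SUFFIX = ".vm"


def vm_file_name(tsfile_name, level, timestamp):
    """Build a vm file name following {tsfile_name}-{level}-{timestamp}.vm."""
    return f"{tsfile_name}-{level}-{timestamp}{VM_SUFFIX}"


def parse_vm_file_name(name):
    """Split a vm file name back into (tsfile_name, level, timestamp).

    The split is done from the right, so a tsfile_name that itself contains
    '-' is preserved. Assumes level and timestamp are integers.
    """
    if not name.endswith(VM_SUFFIX):
        raise ValueError(f"not a vm file: {name!r}")
    stem = name[: -len(VM_SUFFIX)]
    # Raises ValueError if the name has fewer than three '-'-separated parts.
    tsfile_name, level, timestamp = stem.rsplit("-", 2)
    return tsfile_name, int(level), int(timestamp)
```

For example, parse_vm_file_name(vm_file_name("a-b.tsfile", 1, 1595230000000)) gives back ("a-b.tsfile", 1, 1595230000000); the tsfile name used here is made up for the example.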
> > > There are also many detailed changes, such as:
> > > * add a virtual memory file list List<List<TsFileResource>>
> > > vmTsFileResources to each TsFileProcessor, and add
> > > List<List<RestorableTsFileIOWriter>> vmWriters for management
> > > * in the recovery process, recovery of vm files is newly added, and
> > > the corresponding TsFileProcessor is injected after the recovery
> > >
> > > The compaction strategy is currently written like
> > > LeveledCompactionStrategy in Cassandra, and it can be optimized later.
> > >
> > > I put the detailed zh-doc in the attachment.
> > >
> > > Thanks,
> > > --
> > > Lingzhe Zhang
> > > School of Software, Tsinghua University
> > >
> > > 张凌哲
> > > 清华大学 软件学院
> > >
> >
> >
