Hi Lingzhe,

I suggest you give up your email client...
Or, do not use any rich formatting on the mailing list.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院


445073309 <[email protected]> wrote on Mon, Jul 20, 2020 at 3:47 PM:

> Hi,
>
> I converted the figures to text:
>
> > * when we flush 5 (max_vm_num) times, the current level is compacted
> > to the next level
> 1 1 1 1 1
> | / / / /
> 5
>
> > * when we have flushed all 11 times, the compaction procedure is
> 1 1 1 1 1 1 1 1 1 1 1
> | / / / / | / / / /
> 5         5
>
> > * if we close the tsfile, the whole compaction procedure will be
> 1 1 1 1 1 1 1 1 1 1 1
> | / / / / | / / / / /
> 5         5         /
> |       /         /
> 11
>
> > Suppose the parameter is 5. Then in level 2, will you merge 4 new VM files
> > to the bigger one, or merge 5 VM files?
> I will merge 5 VM files into a bigger one in level 2.
>
> -----------------------------------
> Lingzhe Zhang
> School of Software, Tsinghua University
>
> 张凌哲
> 清华大学 软件学院
>
> ------------------ Original Message ------------------
> From: "dev" <[email protected]>;
> Sent: Monday, July 20, 2020, 3:39 PM
> To: "dev" <[email protected]>;
> Subject: Re: add vm(hot compaction) in tsfile processor
>
> Hi,
>
> Did you attach some figures? The mailing list does not allow figures..
>
> Suppose the parameter is 5. Then in level 2, will you merge 4 new VM files
> to the bigger one, or merge 5 VM files?
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> 445073309 <[email protected]> wrote on Mon, Jul 20, 2020 at 3:31 PM:
>
> > Hi,
> >
> > max_vm_num is the maximum number of vm files kept at each level of a
> > tsfile.
> > For example, if we set max_vm_num=5 and flush 11 times, the
> > compaction procedure can be described as below:
> > * when we flush 5 (max_vm_num) times, the current level is compacted
> > to the next level
> > * when we have flushed all 11 times, the compaction procedure is
> > * if we close the tsfile, the whole compaction procedure will be
> >
> >
> > The default in the current version is max_vm_num=5. If users do not know
> > which value is suitable, the default value is enough to make chunks
> > larger.
> >
> > Best,
> > -----------------------------------
> > Lingzhe Zhang
> > School of Software, Tsinghua University
> >
> > 张凌哲
> > 清华大学 软件学院
> >
> >
> > ------------------ Original Message ------------------
> > From: "dev" <[email protected]>;
> > Sent: Monday, July 20, 2020, 3:10 PM
> > To: "dev" <[email protected]>;
> > Subject: Re: add vm(hot compaction) in tsfile processor
> >
> > Hi Lingzhe,
> >
> > > max_vm_num: indicates that a TsFileProcessor has at most the number of
> > > virtual memory files
> >
> > What does this mean? And how do I know what value is suitable? (For
> > example, if I set it to 1, is there any impact?)
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > 445073309 <[email protected]> wrote on Mon, Jul 20, 2020 at 12:42 PM:
> >
> > > Hi,
> > >
> > > I met a problem: IoTDB writes small chunks when there are not enough
> > > memtables, which makes queries on hot data slower.
> > >
> > > So I created a new type of file, the vm file, and use it to do hot
> > > compaction in the flush processor. With this, we can flexibly control
> > > the size of each chunk.
> > > The configuration and usage changes are as follows:
> > > * add a new parameter enable_vm in iotdb-engine.properties: indicates
> > > whether to use virtual memory
> > > * use parameter avg_series_point_number_threshold in
> > > iotdb-engine.properties: indicates the minimum average number of data
> > > points per chunk after hot compaction
> > > * add a new parameter max_vm_num in iotdb-engine.properties: indicates
> > > the maximum number of virtual memory files a TsFileProcessor can have
> > > * add a new parameter max_merge_chunk_num_in_tsfile in
> > > iotdb-engine.properties: indicates the maximum number of times vm files
> > > are merged
> > > * the suffix of the vm file is '.vm', and the naming convention is
> > > {tsfile_name}-{level}-{timestamp}.vm
> > >
> > > There are also many detailed changes, such as:
> > > * keep a virtual memory file list List<List<TsFileResource>>
> > > vmTsFileResources for each TsFileProcessor, and add
> > > List<List<RestorableTsFileIOWriter>> vmWriters for management
> > > * in the recovery process, recovery of vm files is newly added, and
> > > the corresponding TsFileProcessor is injected after the recovery
> > >
> > > The compaction strategy is currently written like
> > > LeveledCompactionStrategy in Cassandra, and it can be optimized later.
> > >
> > > I put the detailed zh-doc in the attachment.
> > >
> > > Thanks,
> > > --
> > > Lingzhe Zhang
> > > School of Software, Tsinghua University
> > >
> > > 张凌哲
> > > 清华大学 软件学院
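
For readers following the thread, the level-by-level behaviour discussed here (max_vm_num=5, 11 flushes, then closing the tsfile) can be sketched as a small simulation. This is only an illustration and not the IoTDB implementation: the class VmCompactionSketch and its methods flush, close and snapshot are hypothetical names, each vm file is modeled only by the number of flushes it contains, and max_merge_chunk_num_in_tsfile is not modeled. The thread does not say how max_vm_num=1 behaves, so the sketch simply rejects values below 2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the leveled vm-file compaction described in the thread.
// Not IoTDB code: a vm file is represented only by how many flushes it holds.
public class VmCompactionSketch {
    private final int maxVmNum;
    // levels.get(i) lists the vm files currently sitting at level i.
    private final List<List<Integer>> levels = new ArrayList<>();

    public VmCompactionSketch(int maxVmNum) {
        // Assumption of this sketch only: a value below 2 would cascade forever here.
        if (maxVmNum < 2) {
            throw new IllegalArgumentException("this sketch needs max_vm_num >= 2");
        }
        this.maxVmNum = maxVmNum;
        levels.add(new ArrayList<>());
    }

    // One flush adds one small vm file to level 0, then compacts full levels.
    public void flush() {
        levels.get(0).add(1);
        int level = 0;
        while (level < levels.size() && levels.get(level).size() >= maxVmNum) {
            int merged = 0;
            for (int size : levels.get(level)) {
                merged += size;
            }
            levels.get(level).clear();
            if (level + 1 == levels.size()) {
                levels.add(new ArrayList<>());
            }
            // All max_vm_num files of this level become one file in the next level.
            levels.get(level + 1).add(merged);
            level++;
        }
    }

    // Closing the tsfile merges every remaining vm file into a single result.
    public int close() {
        int total = 0;
        for (List<Integer> files : levels) {
            for (int size : files) {
                total += size;
            }
            files.clear();
        }
        return total;
    }

    // Copy of the current per-level file sizes, for inspection.
    public List<List<Integer>> snapshot() {
        List<List<Integer>> copy = new ArrayList<>();
        for (List<Integer> files : levels) {
            copy.add(new ArrayList<>(files));
        }
        return copy;
    }

    public static void main(String[] args) {
        VmCompactionSketch sketch = new VmCompactionSketch(5);
        for (int i = 0; i < 11; i++) {
            sketch.flush();
        }
        System.out.println(sketch.snapshot()); // prints [[1], [5, 5]]
        System.out.println(sketch.close());    // prints 11
    }
}
```

With max_vm_num=5, the state after 11 flushes is one unmerged file at level 0 and two merged files at level 1, which matches the second text figure; closing then merges everything into a single file covering all 11 flushes, as in the third figure.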
