Hi,

Did you attach some figures? The mailing list does not allow figures.
Suppose the parameter is 5. Then at level 2, will you merge 4 new VM files into the bigger one, or merge 5 VM files?

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

445073309 <sure...@foxmail.com> wrote on Mon, Jul 20, 2020 at 3:31 PM:

> Hi,
>
> max_vm_num means the maximum number of vm files at one level of a tsfile.
> For example, if we set max_vm_num=5 and flush 11 times, the compaction
> procedure can be described as below:
> * when we have flushed 5 (max_vm_num) times, the current level is compacted
> into the next level
> * when we flush all 11 times, the compaction procedure is
> * if we close the tsfile, the whole compaction procedure will be
>
>
> The default in the current version is max_vm_num=5. If users do not know
> which value is suitable, the default value is enough to make chunks larger.
> Best,
> -----------------------------------
> Lingzhe Zhang
> School of Software, Tsinghua University
>
> 张凌哲
> 清华大学 软件学院
>
>
> ------------------ Original Message ------------------
> *From:* "dev" <saint...@gmail.com>;
> *Sent:* Monday, July 20, 2020, 3:10 PM
> *To:* "dev" <dev@iotdb.apache.org>;
> *Subject:* Re: add vm(hot compaction) in tsfile processor
>
> Hi Lingzhe,
>
> > max_vm_num: indicates that a TsFileProcessor has at most the number of
> > virtual memory files
>
> What does this mean? And how do I know what value is suitable? (For
> example, if I set it to 1, is there any impact?)
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> 445073309 <sure...@foxmail.com> wrote on Mon, Jul 20, 2020 at 12:42 PM:
>
> > Hi,
> >
> >
> > I ran into a problem: IoTDB writes small chunks when there are not
> > enough memtables, which makes the system query hot data more slowly.
> >
> >
> > So I created a new type of file, the vm file, and use it to do hot
> > compaction in the flush processor. With this, we can flexibly control
> > the size of each chunk.
> > The configuration and usage changes are as follows:
> > * add a new parameter enable_vm in iotdb-engine.properties: indicates
> > whether to use virtual memory
> > * use parameter avg_series_point_number_threshold in
> > iotdb-engine.properties: indicates the minimum average number of data
> > points per chunk after hot compaction
> > * add a new parameter max_vm_num in iotdb-engine.properties: indicates
> > the maximum number of virtual memory files a TsFileProcessor can have
> > * add a new parameter max_merge_chunk_num_in_tsfile in
> > iotdb-engine.properties: indicates the maximum number of times vm files
> > are merged
> > * the suffix of the vm file is '.vm', and the naming convention is
> > {tsfile_name}-{level}-{timestamp}.vm
> >
> > There are also many detailed changes, such as:
> > * each TsFileProcessor gets a virtual memory file list
> > List<List<TsFileResource>> vmTsFileResources, plus
> > List<List<RestorableTsFileIOWriter>> vmWriters for management
> > * in the recover process, recovery of the vm file is newly added, and
> > the corresponding TsFileProcessor is injected after the recovery
> >
> > The compaction strategy is currently written like
> > LeveledCompactionStrategy in Cassandra, and it can be optimized later.
> >
> > I put the detailed zh-doc in the attachment.
> >
> > Thanks,
> > --
> > Lingzhe Zhang
> > School of Software, Tsinghua University
> >
> > 张凌哲
> > 清华大学 软件学院
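
Since the figures did not survive the mailing list, here is a rough sketch of how the max_vm_num=5, 11-flush example might play out. It is only a simulation under one assumed reading of the thread: once a level holds max_vm_num files, all of them are merged into a single new file on the next level. The thread does not confirm whether that is what the patch does (that is exactly the "4 or 5 files" question above). The function names simulate_flushes and vm_file_name are made up for illustration and are not part of IoTDB.

```python
from typing import List


def vm_file_name(tsfile_name: str, level: int, timestamp: int) -> str:
    """Build a name following the {tsfile_name}-{level}-{timestamp}.vm convention."""
    return f"{tsfile_name}-{level}-{timestamp}.vm"


def simulate_flushes(flush_count: int, max_vm_num: int) -> List[int]:
    """Return the number of vm files on each level after flush_count flushes.

    Assumption (not confirmed in the thread): when a level reaches
    max_vm_num files, all of them are merged into one file on the next level.
    """
    if max_vm_num < 2:
        # With a threshold of 1 this simplified model would cascade forever,
        # so the sketch rejects it; the real behaviour is not stated.
        raise ValueError("this sketch needs max_vm_num >= 2")

    levels: List[int] = [0]  # levels[i] = number of vm files on level i
    for _ in range(flush_count):
        levels[0] += 1  # each flush adds one vm file on level 0
        level = 0
        # Cascade merges upward while a level is full.
        while levels[level] >= max_vm_num:
            levels[level] -= max_vm_num
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return levels
```

Under that assumption, simulate_flushes(11, 5) returns [1, 2]: one file left on level 0 and two merged files on level 1. After 25 flushes the result is [0, 0, 1], i.e. five level-1 files have been merged into one level-2 file. If the real implementation merges 4 new files into an existing bigger one instead, the counts will differ.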