Hi,
Thanks Lingzhe, this feature could improve query performance a lot.

max_vm_num limits the maximum number of vm files in each level. max_vm_num is 10 by default, and max_merge_chunk_num_in_tsfile is 100 now.

Besides, I can't see the figures you attached...

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

-----Original Message-----
From: 445073309 <[email protected]>
Sent: 2020-07-20 15:30:46 (Monday)
To: dev <[email protected]>
Cc:
Subject: Re: add vm(hot compaction) in tsfile processor

Hi,

max_vm_num is the maximum number of vm files at each level of a tsfile. For example, if we set max_vm_num=5 and flush 11 times, the compaction procedure can be described as below:

* when we flush 5 (max_vm_num) times, the current level will be compacted into the next level
* when we have flushed all 11 times, the compaction procedure is
* if we close the tsfile, the whole compaction procedure will be

The default is max_vm_num=5 in the current version. If users do not know which value is suitable, the default value is enough to make chunks larger.

Best,
-----------------------------------
Lingzhe Zhang
School of Software, Tsinghua University

张凌哲
清华大学 软件学院

------------------ Original Message ------------------
From: "dev" <[email protected]>;
Sent: Monday, July 20, 2020, 3:10 PM
To: "dev"<[email protected]>;
Subject: Re: add vm(hot compaction) in tsfile processor

Hi Lingzhe,

>max_vm_num: indicates that a TsFileProcessor has at most the number of virtual memory files

What does this mean? And how do I know what value is suitable? (For example, if I set it to 1, is there any impact?)

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

445073309 <[email protected]> wrote on Monday, July 20, 2020 at 12:42 PM:

> Hi,
>
> I ran into a problem: IoTDB writes small chunks when there are not enough
> memtables, which makes queries on hot data slower.
>
> So I created a new type of file, the vm file, and use it to do hot compaction
> in the flush processor.
> With this, we can flexibly control the size of each chunk.
> The configuration and usage changes are as follows:
>
> * add a new parameter enable_vm in iotdb-engine.properties: indicates
>   whether to use virtual memory
> * use parameter avg_series_point_number_threshold in iotdb-engine.properties:
>   indicates the minimum average number of data points per chunk after hot
>   compaction
> * add a new parameter max_vm_num in iotdb-engine.properties:
>   indicates the maximum number of virtual memory files a TsFileProcessor
>   can have
> * add a new parameter max_merge_chunk_num_in_tsfile in
>   iotdb-engine.properties:
>   indicates the maximum number of times vm files are merged
> * the suffix of the vm file is '.vm', and the naming convention is
>   {tsfile_name}-{level}-{timestamp}.vm
>
> There are also many detailed changes, such as:
>
> * set a virtual memory file list List<List<TsFileResource>>
>   vmTsFileResources for each TsFileProcessor, and add
>   List<List<RestorableTsFileIOWriter>> vmWriters for management
> * in the recovery process, recovery of the vm files is newly added, and
>   the corresponding TsFileProcessor is injected after the recovery
>
> The compaction strategy is currently written like LeveledCompactionStrategy
> in Cassandra, and it can be optimized later.
>
> I put the detailed zh-doc in the attachment.
>
> Thanks,
> --
> Lingzhe Zhang
> School of Software, Tsinghua University
>
> 张凌哲
> 清华大学 软件学院
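
Since the figures in this thread did not come through, the max_vm_num=5, 11-flush example can be approximated with a small simulation. This is only a sketch under an assumed rule, not the actual IoTDB implementation: every flush adds one vm file to level 0, and once a level holds max_vm_num files they are merged into a single file on the next level. The function names simulate_flushes and vm_file_name are hypothetical, Python is used only for brevity (IoTDB itself is Java), and the sketch does not model max_merge_chunk_num_in_tsfile or avg_series_point_number_threshold.

```python
from typing import List


def simulate_flushes(num_flushes: int, max_vm_num: int) -> List[int]:
    """Return the number of vm files on each level after num_flushes flushes.

    Assumed rule (not confirmed by the thread): each flush adds one vm file
    to level 0; a level that reaches max_vm_num files is merged into one
    file on the next level.
    """
    if max_vm_num < 2:
        # With max_vm_num = 1 every file would cascade upward forever
        # under this assumed rule, so the sketch rejects it.
        raise ValueError("this sketch assumes max_vm_num >= 2")
    levels: List[int] = [0]
    for _ in range(num_flushes):
        levels[0] += 1
        level = 0
        # Cascade merges upward while the current level is full.
        while levels[level] >= max_vm_num:
            levels[level] -= max_vm_num
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return levels


def vm_file_name(tsfile_name: str, level: int, timestamp: int) -> str:
    """Build a name following {tsfile_name}-{level}-{timestamp}.vm."""
    return f"{tsfile_name}-{level}-{timestamp}.vm"


if __name__ == "__main__":
    # After 5 flushes level 0 is merged into one level-1 file.
    print(simulate_flushes(5, 5))   # [0, 1]
    # After 11 flushes: one file on level 0, two files on level 1.
    print(simulate_flushes(11, 5))  # [1, 2]
    print(vm_file_name("example.tsfile", 0, 123))  # example.tsfile-0-123.vm
```

Under this assumed rule, closing the tsfile would then merge whatever remains on all levels into the final tsfile; that last step is left out because the thread does not describe it in text.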
