Hi all,

In issue IOTDB-2080 [1], I ran several groups of experiments comparing the query 
speed (raw data queries and aggregation queries) of two TsFile structures: the 
index area stored together with the data area, and the index area separated 
from the data area.


The index area includes the MetadataIndexTree and the BloomFilter, and the 
experiments cover timeseries counts from 10 to 10,000,000.




You can see the experiment results in [2]. It is a very large table.


In summary, separating the MetadataIndex from the TsFile has the following 
advantages:


* The index structure can be modified and customized flexibly without 
affecting the original data.


* The index size is not limited and does not affect the size of the TsFile.


* Because the index file is small and disk seeks within it are fast, queries 
may run faster. (In these experiments, raw data queries saved up to 15% and 
aggregation queries saved up to 30%.)
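
To make the separated layout concrete, here is a minimal Java sketch. It is 
only an illustration under my own assumptions: the class SeparateIndexSketch, 
its methods, and the toy on-disk layout (a flat name/offset/length list) are 
hypothetical and are not the real TsFile format, MetadataIndexTree, or any 
IoTDB API. It only shows the idea of resolving a series in a small index file 
first and then doing a single seek in the data file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy layout only: NOT the real TsFile format or MetadataIndexTree. */
public class SeparateIndexSketch {

    /** Writes each series payload to the data file; returns name -> {offset, length}. */
    public static Map<String, long[]> writeData(Path dataFile, Map<String, byte[]> series)
            throws IOException {
        Map<String, long[]> index = new LinkedHashMap<>();
        try (RandomAccessFile out = new RandomAccessFile(dataFile.toFile(), "rw")) {
            out.setLength(0); // start from an empty data file
            for (Map.Entry<String, byte[]> e : series.entrySet()) {
                index.put(e.getKey(), new long[] {out.getFilePointer(), e.getValue().length});
                out.write(e.getValue());
            }
        }
        return index;
    }

    /** Stores the index in its own file: entry count, then (name, offset, length). */
    public static void writeIndex(Path indexFile, Map<String, long[]> index) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(indexFile.toFile(), "rw")) {
            out.setLength(0); // rewriting the index never touches the data file
            out.writeInt(index.size());
            for (Map.Entry<String, long[]> e : index.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue()[0]);
                out.writeLong(e.getValue()[1]);
            }
        }
    }

    /** Loads the whole (small) index file back into memory. */
    public static Map<String, long[]> readIndex(Path indexFile) throws IOException {
        Map<String, long[]> index = new LinkedHashMap<>();
        try (RandomAccessFile in = new RandomAccessFile(indexFile.toFile(), "r")) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                long offset = in.readLong();
                long length = in.readLong();
                index.put(name, new long[] {offset, length});
            }
        }
        return index;
    }

    /** Reads one series with a single seek into the data file. */
    public static byte[] readSeries(Path dataFile, long[] entry) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(dataFile.toFile(), "r")) {
            byte[] buf = new byte[(int) entry[1]];
            in.seek(entry[0]);
            in.readFully(buf);
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("separate-index-sketch");
        Path dataFile = dir.resolve("demo.data");
        Path indexFile = dir.resolve("demo.index");

        Map<String, byte[]> series = new LinkedHashMap<>();
        series.put("root.sg.d1.s1", "1,2,3".getBytes(StandardCharsets.UTF_8));
        series.put("root.sg.d1.s2", "4,5,6".getBytes(StandardCharsets.UTF_8));

        writeIndex(indexFile, writeData(dataFile, series));

        long[] entry = readIndex(indexFile).get("root.sg.d1.s2");
        System.out.println(new String(readSeries(dataFile, entry), StandardCharsets.UTF_8));
    }
}
```

In this toy version the index file can be rewritten or replaced with a 
different structure without rewriting the data file, which is the flexibility 
the first two points describe.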






Therefore, I support moving the index area (currently the MetadataIndexTree 
and BloomFilter) out of the TsFile.


What do you think?




[1] https://issues.apache.org/jira/browse/IOTDB-2080


[2] 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726035#id-%E5%AE%9E%E9%AA%8C%EF%BC%9ATsFileMedatadaIndexTree-(IV)WhetherseparateindexfromdatainTsFile

--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院
