Hi all,

In issue IOTDB-2080 [1], I ran several groups of experiments comparing query speed, for both raw data queries and aggregation queries, across two TsFile structures: one where the index area is stored together with the data area, and one where the index area is separated from the data area. The index area consists of the MetadataIndexTree and the BloomFilter, and the experiments cover timeseries counts from 10 to 10,000,000.

The full results are in [2]; it is a very large table. In summary, the advantages of separating the index area from the TsFile are:

* The index structure can be modified and customized flexibly without affecting the original data.
* The index size is not limited and does not affect the size of the TsFile.
* Because the index file is small and disk seeks within it are fast, queries may be faster. In the experiments, raw data queries saved up to 15% and aggregation queries up to 30% of query time.

Therefore, I support moving the index area (currently the MetadataIndexTree and the BloomFilter) out of the TsFile.

What do you think?

[1] https://issues.apache.org/jira/browse/IOTDB-2080
[2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=195726035#id-%E5%AE%9E%E9%AA%8C%EF%BC%9ATsFileMedatadaIndexTree-(IV)WhetherseparateindexfromdatainTsFile

--
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院
