Hi Qiao,

Wow, very nice work. I ran a similar evaluation and got the same results as you. I hope the new version can be merged as soon as possible.

> -----Original Message-----
> From: "Jialin Qiao" <[email protected]>
> Sent: 2020-03-30 22:02:56 (Monday)
> To: dev-iotdb <[email protected]>
> Cc:
> Subject: New TsFile Structure
>
> Hi,
>
> The new TsFile structure (version 2) is ready [1].
>
> Write speed is not affected, and queries are faster, especially
> aggregation queries.
>
> 【Performance evaluation】
>
> Hardware: macOS 10.14.5, 2.2 GHz Intel Core i7, 4G memory.
>
> Data set: 1 storage group, 1 device, 3000 measurements; each timeseries has
> 600000 data points, long data type
>
> IoTDB configuration:
>
> enable_parameter_adapter=false
> tsfile_size_threshold=1024L
> memtable_size_threshold=5010241024L
>
> [Write evaluation]
>
> new_TsFile: 300569ms, 14.76G, 184 tsfiles
> master: 300418ms, 14.73G, 184 tsfiles
>
> [Query evaluation]
>
> select s1 from root.sg1.d1
>
> new_TsFile: 1349ms
> master: 2102ms
>
> select s1, s2, s3, s4, s5, s6, s7, s8, s9, s10 from root.sg1.d1
>
> new_TsFile: 3268ms
> master: 4621ms
>
> select * from root
>
> new_TsFile: 647934ms
> master: 814206ms
>
> select count(s1) from root.sg1.d1
>
> new_TsFile: 421ms
> master: 1654ms
>
> select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6),
> count(s7), count(s8), count(s9), count(s10) from root.sg1.d1
>
> new_TsFile: 1887ms
> master: 4231ms
>
> select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6),
> count(s7), count(s8), count(s9), count(s10), count(s11), count(s12),
> count(s13), count(s14), count(s15), count(s16), count(s17), count(s18),
> count(s19), count(s20), count(s21), count(s22), count(s23), count(s24),
> count(s25), count(s26), count(s27), count(s28), count(s29), count(s30) from
> root.sg1.d1
>
> new_TsFile: 3066ms
> master: 6653ms
>
> select count(*) from root
>
> new_TsFile: 2243ms
> master: 614638ms
>
> 【Design of new TsFile】
>
> In the previous version, the ChunkMetadata is stored by device. Therefore, if
> we want to query one series, we need to read the ChunkMetadata of all
> measurements of its device, which is time-consuming.
>
> In the new version, the ChunkMetadata is grouped by time series. Then, if we
> want to query one series, we only need to read the ChunkMetadata
> of this series. A file-level statistics entry, TimeseriesMetadata, is added
> for each series to accelerate aggregations.
>
> Besides, by modifying the schema management of TsFile, the constraint that
> measurements with the same name in one storage group must have the
> same data type is removed.
>
> [1] https://github.com/apache/incubator-iotdb/pull/855
>
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
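
For anyone following the design section of the quoted message, here is a minimal Python sketch of the idea as I understand it. It is not the actual TsFile code: the `ChunkMeta` class, the helper functions, and the chunk counts (2 chunks per series, 100 points per chunk) are hypothetical and chosen only for illustration. Only the device name and the 3000 measurements come from the evaluation above. It contrasts a metadata index grouped by device with one grouped by series, and shows how a per-series, file-level count could answer `count(s1)` without touching chunk data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkMeta:
    """Hypothetical stand-in for one chunk metadata entry."""
    device: str
    measurement: str
    count: int  # number of data points in the chunk


def build_chunks(device, measurements, chunks_per_series, points_per_chunk):
    # One entry per chunk of every measurement of the device.
    return [ChunkMeta(device, m, points_per_chunk)
            for m in measurements
            for _ in range(chunks_per_series)]


def index_by_device(chunks):
    # Old layout (as described in the mail): entries grouped by device.
    index = defaultdict(list)
    for c in chunks:
        index[c.device].append(c)
    return index


def index_by_series(chunks):
    # New layout (as described in the mail): entries grouped by time series.
    index = defaultdict(list)
    for c in chunks:
        index[(c.device, c.measurement)].append(c)
    return index


def series_statistics(series_index):
    # File-level statistics per series; here only a point count.
    return {key: sum(c.count for c in metas)
            for key, metas in series_index.items()}


def read_old(device_index, device, measurement):
    # Every entry of the device is read, then filtered.
    scanned = device_index[device]
    wanted = [c for c in scanned if c.measurement == measurement]
    return len(scanned), wanted


def read_new(series_index, device, measurement):
    # Only the entries of the requested series are read.
    wanted = series_index[(device, measurement)]
    return len(wanted), wanted


device = "root.sg1.d1"
measurements = [f"s{i}" for i in range(1, 3001)]  # 3000 measurements
chunks = build_chunks(device, measurements,
                      chunks_per_series=2, points_per_chunk=100)

old_read, old_hits = read_old(index_by_device(chunks), device, "s1")
new_index = index_by_series(chunks)
new_read, new_hits = read_new(new_index, device, "s1")
stats = series_statistics(new_index)

print("entries read, grouped by device:", old_read)
print("entries read, grouped by series:", new_read)
print("count(s1) from file-level statistics:", stats[(device, "s1")])
```

In this toy setup both layouts return the same entries for `s1`, but the device-grouped index reads the entries of all 3000 measurements to find them, which matches the explanation given for why single-series and aggregation queries improved.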
