Hi Qiao,

Wow, very nice work. I ran a similar evaluation and got the same results as
you. I hope the new version can be merged as soon as possible.
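
As a rough way to read the numbers in the quoted message below, here is a small Python sketch that turns the reported timings into speedup ratios (master time divided by new_TsFile time). The timings are copied from the quoted evaluation; the short query labels are only shorthand for the full statements, not IoTDB syntax.

```python
# Timings in milliseconds from the quoted evaluation: (new_TsFile, master).
# The dictionary keys are shorthand labels for the full queries.
timings_ms = {
    "select s1": (1349, 2102),
    "select s1..s10": (3268, 4621),
    "select *": (647934, 814206),
    "count(s1)": (421, 1654),
    "count(s1..s10)": (1887, 4231),
    "count(s1..s30)": (3066, 6653),
    "count(*)": (2243, 614638),
}


def speedup(new_ms: int, master_ms: int) -> float:
    """Return how many times faster the new format is than master."""
    return master_ms / new_ms


speedups = {query: speedup(new, old) for query, (new, old) in timings_ms.items()}

for query, ratio in speedups.items():
    print(f"{query:<16} {ratio:8.2f}x")
```

On these figures the raw-read queries come out at roughly 1.3x to 1.6x, the per-series count queries at roughly 2x to 4x, and count(*) at roughly 274x.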


> -----Original Message-----
> From: "Jialin Qiao" <[email protected]>
> Sent: 2020-03-30 22:02:56 (Monday)
> To: dev-iotdb <[email protected]>
> Cc: 
> Subject: New TsFile Structure
> 
> Hi,
> 
> 
> The new TsFile structure (version 2) is ready [1]. 
> 
> 
> Write speed is not affected, and queries are accelerated, especially 
> aggregation queries.
> 
> 
> 【Performance evaluation】
> 
> 
> 
> Hardware: macOS 10.14.5, 2.2 GHz Intel Core i7, 4 GB memory.
> 
> Data set: 1 storage group, 1 device, 3000 measurements; each time series 
> has 600000 data points of the long data type.
> 
> IoTDB configuration:
> 
> enable_parameter_adapter=false
> tsfile_size_threshold=1024L
> memtable_size_threshold=50*1024*1024L
> 
> [Write evaluation]
> 
> new_TsFile:300569ms,14.76G,184 tsfiles
> master:300418ms,14.73G,184 tsfiles
> 
> [Query evaluation]
> 
> select s1 from root.sg1.d1
> 
> new_TsFile: 1349ms
> master: 2102ms
> 
> select s1, s2, s3, s4, s5, s6, s7, s8, s9, s10 from root.sg1.d1
> 
> new_TsFile: 3268ms
> master: 4621ms
> 
> select * from root
> 
> new_TsFile: 647934ms
> master: 814206ms
> 
> select count(s1) from root.sg1.d1
> 
> new_TsFile: 421ms
> master: 1654ms
> 
> select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6), 
> count(s7), count(s8), count(s9), count(s10) from root.sg1.d1
> 
> new_TsFile: 1887ms
> master: 4231ms
> 
> select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6), 
> count(s7), count(s8), count(s9), count(s10), count(s11), count(s12), 
> count(s13), count(s14), count(s15), count(s16), count(s17), count(s18), 
> count(s19), count(s20), count(s21), count(s22), count(s23), count(s24), 
> count(s25), count(s26), count(s27), count(s28), count(s29), count(s30) from 
> root.sg1.d1
> 
> new_TsFile: 3066ms
> master: 6653ms
> 
> select count(*) from root
> 
> new_TsFile: 2243ms
> master: 614638ms
> 
> 
> 
> 
> 
> 【Design of new TsFile】
> 
> 
> In the previous version, ChunkMetadata is stored by device. Therefore, to 
> query one series, we need to read the ChunkMetadata of all measurements of 
> its device, which is time-consuming.
> 
> 
> In the new version, ChunkMetadata is grouped by time series, so to query 
> one series we only need to read the ChunkMetadata of that series. A 
> file-level statistics structure, TimeseriesMetadata, is added for each 
> series to accelerate aggregations.
> 
> 
> Besides, by modifying the schema management of TsFile, the constraint that 
> measurements with the same name in one storage group must have the same 
> data type is removed.
> 
> 
> 
> 
> [1] https://github.com/apache/incubator-iotdb/pull/855
> 
> 
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
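
To make the design change in the quoted message concrete, here is a toy Python model of the two metadata layouts. It is only a sketch under my own assumptions: `ChunkMeta`, `lookup_device_grouped`, `build_series_index` and `build_series_stats` are hypothetical names, not IoTDB classes, and the real TsFile format stores far more than a point count per chunk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkMeta:
    """Hypothetical stand-in for one chunk's metadata entry."""
    measurement: str
    num_points: int


def lookup_device_grouped(device_entries, measurement):
    """Old layout: every entry of the device is read, then filtered."""
    matches = [c for c in device_entries if c.measurement == measurement]
    return len(device_entries), matches


def build_series_index(device_entries):
    """New layout: chunk entries grouped by time series."""
    index = defaultdict(list)
    for chunk in device_entries:
        index[chunk.measurement].append(chunk)
    return dict(index)


def build_series_stats(series_index):
    """File-level statistic per series (here only a point count)."""
    return {m: sum(c.num_points for c in chunks)
            for m, chunks in series_index.items()}


# Toy file: one device, three measurements, two chunks of 100 points each.
entries = [ChunkMeta(f"s{i}", 100) for i in (1, 2, 3) for _ in range(2)]

entries_read_old, old_matches = lookup_device_grouped(entries, "s1")
series_index = build_series_index(entries)
series_stats = build_series_stats(series_index)

entries_read_new = len(series_index["s1"])  # only the chunks of s1
count_s1 = series_stats["s1"]               # no chunk entry read at all

print(entries_read_old, entries_read_new, count_s1)
```

In this model a single-series lookup touches every entry of the device in the old layout but only that series' entries in the new one, and a count can be answered from the per-series statistic alone, which is in line with the large gap reported for count(*).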
