Hi,

Thanks to all the reviewers; the new TsFile structure has been merged.

The next step toward releasing 0.10.0 is an online-upgrade function, which
means you can launch IoTDB 0.10 directly on a 0.9 data folder.
The old TsFiles can be queried immediately and are upgraded to new TsFiles in
the background.
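
A minimal sketch of how such an online upgrade could be organized, assuming a
per-file format version tag and a single background worker thread. Every name
below (DataFile, start_background_upgrade, the OLD/NEW tags) is hypothetical
and is not taken from the IoTDB code base.

```python
import queue
import threading

OLD, NEW = 1, 2  # hypothetical format version tags, not IoTDB constants


class DataFile:
    """Stand-in for a TsFile: a format version plus its rows."""

    def __init__(self, name, version, rows):
        self.name = name
        self.version = version
        self.rows = rows
        self.lock = threading.Lock()

    def read(self):
        # Both formats stay readable, so queries never wait for the upgrade.
        with self.lock:
            return list(self.rows)

    def upgrade(self):
        # Placeholder for rewriting the file in the new layout.
        with self.lock:
            self.version = NEW


def start_background_upgrade(files):
    """Queue every old-format file and upgrade them on a worker thread."""
    pending = queue.Queue()
    for f in files:
        if f.version == OLD:
            pending.put(f)

    def worker():
        while True:
            try:
                f = pending.get_nowait()
            except queue.Empty:
                return
            f.upgrade()

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t


files = [DataFile("a.tsfile", OLD, [1, 2]), DataFile("b.tsfile", NEW, [3])]
thread = start_background_upgrade(files)
rows = [r for f in files for r in f.read()]  # query works right away
thread.join()  # after this, every file carries the NEW tag
```

The point of the sketch is only the ordering: reads are served before the
worker finishes, and the upgrade never blocks startup.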

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: atoildw <[email protected]>
> Sent: 2020-03-31 10:35:37 (Tuesday)
> To: dev <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: New TsFile Structure
> 
> Hi,
> 
> Good job!
> 
> Looking forward to the next release; I would then like to deploy IoTDB at my
> company.
> 
> I saw the PR, and it contains a lot of code-formatting changes. Please try to
> avoid this next time, as it makes the PR difficult to review.
> 
> 
> 
> —————
> 
> 
> DaWei Liu
> On 03/31/2020 10:04, Haonan Hou wrote:
> Hi,
> 
> I did the performance evaluation too, and got a similar conclusion.
> 
> Hardware: macOS 10.15.4, 2.9 GHz Intel Core i5, 8 GB memory.
> Data set: 1 storage group, 1 device, 3000 measurements; each time series has
> 600000 data points of the long data type.
> 1. select s1 from root.sg1.d1
> new_TsFile: 2572
> master: 2666
> 2. select s1, s2, s3, s4, s5, s6, s7, s8, s9, s10 from root.sg1.d1
> new_TsFile: 5455
> master: 6146
> 3. select count(s1) from root.sg1.d1
> new_TsFile: 570
> master: 1510
> 4. select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6), 
> count(s7), count(s8), count(s9), count(s10) from root.sg1.d1
> new_TsFile: 2132
> master: 3675
> 5. select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6), 
> count(s7), count(s8), count(s9), count(s10), count(s11), count(s12), 
> count(s13), count(s14), count(s15), count(s16), count(s17), count(s18), 
> count(s19), count(s20) from root.sg1.d1
> new_TsFile: 2874
> master: 5357
> 
> Thanks,
> Haonan Hou
> 
> 
> On Mar 30, 2020, at 10:02 PM, Jialin Qiao <[email protected]> wrote:
> 
> Hi,
> 
> 
> The new TsFile structure (version 2) is ready [1].
> 
> 
> Write speed is not affected, and queries are faster, especially aggregation
> queries.
> 
> 
> 【Performance evaluation】
> 
> 
> 
> Hardware: macOS 10.14.5, 2.2 GHz Intel Core i7, 4 GB memory.
> 
> Data set: 1 storage group, 1 device, 3000 measurements; each time series has
> 600000 data points of the long data type.
> 
> IoTDB configuration:
> 
> enable_parameter_adapter=false
> tsfile_size_threshold=1024L
> memtable_size_threshold=50*1024*1024L
> 
> [Write evaluation]
> 
> new_TsFile: 300569ms, 14.76G, 184 tsfiles
> master: 300418ms, 14.73G, 184 tsfiles
> 
> [Query evaluation]
> 
> select s1 from root.sg1.d1
> 
> new_TsFile: 1349ms
> master: 2102ms
> 
> select s1, s2, s3, s4, s5, s6, s7, s8, s9, s10 from root.sg1.d1
> 
> new_TsFile: 3268ms
> master: 4621ms
> 
> select * from root
> 
> new_TsFile: 647934ms
> master: 814206ms
> 
> select count(s1) from root.sg1.d1
> 
> new_TsFile: 421ms
> master: 1654ms
> 
> select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6), 
> count(s7), count(s8), count(s9), count(s10) from root.sg1.d1
> 
> new_TsFile: 1887ms
> master: 4231ms
> 
> select count(s1), count(s2), count(s3), count(s4), count(s5), count(s6), 
> count(s7), count(s8), count(s9), count(s10), count(s11), count(s12), 
> count(s13), count(s14), count(s15), count(s16), count(s17), count(s18), 
> count(s19), count(s20), count(s21), count(s22), count(s23), count(s24), 
> count(s25), count(s26), count(s27), count(s28), count(s29), count(s30) from 
> root.sg1.d1
> 
> new_TsFile: 3066ms
> master: 6653ms
> 
> select count(*) from root
> 
> new_TsFile: 2243ms
> master: 614638ms
> 
> 
> 
> 
> 
> 【Design of new TsFile】
> 
> 
> In the previous version, ChunkMetadata was stored per device. Querying a
> single series therefore required reading the ChunkMetadata of every
> measurement of its device, which is time-consuming.
> 
> In the new version, ChunkMetadata is grouped by time series, so querying one
> series only requires reading the ChunkMetadata of that series. A file-level
> statistics structure, TimeseriesMetadata, is added for each series to
> accelerate aggregations.
> 
> In addition, by changing the schema management of TsFile, the constraint that
> measurements with the same name in one storage group must have the same data
> type has been removed.
> 
> 
> 
> 
> [1] https://github.com/apache/incubator-iotdb/pull/855
> 
> 
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
> 
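As a quick cross-check of the quoted query evaluation, the speedup of
new_TsFile over master can be computed directly from the reported timings.
The numbers are copied from the message above; the short query labels are
abbreviations introduced here for readability.

```python
# Query timings in ms, copied from the quoted evaluation: (new_TsFile, master)
timings_ms = {
    "select s1": (1349, 2102),
    "select s1..s10": (3268, 4621),
    "select *": (647934, 814206),
    "count(s1)": (421, 1654),
    "count(s1..s10)": (1887, 4231),
    "count(s1..s30)": (3066, 6653),
    "count(*)": (2243, 614638),
}

# Speedup = master time / new_TsFile time (values above 1 mean faster).
speedup = {q: master / new for q, (new, master) in timings_ms.items()}

for q, s in speedup.items():
    print(f"{q:<16} {s:8.1f}x")
```

On these figures, raw selects gain roughly 1.3x to 1.6x, the multi-column
counts roughly 2x to 4x, and count(*) about 274x.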

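The layout change described in the quoted design notes can be illustrated
with a toy in-memory index. This is only a sketch under simplifying
assumptions: the record fields, dictionary layouts, and function names are
hypothetical and do not reflect the actual TsFile on-disk format.

```python
from collections import defaultdict

# Hypothetical chunk metadata records: (device, measurement, offset, count)
chunks = [
    ("d1", "s1", 0, 100),
    ("d1", "s2", 64, 100),
    ("d1", "s1", 128, 50),
    ("d1", "s3", 192, 70),
]

# Old layout: one metadata list per device.
by_device = defaultdict(list)
for device, measurement, offset, count in chunks:
    by_device[device].append((measurement, offset, count))

# New layout: one metadata list per time series (device + measurement).
by_series = defaultdict(list)
for device, measurement, offset, count in chunks:
    by_series[(device, measurement)].append((offset, count))


def lookup_old(device, measurement):
    """Scan every entry of the device, keep only the requested series."""
    entries = by_device[device]
    hits = [(o, c) for m, o, c in entries if m == measurement]
    return hits, len(entries)  # second value: number of entries read


def lookup_new(device, measurement):
    """Read only the entries of the requested series."""
    hits = by_series[(device, measurement)]
    return hits, len(hits)


# File-level statistic per series, in the spirit of TimeseriesMetadata:
# a count over one series can be answered without reading its chunk entries.
series_count = {key: sum(c for _, c in v) for key, v in by_series.items()}

old_hits, old_read = lookup_old("d1", "s1")
new_hits, new_read = lookup_new("d1", "s1")
```

Both lookups return the same chunks for root.sg1.d1.s1, but the per-device
layout reads every entry of the device, while the per-series layout reads
only the entries of the requested series.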