Hi, glad to kill Thanos :)

 I agree to merge kill_thanos into master. We have already made an unofficial
release (v0.7.1) of master on GitHub as a backup.

 As a reminder, apart from these optimizations, some features of the master
branch are not yet available in kill_thanos:

 1. Update and Delete operations.

 2. Advanced queries: Aggregation, GroupByTime and Fill.

 Besides, the `hasNextBatch` and `nextBatch` methods are implemented in TsFile,
but most of the work remains to be done in the IoTDB engine.

 The kill_thanos branch changes a lot... We can add these features and further
optimize the code in later PRs.

 Best.

--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Xiangdong Huang" <[email protected]>
> Sent: 2019-01-05 11:36:07 (Saturday)
> To: [email protected]
> Cc: 
> Subject: Re: merge kill_thanos branch to the master branch
> 
> I think the biggest issue with the current master is that the package
> structure is chaotic.
> This makes it hard for new developers to understand the project.
> It is a villain like Thanos in the Marvel Universe. That's why the new
> branch is called kill_Thanos.
> 
> Besides what Gaofei mentioned, the storage module, TsFile, has also been
> refactored, and the file format has changed somewhat.
> A brief introduction is at
> https://github.com/thulab/iotdb/wiki/%5BTsFile%5D-What-is-new-from-v0.7.0--to-Kill_Thanos
> 
> 
> In the kill_Thanos branch, the package structure is clearer, but there
> is still a lot of source code that could be refactored further.
> However, merging the modifications from master into kill_Thanos brings
> extra work.
> 
> Because all UTs and ITs in the current kill_Thanos pass, and the
> performance is better, I agree to merge the branches as soon as possible.
> 
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
> 
>  黄向东
> 清华大学 软件学院
> 
> 
> On Sat, Jan 5, 2019 at 12:23 AM, Gaofei Cao <[email protected]> wrote:
> 
> > The kill_thanos branch in thulab/iotdb has refactored most features, as
> > described below.
> >
> >
> > 1. Replacing the single-point calculation logic with batch data loading.
> > In the previous branch, the two most important methods in the `Reader` of
> > IoTDB are `hasNext` and `next`, which check whether the given query
> > series has a next point and compute that point. Invoking these two
> > methods repeatedly degrades query performance, so we added two new
> > methods, `hasNextBatch` and `nextBatch`. As a result, we now load and
> > transfer data in batches rather than one point at a time. These two
> > methods are friendlier to the CPU.
> >
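To make the difference between the two reading styles concrete, here is a minimal Java sketch. Only the method names `hasNext`, `next`, `hasNextBatch`, and `nextBatch` come from the description above; the classes `ArrayBatchReader` and `BatchReadDemo`, the in-memory `long[]` series, and the batch size are hypothetical and are not the actual IoTDB or TsFile code.

```java
import java.util.Arrays;

/** Hypothetical reader over one in-memory series; not the real IoTDB Reader. */
final class ArrayBatchReader {
    private final long[] values;
    private final int batchSize;
    private int pos = 0;

    ArrayBatchReader(long[] values, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.values = values;
        this.batchSize = batchSize;
    }

    // Point-at-a-time style: two method calls per data point.
    boolean hasNext() { return pos < values.length; }
    long next() { return values[pos++]; }

    // Batch style: two method calls per batch of up to batchSize points.
    boolean hasNextBatch() { return pos < values.length; }
    long[] nextBatch() {
        int end = Math.min(pos + batchSize, values.length);
        long[] batch = Arrays.copyOfRange(values, pos, end);
        pos = end;
        return batch;
    }
}

final class BatchReadDemo {
    static long sumByPoint(ArrayBatchReader reader) {
        long sum = 0;
        while (reader.hasNext()) {
            sum += reader.next();
        }
        return sum;
    }

    static long sumByBatch(ArrayBatchReader reader) {
        long sum = 0;
        while (reader.hasNextBatch()) {
            // The inner loop runs over a plain array with no reader calls.
            for (long v : reader.nextBatch()) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] data = new long[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sumByPoint(new ArrayBatchReader(data, 128)));
        System.out.println(sumByBatch(new ArrayBatchReader(data, 128)));
    }
}
```

Both loops compute the same sum; the batch version simply makes far fewer reader calls per point, which is the effect described in item 1.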
> >
> > 2. Using NIO.
> > In this branch, we replaced ByteArrayInputStream with NIO to take
> > advantage of Java NIO. We now use `Channel`, `Buffer`, and `MMap` more
> > frequently.
> >
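As a rough illustration of the NIO building blocks mentioned in item 2, the sketch below reads the same file once through a `FileChannel` into a `ByteBuffer` and once through a read-only memory mapping. The class `NioReadDemo`, its method names, and the file layout (a plain sequence of big-endian longs) are assumptions made for this example; this is not the TsFile format or the actual reader code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class NioReadDemo {
    /** Writes the values as big-endian longs to a new temporary file. */
    static Path writeLongs(long[] values) throws IOException {
        Path file = Files.createTempFile("nio-demo", ".bin");
        file.toFile().deleteOnExit();
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        buf.flip();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
        return file;
    }

    /** Reads the whole file through a Channel into a heap Buffer. */
    static long[] readWithChannel(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or EOF is reached
            }
            buf.flip();
            long[] out = new long[buf.remaining() / Long.BYTES];
            buf.asLongBuffer().get(out);
            return out;
        }
    }

    /** Reads the whole file through a read-only memory mapping. */
    static long[] readWithMmap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long[] out = new long[map.remaining() / Long.BYTES];
            map.asLongBuffer().get(out);
            return out;
        }
    }

    /** Returns true if both read paths give back exactly what was written. */
    static boolean roundTrips(long[] values) {
        try {
            Path file = writeLongs(values);
            return Arrays.equals(values, readWithChannel(file))
                && Arrays.equals(values, readWithMmap(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(new long[] {1L, 2L, 3L, 42L}));
    }
}
```

Which of the two paths is preferable depends on file size and access pattern; the sketch only shows that both are available through the same `FileChannel`.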
> >
> > 3. Adding a file stream manager.
> > A single IoTDB query may read multiple series, for example the SQL `select
> > * from root.vehicle`. To avoid opening one TsFile multiple times, we
> > adopted a file stream manager, which ensures that a file is opened at
> > most once across IoTDB queries. We use an `ExpiredTimeMap` to manage
> > opened file streams and close files that have not been used for a given
> > expiration time. There may be better ways to manage file stream readers;
> > I will keep track of this.
> >
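The idea in item 3 (open each file at most once, close it after it has been idle for a given time) can be sketched roughly as follows. This is not the `ExpiredTimeMap` from the branch, whose implementation is not shown in this thread; the class `ExpiringResourceCache`, the injected opener and clock, and the demo values are all hypothetical and chosen to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical cache: one open resource per path, closed after an idle period. */
final class ExpiringResourceCache<R extends AutoCloseable> {
    private static final class Entry<R> {
        final R resource;
        long lastUsedMillis;
        Entry(R resource, long now) {
            this.resource = resource;
            this.lastUsedMillis = now;
        }
    }

    private final Map<String, Entry<R>> open = new HashMap<>();
    private final Function<String, R> opener;   // how to open a path (assumed)
    private final LongSupplier clock;           // injectable clock, in milliseconds
    private final long expiryMillis;

    ExpiringResourceCache(Function<String, R> opener, LongSupplier clock, long expiryMillis) {
        this.opener = opener;
        this.clock = clock;
        this.expiryMillis = expiryMillis;
    }

    /** Returns the already-open resource for the path, opening it only on first use. */
    synchronized R get(String path) {
        long now = clock.getAsLong();
        Entry<R> entry = open.get(path);
        if (entry == null) {
            entry = new Entry<>(opener.apply(path), now);
            open.put(path, entry);
        } else {
            entry.lastUsedMillis = now;
        }
        return entry.resource;
    }

    /** Closes and removes every resource idle for at least expiryMillis. */
    synchronized int closeExpired() {
        long now = clock.getAsLong();
        int closed = 0;
        Iterator<Entry<R>> it = open.values().iterator();
        while (it.hasNext()) {
            Entry<R> entry = it.next();
            if (now - entry.lastUsedMillis >= expiryMillis) {
                try {
                    entry.resource.close();
                } catch (Exception e) {
                    throw new IllegalStateException("failed to close resource", e);
                }
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    synchronized int openCount() {
        return open.size();
    }
}

final class FileStreamManagerDemo {
    /** Returns {number of opens, 1 if the handle was reused, closed count, still open}. */
    static int[] run() {
        long[] now = {0L};
        int[] opens = {0};
        ExpiringResourceCache<AutoCloseable> cache = new ExpiringResourceCache<AutoCloseable>(
            path -> { opens[0]++; return () -> { }; },  // fake "file" with a no-op close
            () -> now[0],
            1000L);
        Object first = cache.get("a.tsfile");
        Object second = cache.get("a.tsfile");  // reused, not reopened
        now[0] = 600L;
        cache.get("b.tsfile");
        now[0] = 1200L;
        int closed = cache.closeExpired();      // only a.tsfile has been idle long enough
        return new int[] {opens[0], first == second ? 1 : 0, closed, cache.openCount()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

In real use the opener would wrap something like `FileChannel.open`. Note that this sketch does not track whether a resource is still in use by a running query when it expires; a production version would likely need reference counting or a similar guard.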
> >
> > 4. Optimizing filter efficiency.
> > First, we removed the previous `Visitor Pattern` implementation of
> > filters and adopted a more intuitive implementation.
> > Second, we optimized some filter logic to improve performance. For
> > example, in the SQL `select sensor_0, sensor_1 from device_0 where sensor_1 >
> > 10`, we avoid computing the data of `sensor_1` twice.
> >
> >
> > 5. Others, such as removing the Thrift serialization and changing the file
> > format of TsFile. Maybe someone else can add to this list.
> >
> >
> > I suggest merging it into the master branch next week.
> >
> >
> > Experimental results show that the query tests in the kill_thanos branch
> > have a performance improvement of approximately 30% to 60%.
> >
> >
> > By the way, I am considering how to get a standard, convincing test
> > dataset (in the IoT domain) to test the write and query performance of
> > IoTDB. Currently, we just use the data generated by `IoTDB Benchmark`
> > (another project, also available at github.com/thulab/iotdb-benchmark),
> > which generates 100,000 rows of records for 100 devices * 100 sensors.
> >
> >
> > Thanks & Best Regards
> >
> >
> > -----------------------------------
> > Cao Gaofei (曹高飞)
> > School of Software,
> > Tsinghua University
> > -----------------------------------
