The kill_thanos branch in thulab/iotdb refactors most features, as described below.

1. Replacing the single-point calculation logic with batch data loading.
In the previous branch, the two most important methods in the `Reader` of IoTDB 
are `hasNext` and `next`, which check whether the given query series has a next 
point and compute that point. Invoking these two methods once per point 
degrades query performance, so we added two new methods, `hasNextBatch` and 
`nextBatch`. As a result, we now load and transfer data in batches rather than 
one point at a time. These two methods are friendlier to the CPU, since each 
call now handles many points.
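
To illustrate the difference, here is a minimal sketch of a batch-style reader. 
Only the method names `hasNextBatch` and `nextBatch` come from the branch; the 
classes `Batch` and `ArrayBatchReader`, the in-memory arrays, and the fixed 
batch size are assumptions made for illustration and are not IoTDB's actual code.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

/** Hypothetical batch of (timestamp, value) pairs; not IoTDB's actual class. */
final class Batch {
    final long[] times;
    final double[] values;

    Batch(long[] times, double[] values) {
        this.times = times;
        this.values = values;
    }

    int size() {
        return times.length;
    }
}

/** Hypothetical in-memory reader that hands out data in fixed-size batches. */
final class ArrayBatchReader {
    private final long[] times;
    private final double[] values;
    private final int batchSize;
    private int pos = 0;

    ArrayBatchReader(long[] times, double[] values, int batchSize) {
        if (times.length != values.length || batchSize <= 0) {
            throw new IllegalArgumentException("mismatched arrays or non-positive batch size");
        }
        this.times = times;
        this.values = values;
        this.batchSize = batchSize;
    }

    boolean hasNextBatch() {
        return pos < times.length;
    }

    /** Returns up to batchSize points; the caller then loops over plain arrays. */
    Batch nextBatch() {
        if (!hasNextBatch()) {
            throw new NoSuchElementException("no more batches");
        }
        int end = Math.min(pos + batchSize, times.length);
        Batch batch = new Batch(Arrays.copyOfRange(times, pos, end),
                                Arrays.copyOfRange(values, pos, end));
        pos = end;
        return batch;
    }

    /** Example consumer: one hasNextBatch/nextBatch call pair per batch, not per point. */
    static double sum(ArrayBatchReader reader) {
        double total = 0;
        while (reader.hasNextBatch()) {
            Batch batch = reader.nextBatch();
            for (int i = 0; i < batch.size(); i++) {
                total += batch.values[i];
            }
        }
        return total;
    }
}
```

With a batch size of 2 and 5 points, the consumer makes 3 `nextBatch` calls 
instead of 5 `next` calls; the inner loop runs over plain arrays.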


2. Using NIO.
In this branch, we replaced ByteArrayInputStream with Java NIO to take 
advantage of its features. We now use `Channel`, `Buffer`, and `MMap` 
(memory-mapped files) more frequently.
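
For readers less familiar with NIO, the sketch below shows the two styles of 
reading mentioned above using standard `java.nio` classes: a positional read 
through a `FileChannel` into a `ByteBuffer`, and a read through a 
`MappedByteBuffer`. The class `NioReadSketch` and the file layout (a plain 
sequence of longs) are assumptions for illustration; this is not the TsFile 
format or the branch's actual reading code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class NioReadSketch {

    /** Writes the given longs to a file (big-endian, ByteBuffer's default order). */
    static void writeLongs(Path file, long[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(data.length * Long.BYTES);
        for (long v : data) {
            buf.putLong(v);
        }
        buf.flip();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    /** Reads count longs starting at a byte offset with positional channel reads. */
    static long[] readWithChannel(Path file, long offset, int count) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(count * Long.BYTES);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, pos); // does not move the channel's own position
                if (n < 0) {
                    throw new EOFException("file ended before " + count + " longs were read");
                }
                pos += n;
            }
        }
        buf.flip();
        long[] out = new long[count];
        buf.asLongBuffer().get(out);
        return out;
    }

    /** Reads the same region through a read-only memory mapping. */
    static long[] readWithMmap(Path file, long offset, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                    ch.map(FileChannel.MapMode.READ_ONLY, offset, (long) count * Long.BYTES);
            long[] out = new long[count];
            map.asLongBuffer().get(out);
            return out;
        }
    }
}
```

Both methods return the same values for the same region; which one performs 
better depends on file size and access pattern, so this only shows the API shape.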


3. Adding a file stream manager.
A single IoTDB query may read multiple series, for example the SQL statement 
`select * from root.vehicle`. To avoid opening one TsFile multiple times, we 
adopted a file stream manager, which ensures that each file is opened at most 
once across IoTDB queries. We use an `ExpiredTimeMap` to manage opened file 
streams and close files that have not been used for a given expiration time. 
There may be better ways to manage file stream readers; I will keep tracking 
this.
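
The idea can be sketched roughly as follows. This is not the branch's 
`ExpiredTimeMap`; the class `FileStreamManagerSketch`, its method names, and 
the injected clock are hypothetical, chosen only to show "open at most once, 
close after an idle period" in a small, testable form.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical manager: shares one read channel per file, closes idle ones. */
final class FileStreamManagerSketch {

    private static final class Entry {
        final FileChannel channel;
        long lastUsedMillis;

        Entry(FileChannel channel, long lastUsedMillis) {
            this.channel = channel;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final Map<Path, Entry> open = new HashMap<>();
    private final long expireMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    FileStreamManagerSketch(long expireMillis, LongSupplier clock) {
        this.expireMillis = expireMillis;
        this.clock = clock;
    }

    /** Returns the shared channel for the file, opening it only on first use. */
    synchronized FileChannel get(Path file) throws IOException {
        Path key = file.toAbsolutePath().normalize();
        long now = clock.getAsLong();
        Entry entry = open.get(key);
        if (entry == null) {
            entry = new Entry(FileChannel.open(key, StandardOpenOption.READ), now);
            open.put(key, entry);
        }
        entry.lastUsedMillis = now;
        return entry.channel;
    }

    /** Closes channels idle for at least expireMillis; returns how many were closed. */
    synchronized int closeExpired() throws IOException {
        long now = clock.getAsLong();
        int closed = 0;
        Iterator<Entry> it = open.values().iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (now - entry.lastUsedMillis >= expireMillis) {
                entry.channel.close();
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    synchronized int openCount() {
        return open.size();
    }
}
```

Because the channel is shared, readers in such a design would need positional 
reads (`read(buffer, position)`) rather than the channel's own position. A 
production version would also have to avoid closing a file that a running 
query still uses, which this sketch does not handle.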


4. Optimizing filter efficiency.
First, we removed the previous `Visitor Pattern` implementation of filters 
and adopted a more intuitive implementation.
Second, we optimized some filter logic to improve performance. For example, 
for the SQL statement 
`select sensor_0, sensor_1 from device_0 where sensor_1 > 10`, we avoid 
computing the data of `sensor_1` twice.
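
This mail does not spell out how the duplicate work is avoided, so the 
following is only one possible reading, sketched under assumptions: since 
`sensor_1` appears in both the select list and the filter, its values are read 
once while evaluating `sensor_1 > 10` and then reused in the output rows. The 
class `FilterReuseSketch`, the `Row` type, and the in-memory maps are 
hypothetical and stand in for the real series readers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FilterReuseSketch {

    /** One output row: timestamp plus the two selected sensor values. */
    static final class Row {
        final long time;
        final Double sensor0; // null when sensor_0 has no point at this time
        final double sensor1;

        Row(long time, Double sensor0, double sensor1) {
            this.time = time;
            this.sensor0 = sensor0;
            this.sensor1 = sensor1;
        }
    }

    /**
     * Scans sensor_1 once. Each value that passes the filter is kept and placed
     * directly into the result, so sensor_1 is not read a second time for output.
     */
    static List<Row> query(Map<Long, Double> sensor0,
                           TreeMap<Long, Double> sensor1,
                           double threshold) {
        List<Row> rows = new ArrayList<>();
        for (Map.Entry<Long, Double> point : sensor1.entrySet()) {
            double value = point.getValue();
            if (value > threshold) {
                long time = point.getKey();
                rows.add(new Row(time, sensor0.get(time), value));
            }
        }
        return rows;
    }
}
```

The naive alternative would first collect the timestamps that satisfy the 
filter and then look up both `sensor_0` and `sensor_1` again at those 
timestamps, reading `sensor_1` twice.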


5. Others, such as removing Thrift serialization and changing the TsFile file 
format. Perhaps someone else can add details.


I suggest merging it into the master branch next week.


Experimental results show that the query tests on the kill_thanos branch have 
an approximately 30% to 60% performance improvement.


By the way, I am considering how to obtain a standard, convincing test data 
set (in the IoT domain) to test the write and query performance of IoTDB. 
Currently, we just use the data generated by `IoTDB Benchmark` (another 
project, also available at github.com/thulab/iotdb-benchmark), which generates 
100,000 rows of records for 100 devices * 100 sensors.


Thanks & Best Regards


-----------------------------------
Cao Gaofei
School of Software,
Tsinghua University
-----------------------------------
