Hi,

Interesting thoughts!

(1) A page-level index would help when a chunk has many pages. When a chunk has only a few pages, reading the whole chunk at once is probably fine. We could make this an option.

(2) The queried BatchData is never modified, and it is discarded after being returned to the client through RPC. We could use a pool for BatchData, just like the MemtablePool, to reuse BatchData objects.

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "DaweiLiu (Jira)" <[email protected]>
> Sent: 2020-02-21 23:48:00 (Friday)
> To: [email protected]
> Cc:
> Subject: [jira] [Created] (IOTDB-509) Optimize TsFileReader to reduce
> unnecessary GC and IO.
>
> DaweiLiu created IOTDB-509:
> ------------------------------
>
>     Summary: Optimize TsFileReader to reduce unnecessary GC and IO.
>     Key: IOTDB-509
>     URL: https://issues.apache.org/jira/browse/IOTDB-509
>     Project: Apache IoTDB
>     Issue Type: Wish
>     Components: Core/TsFile
>     Reporter: DaweiLiu
>
>
> I think there are still two parts of TsFile that can be optimized:
> # Reduce unnecessary IO. Reads are currently done at the chunk level. I
> think we can group the page index together. When the time range in the
> filter contains the chunk's time range, the whole chunk is read and
> returned directly. When the two only intersect, we can read the page index
> to decide which pages to read, which avoids reading unnecessary data.
> # Reduce GC. Returned data is based on the BatchData structure, and each
> batch is aligned with one page; that is, every call to next() creates a
> new BatchData. If a query goes through thousands of pages, say 10,000,
> we create 10,000 new BatchData objects. So I think we should decouple
> this from the page data. We do IO and deserialization/decoding from disk
> one page at a time, but what is handed over to the business logic should
> be a data structure that can be reused.
> It would be fixed-length, just like read(ByteBuffer) in the JDK.
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)
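
To make (1) a bit more concrete, here is a rough sketch of how a page index could be used to pick pages. Everything in it is an assumption for illustration: PageIndexSketch, PageEntry and pagesToRead are made-up names, not existing IoTDB classes, and I assume the index holds one time range per page, sorted by time, with closed query ranges.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are existing IoTDB APIs.
final class PageIndexSketch {

  /** Time range covered by one page, as a page index entry might record it. */
  static final class PageEntry {
    final long startTime;
    final long endTime;

    PageEntry(long startTime, long endTime) {
      this.startTime = startTime;
      this.endTime = endTime;
    }
  }

  /**
   * Returns the positions of the pages to read for the closed query range
   * [queryStart, queryEnd]. The index is assumed to be sorted by time.
   */
  static List<Integer> pagesToRead(List<PageEntry> index, long queryStart, long queryEnd) {
    List<Integer> selected = new ArrayList<>();
    if (index.isEmpty()) {
      return selected;
    }
    long chunkStart = index.get(0).startTime;
    long chunkEnd = index.get(index.size() - 1).endTime;
    // Case 1: the filter contains the whole chunk, so every page is needed.
    boolean containsChunk = queryStart <= chunkStart && chunkEnd <= queryEnd;
    for (int i = 0; i < index.size(); i++) {
      PageEntry p = index.get(i);
      // Case 2: the filter only intersects the chunk, so keep overlapping pages.
      boolean overlaps = p.startTime <= queryEnd && queryStart <= p.endTime;
      if (containsChunk || overlaps) {
        selected.add(i);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<PageEntry> index =
        List.of(new PageEntry(0, 9), new PageEntry(10, 19), new PageEntry(20, 29));
    System.out.println(pagesToRead(index, 12, 22)); // [1, 2]
    System.out.println(pagesToRead(index, 0, 100)); // [0, 1, 2]
  }
}
```

In the "contains" case a real reader would probably skip the index lookup and read the chunk in one IO, which is why the whole-chunk read could stay as the default for chunks with few pages.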

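For (2), a pool could look roughly like the sketch below. This is only an illustration under my own assumptions: Batch is a simplified fixed-capacity stand-in for BatchData, and BatchPool, acquire and release are hypothetical names, not the existing MemtablePool API.

```java
import java.util.ArrayDeque;

// Hypothetical sketch; Batch and BatchPool are illustrative, not IoTDB classes.
final class BatchPoolSketch {

  /** Fixed-capacity buffer standing in for BatchData. */
  static final class Batch {
    final long[] times;
    final double[] values;
    int size;

    Batch(int capacity) {
      times = new long[capacity];
      values = new double[capacity];
    }

    /** Appends one point; returns false when the batch is full. */
    boolean put(long time, double value) {
      if (size == times.length) {
        return false;
      }
      times[size] = time;
      values[size] = value;
      size++;
      return true;
    }

    /** Makes the batch look empty without freeing its arrays. */
    void reset() {
      size = 0;
    }
  }

  /** Bounded pool: hands out recycled batches, allocates only when none is free. */
  static final class BatchPool {
    private final ArrayDeque<Batch> free = new ArrayDeque<>();
    private final int batchCapacity;
    private final int maxPooled;
    private int allocated; // number of batches created with `new`

    BatchPool(int batchCapacity, int maxPooled) {
      this.batchCapacity = batchCapacity;
      this.maxPooled = maxPooled;
    }

    synchronized Batch acquire() {
      Batch b = free.pollFirst();
      if (b == null) {
        allocated++;
        b = new Batch(batchCapacity);
      }
      return b;
    }

    /** Call only after the batch has been fully written to the RPC response. */
    synchronized void release(Batch b) {
      b.reset();
      if (free.size() < maxPooled) {
        free.addFirst(b);
      }
    }

    synchronized int allocatedCount() {
      return allocated;
    }
  }

  public static void main(String[] args) {
    BatchPool pool = new BatchPool(1024, 4);
    for (int page = 0; page < 10_000; page++) {
      Batch b = pool.acquire();
      b.put(page, page * 0.5);
      pool.release(b);
    }
    System.out.println(pool.allocatedCount()); // 1
  }
}
```

The point that needs care is the release: a batch must not go back to the pool until the RPC layer has finished serializing it, otherwise a later page would overwrite data that is still being sent.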