Hi,

Maybe we should put the PageHeaders into the ChunkHeader. If we put them into 
the ChunkMetadata, we could no longer read a TsFile sequentially.
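
To illustrate, here is a minimal sketch in Python. It assumes a simplified, 
made-up chunk layout (a page count, then all page headers, then the page 
payloads), and every name in it is hypothetical; it is not the real TsFile 
format or API. With the headers at the front of the chunk, a reader walking 
the file in order sees them before the page data and can skip the pages whose 
time range does not overlap the query:

```python
import io
import struct

# Hypothetical layout, NOT the real TsFile format:
#   chunk  = page_count (uint32) | page_count x header | page payloads
#   header = min_time (int64) | max_time (int64) | payload_size (uint32)
COUNT = struct.Struct("<I")
HEADER = struct.Struct("<qqI")


def write_chunk(pages):
    """Serialize pages given as (min_time, max_time, payload_bytes) tuples."""
    out = io.BytesIO()
    out.write(COUNT.pack(len(pages)))
    # All page headers go first, acting as an extended chunk header.
    for min_t, max_t, payload in pages:
        out.write(HEADER.pack(min_t, max_t, len(payload)))
    # The page payloads follow in the same order.
    for _, _, payload in pages:
        out.write(payload)
    return out.getvalue()


def read_matching_pages(stream, start, end):
    """Return payloads of pages whose time range overlaps [start, end]."""
    (count,) = COUNT.unpack(stream.read(COUNT.size))
    headers = [HEADER.unpack(stream.read(HEADER.size)) for _ in range(count)]
    result = []
    for min_t, max_t, size in headers:
        if max_t < start or min_t > end:
            # No overlap: move past the payload without reading it.
            stream.seek(size, io.SEEK_CUR)
        else:
            result.append(stream.read(size))
    return result
```

For example, with three pages covering times 0-9, 10-19 and 20-29, a query 
for [12, 15] reads the headers once and then only the second page's payload.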

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Dawei Liu" <[email protected]>
> Sent: 2020-02-24 16:40:05 (Monday)
> To: [email protected]
> Cc: 
> Subject: [DISCUSS] Optimize TsFile structure to reduce unnecessary IO
> 
> Hi,
> 
> In the current TsFile structure, each PageHeader and its PageData are packed 
> together in a Chunk, one after another like a chain [1].
> 
> The basic unit read from disk each time is the Chunk. 
> For a query on device * sensor, it appears that we read too much data, 
> so we are considering a new optimization direction: 
> use the PageHeaders to filter the data first, so that we can be more precise 
> about which Pages need to be read.
> 
> But we still have a debate about where to put the PageHeaders:
> 
> 1. Put the PageHeaders into the ChunkMetaData. 
> The nice thing about this is that we can start filtering the data once the IO 
> is done.
> 
> 2. Put the PageHeaders into the ChunkHeader.
> Then we need one more read for the PageHeaders, but the advantage is that we 
> save more memory when we read the List of the device.
> 
> For details, please see [2]
> 
> What do you think?
> 
> 
> 
> Regards,
> ---
> Dawei Liu
> 
> 
> [1] https://user-images.githubusercontent.com/33376433/69341240-26012300-0ca4-11ea-91a1-d516810cad44.png
> [2] https://issues.apache.org/jira/secure/attachment/12994279/131582515824_.pic_hd.jpg
