[ 
https://issues.apache.org/jira/browse/IOTDB-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087425#comment-17087425
 ] 

Zesong Sun commented on IOTDB-605:
----------------------------------

I'm considering a tree-structured level index, as shown in the attachments.
* Each internal node is an array of triple-element items: 
[<device / measurement name, offset, next>, ... ]. These items represent the 
child nodes of the internal node. 
* At the end of the array, there is an "empty" item whose name is an empty 
string and whose offset equals the end offset of the child elements, so that 
the length of every element can be calculated easily.
* The field *next* is an enum value, which represents the type of the next 
child node. For example, *di* means the device index, *d* means the device, 
*mi* means the measurement index, and *m* means the measurement. Only *m* is 
a leaf node, which holds the TimeseriesMetadata.
* The maximum number of child nodes *N* can be configured by users. (In 
the examples, I set *N* = 10 for convenience.) 
* The index is built bottom-up. Whenever the number of blocks exceeds *N*, a 
parent-level index is generated and written to disk.
* Queries proceed top-down, using binary search within each array. 
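
To make the points above concrete, here is a rough in-memory sketch (in Python, only for brevity). It is not the TsFile implementation: the names Entry, child_span, find_child and build_levels are hypothetical, the offsets are made up, and the grouping rule in build_levels is just one possible reading of "a parent level is generated whenever the blocks > *N*".

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entry:
    name: str    # device or measurement name; "" marks the trailing "empty" item
    offset: int  # start offset of the child this item points to
    next: str    # child type: "di", "d", "mi" or "m" ("" for the empty item)


def child_span(node: List[Entry], i: int) -> Tuple[int, int]:
    """Return (offset, length) of child i; the length comes from the next item's offset."""
    return node[i].offset, node[i + 1].offset - node[i].offset


def find_child(node: List[Entry], key: str) -> int:
    """Binary-search for the last real item whose name is <= key; -1 if there is none."""
    names = [e.name for e in node[:-1]]  # exclude the trailing empty item
    return bisect_right(names, key) - 1


def build_levels(leaf_names: List[str], n: int) -> List[List[str]]:
    """Assumed bottom-up build: add a parent level while a level has more than n items."""
    if n < 2:
        raise ValueError("n must be at least 2, otherwise the levels never shrink")
    levels = [sorted(leaf_names)]
    while len(levels[-1]) > n:
        cur = levels[-1]
        levels.append([cur[i] for i in range(0, len(cur), n)])  # first name of each group
    return levels


# Hypothetical measurement-index node with made-up offsets.
node = [Entry("s1", 100, "m"), Entry("s2", 140, "m"), Entry("s3", 200, "m"), Entry("", 230, "")]
i = find_child(node, "s2")
print(i, child_span(node, i))  # prints: 1 (140, 60)

# 1 device with 150 measurements, N = 10 (level sizes under the assumed grouping rule)
print([len(level) for level in build_levels([f"s{k:03d}" for k in range(150)], 10)])
# prints: [150, 15, 2]
```

Note that find_child returns the covering range, which is what an index level (*di* / *mi*) needs; at the *m* level the caller would still have to check that the name matches exactly.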

I present some examples in the attachment: 5 devices with 5 measurements each; 
1 device with 150 measurements; 150 devices with 1 measurement each; and 150 
devices with 150 measurements each. 
Does anyone have any opinions about this idea? I'd be very pleased to modify or 
supplement this proposal if there are any problems : )

> Add more levels of index in TsFileMetadata
> ------------------------------------------
>
>                 Key: IOTDB-605
>                 URL: https://issues.apache.org/jira/browse/IOTDB-605
>             Project: Apache IoTDB
>          Issue Type: Improvement
>            Reporter: Zesong Sun
>            Priority: Major
>         Attachments: Examples.png, Structure of MetadataIndex levels.png
>
>
> Consider a scenario in which one device has 300k measurements: when we read 
> one time series in a TsFile, we need to deserialize 300k TimeseriesMetadata, 
> which costs about 250ms (just for reading the metadata of one TsFile). This 
> may make queries much slower.
> As this scenario is not rare, I think this should be optimized by adding more 
> index levels in TsFileMetadata.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
