Hi,

Thank you for your reply. I am very happy that you will take my suggestion.

Thanks

Dawei Liu

> On Feb 11, 2020, at 6:04 PM, Haonan Hou <[email protected]> wrote:
>
> Hi Dawei,
>
> Thank you so much for sharing your opinion about the new TsFile!
> I am very happy to take your suggestions.
>
> You said we can remove TsOffsetArray and directly store the offsets of
> TimeseriesMetaData. I agree with you. It is better than my version.
> Besides, for the optimization of TimeseriesMetaData, I would like to discuss
> with other people to determine which way is better.
>
> Best,
>
> Haonan Hou
>
>
>> On Feb 11, 2020, at 5:35 PM, atoiLiu <[email protected]> wrote:
>>
>> Hi,
>>
>> I'm learning the new TsFile in PR [1], but I think TsFileMetaData has a bad
>> design.
>>
>> TsFileMetaData has a TsOffsetArray. TsOffsetArray records every offset of
>> TimeseriesMetaData, and a Map<deviceId, int[]> records the startIndex and
>> endIndex into TsOffsetArray for each device. It looks like:
>>
>> TsFileMetaData ---> { [0,1,2,3,4,5, ...], [ {deviceId(d0), [0,2] },
>> {deviceId(d1), [3,5] }, ... ] }
>>
>> We can delete TsOffsetArray and store the offsets directly in the
>> deviceIndexArray, so that TsFileMetaData has a Map<deviceId, List<Long>>
>> to record them. This change will save 4 bytes per device on disk, because
>> every device only needs to record the number of offsets and the offsets
>> themselves. It looks like:
>>
>> TsFileMetaData ---> [ {deviceId(d0), [0,1,2] }, {deviceId(d1), [3,4,5] }, ... ]
>>
>>
>> In addition, TimeSeriesMetaData is an ordered structure on the hard disk,
>> and the TimeSeriesMetaData entries for each device are stored together, so
>> TsFileMetaData does not need to store all of the offset information. There
>> are two optimization directions:
>>
>> 1. Save startTime, endTime and offset for each TimeSeriesMetaData in
>> TsFileMetaData. The nice thing about this is that when you read
>> TsFileMetaData from the hard drive, you can directly apply a filter to
>> decide which TimeSeriesMetaData entries do not need to be read.
>>
>>
>> 2. Only save the offset of the first TimeSeriesMetaData of each device in
>> TsFileMetaData, so that you can loop through the entries with a single
>> seek. It looks like:
>>
>> TsFileMetaData ---> [ {deviceId(d0), 0 }, {deviceId(d1), 3 }, ... ]
>>
>>
>>
>> [1] https://github.com/apache/incubator-iotdb/pull/736
>>
>> Thanks
>>
>> Dawei Liu
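
To make the layouts discussed in the quoted thread easier to compare, here is a minimal Java sketch. It is only an illustration under stated assumptions, not the code in the PR: the class name TsFileIndexSketch and all method names are hypothetical, the index pairs are treated as inclusive (as in the [0,2] example), and the byte counts assume fixed-width 4-byte ints and 8-byte long offsets, ignoring the deviceId bytes since they are the same in both layouts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the IoTDB code base.
final class TsFileIndexSketch {

    // Current layout -> proposed layout: replace each device's
    // [startIndex, endIndex] pair (inclusive) with the offsets it points to.
    static Map<String, List<Long>> toDirectOffsets(long[] tsOffsetArray,
                                                   Map<String, int[]> deviceIndex) {
        Map<String, List<Long>> direct = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> entry : deviceIndex.entrySet()) {
            int start = entry.getValue()[0];
            int end = entry.getValue()[1];
            List<Long> offsets = new ArrayList<>();
            for (int i = start; i <= end; i++) {
                offsets.add(tsOffsetArray[i]);
            }
            direct.put(entry.getKey(), offsets);
        }
        return direct;
    }

    // Direction 2: keep only the first offset of each device.
    // Devices without any offsets are skipped.
    static Map<String, Long> toStartOffsets(Map<String, List<Long>> directOffsets) {
        Map<String, Long> starts = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> entry : directOffsets.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                starts.put(entry.getKey(), entry.getValue().get(0));
            }
        }
        return starts;
    }

    // Assumed encoding: every offset is an 8-byte long, and each device
    // stores two 4-byte ints (startIndex, endIndex).
    static long currentIndexBytes(int devices, int offsets) {
        return 8L * offsets + 2L * 4L * devices;
    }

    // Assumed encoding: every offset is an 8-byte long, and each device
    // stores one 4-byte int (the number of offsets).
    static long proposedIndexBytes(int devices, int offsets) {
        return 8L * offsets + 4L * devices;
    }

    public static void main(String[] args) {
        long[] tsOffsetArray = {0L, 1L, 2L, 3L, 4L, 5L};
        Map<String, int[]> deviceIndex = new LinkedHashMap<>();
        deviceIndex.put("d0", new int[] {0, 2});
        deviceIndex.put("d1", new int[] {3, 5});

        Map<String, List<Long>> direct = toDirectOffsets(tsOffsetArray, deviceIndex);
        System.out.println(direct);
        System.out.println(toStartOffsets(direct));
        System.out.println(currentIndexBytes(2, 6) - proposedIndexBytes(2, 6));
    }
}
```

With the two devices from the example, the last line prints the difference in index size between the two layouts, which under these assumptions is 4 bytes per device.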
