Hi,

+1 for version number :)

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Christofer Dutz" <[email protected]>
> Sent: 2020-10-12 14:38:34 (Monday)
> To: "[email protected]" <[email protected]>
> Cc:
> Subject: Re: Share some experiment results about Gorilla encoding algorithm
>
> Whatever you do: don't call anything "old" or "new".
>
> In two years the new "new" might be the new "old"... What happens then?...
> Append version numbers... That's sustainable...
>
> Chris
> ________________________________
> From: Jialin Qiao <[email protected]>
> Sent: Monday, 12 October 2020 08:32
> To: [email protected] <[email protected]>
> Subject: Re: Share some experiment results about Gorilla encoding algorithm
>
> Hi,
>
> Maintaining two versions of Gorilla encoding is OK.
>
> Could we change the default time encoding from TS2_DIFF to Gorilla and
> stay compatible?
>
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
> > -----Original Message-----
> > From: "Steve Su" <[email protected]>
> > Sent: 2020-10-11 23:52:55 (Sunday)
> > To: dev <[email protected]>
> > Cc:
> > Subject: Re: Share some experiment results about Gorilla encoding algorithm
> >
> > Hi,
> >
> > From my point of view, since the reimplementation of this algorithm does
> > not change the structure of TsFile, there is no need to upgrade the
> > version number of TsFile to 000003.
> >
> > I think we can rename the old Gorilla encoding to TSEncoding.OLD_GORILLA
> > in the code, provided that old TsFiles stay compatible, and then reserve
> > TSEncoding.GORILLA for the re-implemented version. This may minimize the
> > impact on users.
> >
> > What do you think? :)
> >
> > Steve Su
> >
> > ------------------ Original Message ------------------
> > From: "dev" <[email protected]>;
> > Sent: Saturday, 10 October 2020, 11:35 PM
> > To: "dev" <[email protected]>;
> > Subject: Re: Share some experiment results about Gorilla encoding algorithm
> >
> > Hi,
> >
> > Nice!
> >
> > One question: if we reimplement the Gorilla algorithm, how should we
> > handle version compatibility?
> >
> > 1. Upgrade the TsFile version to 000003, or
> > 2. Add a new encoding name for the corrected Gorilla.
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > On Sat, 10 Oct 2020 at 10:20 PM, Steve Su <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > Recently, we realized that the Gorilla encoding algorithm used inside
> > > IoTDB may have some issues, because it causes time series data (the
> > > value part) to take up more space after encoding. This is not in line
> > > with expectations. Usually, data takes up less space after Gorilla
> > > encoding.
> > >
> > > I found a very good open source Gorilla algorithm implementation by
> > > Michael on GitHub (see https://github.com/burmanm/gorilla-tsc). I
> > > compared the encoding / decoding time cost and the compression rate of
> > > Michael's version with those of the version used internally by IoTDB,
> > > and found that the version used inside IoTDB does have a lot of room
> > > for improvement.
> > >
> > > See
> > > https://cwiki.apache.org/confluence/display/IOTDB/Gorilla+encoding+algorithm
> > > for more experiment details.
> > >
> > > I think we can refer to Michael's implementation to re-implement the
> > > algorithm inside IoTDB to reduce the compression rate (fix potential
> > > errors) and improve performance. I have created a JIRA (see
> > > https://issues.apache.org/jira/browse/IOTDB-938) for this. If possible,
> > > I would be happy to re-implement the algorithm.
> > >
> > > Thanks,
> > > Steve Su
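
For reference, the naming options discussed in this thread (a second encoding name instead of a TsFile version bump, with version suffixes rather than "old"/"new") could be sketched roughly as below. This is only an illustration under assumptions: the enum name VersionedEncoding, the constants GORILLA_V1 and GORILLA_V2, and the byte codes are all hypothetical and are not taken from IoTDB's actual TSEncoding.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of versioned encoding names; not IoTDB's real TSEncoding enum. */
enum VersionedEncoding {
  // The byte codes below are invented for this example.
  TS_2DIFF((byte) 4),
  GORILLA_V1((byte) 5), // original implementation, kept so existing files stay readable
  GORILLA_V2((byte) 8); // re-implemented algorithm, gets its own code

  // Reverse lookup from the stored byte back to the enum constant.
  private static final Map<Byte, VersionedEncoding> BY_CODE = new HashMap<>();

  static {
    for (VersionedEncoding e : values()) {
      BY_CODE.put(e.code, e);
    }
  }

  private final byte code;

  VersionedEncoding(byte code) {
    this.code = code;
  }

  /** Returns the byte that would be written to the file for this encoding. */
  byte serialize() {
    return code;
  }

  /** Maps a stored byte back to its encoding, rejecting unknown codes. */
  static VersionedEncoding deserialize(byte code) {
    VersionedEncoding e = BY_CODE.get(code);
    if (e == null) {
      throw new IllegalArgumentException("Unknown encoding code: " + code);
    }
    return e;
  }
}
```

Because each implementation keeps its own stored code, a reader can pick the matching decoder per file without a file format version bump, and a later third implementation would simply become GORILLA_V3.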
