Hi,

+1 for version number :)
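
A minimal sketch of what version-suffixed names could look like. The enum, the constant names GORILLA_V1 / GORILLA_V2, and the byte codes are all hypothetical and are not taken from the IoTDB codebase. The idea is that the code stored in a file never changes its meaning, so old files keep decoding with the old implementation:

```java
public class VersionedEncodingSketch {

    // Hypothetical enum: names and byte codes are illustrative assumptions only.
    enum Encoding {
        GORILLA_V1((byte) 1), // original implementation, kept so existing files still decode
        GORILLA_V2((byte) 2); // re-implemented algorithm, used for newly written data

        private final byte code;

        Encoding(byte code) {
            this.code = code;
        }

        byte code() {
            return code;
        }

        // Map the byte stored in a file back to the encoding that wrote it.
        static Encoding fromCode(byte code) {
            for (Encoding e : values()) {
                if (e.code == code) {
                    return e;
                }
            }
            throw new IllegalArgumentException("Unknown encoding code: " + code);
        }
    }

    // New data always uses the latest version; adding a V3 later only changes this method.
    static Encoding defaultForNewData() {
        return Encoding.GORILLA_V2;
    }

    public static void main(String[] args) {
        System.out.println(Encoding.fromCode((byte) 1));
        System.out.println(defaultForNewData());
    }
}
```

With this shape, a later third implementation becomes GORILLA_V3 with a new code, and nothing ever has to be renamed to "old".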

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Christofer Dutz" <[email protected]>
> Sent: 2020-10-12 14:38:34 (Monday)
> To: "[email protected]" <[email protected]>
> Cc: 
> Subject: Re: Share some experiment results about Gorilla encoding algorithm
> 
> Whatever you do: don't call anything "old" or "new".
> 
> In two years the new "new" might be the new "old"... What happens then?
> Append version numbers... That's sustainable...
> 
> Chris
> ________________________________
> From: Jialin Qiao <[email protected]>
> Sent: Monday, 12 October 2020 08:32
> To: [email protected] <[email protected]>
> Subject: Re: Share some experiment results about Gorilla encoding algorithm
> 
> Hi,
> 
> Maintaining two versions of the Gorilla encoding is OK.
> 
> Could we change the default time encoding from TS2_DIFF to Gorilla and still 
> stay compatible?
> 
> Thanks,
> --
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
> 
> > -----Original Message-----
> > From: "Steve Su" <[email protected]>
> > Sent: 2020-10-11 23:52:55 (Sunday)
> > To: dev <[email protected]>
> > Cc:
> > Subject: Re: Share some experiment results about Gorilla encoding algorithm
> >
> > Hi,
> >
> > From my point of view, since the reimplementation of this algorithm does 
> > not change the structure of TsFile, there is no need to upgrade the version 
> > number of TsFile to 000003.
> >
> > I think we can rename the old Gorilla encoding to TSEncoding.OLD_GORILLA 
> > in the code, as long as the old TsFiles stay compatible, and then reserve 
> > TSEncoding.GORILLA for the re-implemented version. This may minimize the 
> > impact on users.
> >
> > What do you think? :)
> >
> > Steve Su
> >
> > ------------------ Original Message ------------------
> > From: "dev" <[email protected]>;
> > Sent: Saturday, October 10, 2020, 11:35 PM
> > To: "dev"<[email protected]>;
> > Subject: Re: Share some experiment results about Gorilla encoding algorithm
> >
> > Hi,
> >
> > Nice!
> >
> > One question: if we reimplement the Gorilla algorithm, how should we
> > handle version compatibility?
> >
> > 1. Upgrade the TsFile version to 000003, or
> > 2. Add a new encoding name for the corrected Gorilla.
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> >  黄向东
> > 清华大学 软件学院
> >
> >
> > On Sat, Oct 10, 2020 at 10:20 PM, Steve Su <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > Recently, we realized that the Gorilla encoding algorithm used inside
> > > IoTDB may have some issues: time series data (the value part) takes up
> > > more space after encoding. This is not what we expect, since Gorilla
> > > encoding usually makes the data smaller.
> > >
> > > I found a very good open source implementation of the Gorilla algorithm
> > > by Michael on GitHub (see https://github.com/burmanm/gorilla-tsc). I
> > > compared the encoding / decoding time cost and compression rate of
> > > Michael's version with the version used internally by IoTDB, and found
> > > that the version inside IoTDB has a lot of room for improvement.
> > >
> > > See
> > > https://cwiki.apache.org/confluence/display/IOTDB/Gorilla+encoding+algorithm
> > > for more experiment details.
> > >
> > > I think we can re-implement the algorithm inside IoTDB with reference to
> > > Michael's implementation, to reduce the encoded size (fixing potential
> > > errors) and improve performance. I have created a JIRA issue (see
> > > https://issues.apache.org/jira/browse/IOTDB-938) for this. If possible, I
> > > would be happy to re-implement the algorithm.
> > >
> > > Thanks,
> > > Steve Su
