Hi,

We recently noticed that the Gorilla encoding implementation used inside IoTDB may have a problem: time series data (the value part) takes up more space after encoding than before. This is contrary to expectations, since Gorilla encoding normally reduces the space the data occupies.
I found a very good open source implementation of the Gorilla algorithm by Michael on GitHub (see https://github.com/burmanm/gorilla-tsc). I compared Michael's implementation with the version used internally by IoTDB in terms of encoding/decoding time and compression rate, and found that the IoTDB version has a lot of room for improvement. See https://cwiki.apache.org/confluence/display/IOTDB/Gorilla+encoding+algorithm for the experiment details.

I think we can re-implement the algorithm inside IoTDB with reference to Michael's implementation, to improve the compression ratio (fixing potential errors) and improve performance. I have created a JIRA issue for this (see https://issues.apache.org/jira/browse/IOTDB-938). If possible, I would be happy to do the re-implementation.

Thanks,
Steve Su
