wangchao316 opened a new pull request #2621:
URL: https://github.com/apache/iotdb/pull/2621


   Current regular data encoding algorithm:
   
   Calculate the difference between each pair of adjacent values. The smallest
   difference is used as the regular interval (the "equal frequency").
   Determine the data range of this batch of data from the difference between
   the last value and the first value.
   Traverse this batch of data using a BitSet whose bits are true by default,
   and compare the difference between each pair of adjacent values with the
   regular interval. If a difference is not equal to the interval, calculate
   how many intervals it spans and set the bits at the corresponding positions
   to false, indicating that those points are missing.
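
   The steps above can be sketched roughly as follows. This is a minimal
   illustration, not the actual IoTDB encoder: the class and method names are
   hypothetical, and the exceptions thrown for a non-positive interval or a
   difference that is not a multiple of the interval are assumptions standing
   in for whatever the real code does.

```java
import java.util.BitSet;

// Hypothetical sketch of the described algorithm; not IoTDB code.
final class RegularEncodingSketch {

    // Smallest difference between adjacent values, used as the regular interval.
    static long minInterval(long[] values) {
        long min = Long.MAX_VALUE;
        for (int i = 1; i < values.length; i++) {
            min = Math.min(min, values[i] - values[i - 1]);
        }
        return min;
    }

    // Bit k is true when the point first + k * interval is present,
    // and false when it is missing.
    static BitSet presenceBitmap(long[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        long interval = minInterval(values);
        if (interval <= 0) {
            // Assumption: an out-of-order or duplicate value is rejected here.
            throw new IllegalArgumentException("non-positive interval: " + interval);
        }
        // Data range of the batch: last value minus first value.
        long range = values[values.length - 1] - values[0];
        int slots = (int) (range / interval) + 1;

        BitSet present = new BitSet(slots);
        present.set(0, slots); // every position is true by default

        int pos = 0;
        for (int i = 1; i < values.length; i++) {
            long diff = values[i] - values[i - 1];
            if (diff % interval != 0) {
                // Assumption: a value off the regular grid cannot be represented.
                throw new IllegalArgumentException("irregular difference: " + diff);
            }
            int steps = (int) (diff / interval);
            // The steps - 1 positions between the two values are missing points.
            present.clear(pos + 1, pos + steps);
            pos += steps;
        }
        return present;
    }

    public static void main(String[] args) {
        // 1200 is missing, so bit 2 is cleared.
        System.out.println(presenceBitmap(new long[] {1000, 1100, 1300, 1400}));
        // prints {0, 1, 3, 4}
    }
}
```

   The bitmap has one bit per position on the regular grid, so it can record
   that a point is absent but has no way to record a point whose value is
   wrong.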
   
   This algorithm can only identify missing points; if the data contains an
   erroneous point, it throws an exception.
   
   This is because a BitSet can do only one thing: indicate whether each
   equally spaced point exists in a segment of data.
   
   But there is room for optimization.
   
   If a column of values contains an abnormal value, the algorithm is thrown
   off, because the minimum difference is used directly.
   
   Example: 1000, 1100, 1800, 1400, 1500, ...
   
   The current algorithm cannot be used on this data.
   
   1800 is an erroneous point; we should identify the erroneous point and revise the data.
   
   The revised data should be: 1000, 1100, 1300, 1400, 1500
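
   To make the effect of the 1800 point concrete, the adjacent differences of
   the sample and of the revised data can be compared. This is only an
   illustrative sketch with hypothetical names; it does not show how the
   erroneous point is detected or revised.

```java
import java.util.Arrays;

// Hypothetical helper for inspecting the sample; not IoTDB code.
final class SampleDifferences {

    // Differences between each pair of adjacent values.
    static long[] adjacentDifferences(long[] values) {
        long[] diffs = new long[Math.max(0, values.length - 1)];
        for (int i = 1; i < values.length; i++) {
            diffs[i - 1] = values[i] - values[i - 1];
        }
        return diffs;
    }

    // Smallest element; throws NoSuchElementException on an empty array.
    static long min(long[] a) {
        return Arrays.stream(a).min().getAsLong();
    }

    public static void main(String[] args) {
        long[] original = {1000, 1100, 1800, 1400, 1500};
        long[] revised = {1000, 1100, 1300, 1400, 1500};

        long[] d1 = adjacentDifferences(original);
        System.out.println(Arrays.toString(d1) + " min=" + min(d1));
        // prints [100, 700, -400, 100] min=-400

        long[] d2 = adjacentDifferences(revised);
        System.out.println(Arrays.toString(d2) + " min=" + min(d2));
        // prints [100, 200, 100, 100] min=100
    }
}
```

   With 1800 in place the minimum difference is -400, which is not a usable
   interval. After the revision the minimum is 100, and the remaining 200 gap
   is an ordinary missing point (1200).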
   
   After discussion with Mr. Huang, the solution is:
   1. The value cannot be in regular encoding.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

