New question #180787 on Graphite:
https://answers.launchpad.net/graphite/+question/180787

Once the Ceres database is available, one key element that differentiates the 
most cutting-edge professional storage mechanisms is the aggregation and 
compression method.

Ceres opens the possibility of implementing more efficient algorithms for 
storing time series data. Fan interpolators or straight line interpolative 
methods store only the relevant data points in a time series, achieving 
compression ratios of around 10x versus the raw data while staying accurate 
to within a defined maximum deviation. This is particularly interesting for 
industrial process data or trends that typical aggregation methods like 
min/max/average render too coarse.
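
To make this concrete, here is a minimal sketch in Python of a swinging-door 
style compressor, one of the straight line interpolative methods compared in 
that literature. The function name, the (timestamp, value) layout and the 
max_dev parameter are illustrative assumptions only, not anything Ceres 
provides today:

def swinging_door(points, max_dev):
    """Sketch of a swinging-door style compressor.

    points  -- (timestamp, value) pairs, timestamps strictly increasing
    max_dev -- maximum deviation allowed between any dropped point and the
               straight line drawn between two archived points
    Returns the list of archived (timestamp, value) pairs.
    """
    points = list(points)
    if len(points) <= 2:
        return points

    archived = [points[0]]
    anchor_t, anchor_v = points[0]
    slope_up, slope_low = float('-inf'), float('inf')
    prev = points[0]

    for t, v in points[1:]:
        # Slopes of the two "doors" pivoting at anchor value +/- max_dev.
        slope_up = max(slope_up, (v - (anchor_v + max_dev)) / (t - anchor_t))
        slope_low = min(slope_low, (v - (anchor_v - max_dev)) / (t - anchor_t))

        if slope_up > slope_low:
            # The doors have crossed: no single straight line from the anchor
            # covers every point seen so far within max_dev, so archive the
            # previous point and restart the doors from it.
            archived.append(prev)
            anchor_t, anchor_v = prev
            slope_up = (v - (anchor_v + max_dev)) / (t - anchor_t)
            slope_low = (v - (anchor_v - max_dev)) / (t - anchor_t)
        prev = (t, v)

    archived.append(points[-1])  # always keep the final sample
    return archived

Only the anchor, the previous sample and two slopes need to be held in memory, 
which is what makes this kind of method cheap enough to run at ingest time.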

Sure, average aggregation has a very high compression ratio, but for more 
accuracy administrators typically add min and max aggregation. That means 
ending up with 3 data points + 1 time value in RRDTool, or 3 data points + 3 
time values in Whisper, for an approx. 20x compression ratio with a high 
accuracy loss. Whereas a little upfront processing can make 1 data point + 1 
time value much more accurate AND achieve a 20-35x compression ratio versus 
the raw data. All data points between two stored values can be reconstructed 
along the straight line between those two points, so the intermediate time 
and data values are handed off entirely to interpolation.
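
The read side is then just linear interpolation between the two archived 
points that bracket the requested timestamp. A quick self-contained sketch 
(the names and data layout are assumptions for this example only):

def interpolate(archived, t):
    """Reconstruct the value at time t by drawing a straight line between
    the two archived (timestamp, value) points that bracket it.
    Assumes archived is sorted by timestamp."""
    for (t0, v0), (t1, v1) in zip(archived, archived[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / float(t1 - t0)
    raise ValueError("t is outside the archived range")

# Three archived points standing in for many raw samples:
archived = [(0, 10.0), (60, 10.2), (300, 14.8)]
print(interpolate(archived, 30))    # 10.1
print(interpolate(archived, 180))   # 12.5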

Google for "Data Compression for Process Historians" by Peter A. James for a 
comparison of various algorithms.

This would be a great implementation mini-project for anyone with a little 
statistical background and a bit of Python knowledge!

