On Thu, Jan 03, 2013 at 11:14:18PM -0500, Francois Mikus wrote:
> Well... That is one key difference between rrd and whisper(graphite) 
> databases.
> 
> Whisper is one metric per file and one consolidation function. So you 
> have to choose, min, max, avg, median, etc.

This is an interesting concept. The only drawback I see is performance:
with rrd an update stays local, in memory (if the round robin db is mmap'ed)
or on disk, while carbon (whisper) spreads the data all over the disk.

> The obvious solution for a statistician is to use a better consolidation 
> function so that you do not need multiple ones.

I'm not very good at statistics, but if you are interested in both the mean
and the max values (e.g. network traffic or cpu load), can this be done with
one consolidation function?
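
To illustrate why I ask, here is a minimal sketch (plain Python, not whisper
or rrd code) that consolidates the same samples once with an average and once
with a maximum. The sample values and the helper names `consolidate` and `avg`
are made up for the example.

```python
def avg(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def consolidate(points, factor, func):
    """Collapse each run of `factor` consecutive points into one value."""
    return [func(points[i:i + factor]) for i in range(0, len(points), factor)]

# Hypothetical traffic samples (Mbit/s) containing one short burst.
samples = [10, 12, 11, 95, 10, 11, 12, 10]

avg_series = consolidate(samples, 4, avg)
max_series = consolidate(samples, 4, max)

print(avg_series)  # [32.0, 10.75]
print(max_series)  # [95, 12]
```

The burst of 95 is visible in the max series but shrinks to 32.0 in the
averaged one, and neither series can be recovered from the other.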

> This may be possible with whisper databases but I am not convinced. With 
> the Ceres database format which allocates dynamically database space, it 
> is possible to use more advanced statistical consolidation functions 
> based on changing slopes. I will post the link to the algorithm 
> description. This is something I aim to get a statistics student to work 
> on in the future as a masters project or as a summer student project.
> 
> In the meantime, you can compute the min/max/average/etc dynamically at 
> graph generation based on the unconsolidated time-series data. But once 
> consolidated, you obviously have to choose one method.

Keeping all historic data is not a good option - that's why rrds were
invented in the first place.

> Having N consolidation methods for a single data point is a little sad, as 
> it doubles the amount of disk space used, even considering there is only a 
> single timestamp (in rrd) and multiple timestamps in Graphite (.wsp).

The multiple timestamps worsen the memory/disk cluttering - hopefully
ceres will bring a better solution.

> The right place for determining multiple consolidation methods in 
> the current architecture is upstream from carbon-cache to keep it 
> consistent with how Graphite works. Changing it in carbon-cache would 
> take a lot more testing and risk breaking something, IMO.
> 
> So for now, do it upstream in carbon aggregator or in Shinken

Do you have an example of how to do this with carbon-aggregator?
The documentation deals mainly with aggregation in time, which we
neither need nor want here.
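
To be clear about what "upstream" would mean for us: this is not a
carbon-aggregator rule, only a rough Python sketch of the idea, where every
incoming sample is fanned out into one series per consolidation method before
it reaches carbon. The function name `fan_out`, the `.avg`/`.max` suffixes and
the metric path are hypothetical, and the sketch assumes the carbon side is
configured to choose the consolidation method by such a name suffix.

```python
def fan_out(path, value, timestamp, suffixes=("avg", "max")):
    """Return one plaintext 'path value timestamp' line per suffix."""
    return ["%s.%s %s %d" % (path, suffix, value, timestamp)
            for suffix in suffixes]

# Hypothetical sample; the timestamp is just an example epoch value.
lines = fan_out("net.eth0.rx", 42.5, 1357272858)
for line in lines:
    print(line)
```

Each raw sample then becomes two data points, which is exactly the extra
volume discussed in this thread.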
 
> module(shinken2rrd, or Shinken Graphite module). Which is not a big 
> problem. The Graphite module for Shinken can send data using pickled 
> batches for efficient communications and can deal with disconnections, 

As I already mentioned, we have other metric sources besides shinken here,
which cannot be changed so easily - last year changing carbon-cache was the
simple way ;-)

> it can easily be extended to do a little magic. Sending more data to 
> Graphite is not the issue, but it will mean more data points, 2x or even 
> 3x the number of data points to update and more disk size. This becomes 
> a problem when you start to exceed 300K basic data points.
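
For reference, a pickled batch as I understand the Graphite documentation: a
list of (path, (timestamp, value)) tuples, pickled and prefixed with a 4-byte
length header. The sketch below only builds the message and does not send it;
the helper name `pickle_batch` and the metric names are made up.

```python
import pickle
import struct

def pickle_batch(metrics):
    """Frame a list of (path, (timestamp, value)) tuples with a length header."""
    payload = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(payload))  # 4-byte big-endian length prefix
    return header + payload

# Hypothetical batch of two data points.
batch = [
    ("net.eth0.rx.avg", (1357272858, 42.5)),
    ("net.eth0.rx.max", (1357272858, 95.0)),
]
message = pickle_batch(batch)
```

The message would then be written to a TCP connection to carbon's pickle
receiver, so the sender has to handle connection state.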

A rather heavyweight, Python-specific, connection-oriented protocol is just
the opposite of the direction we want to go here. We prefer a minimalistic
udp packet, which can be generated with nearly every programming language
and even from the simplest device.
Having only one (max 1500 byte) udp packet per node for the common metrics
steals no resources at all from the computations the equipment here was
bought for.
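
Roughly what I have in mind, as a sketch only: all common metrics of a node
are packed as plaintext "path value timestamp" lines into a single datagram
that fits a 1500 byte MTU. The function names, metric paths, host and port
are assumptions for the example; it presumes a receiver that accepts such
lines over udp (e.g. carbon with its udp listener enabled on port 2003).

```python
import socket

# 1500-byte Ethernet MTU minus 20 (IPv4) and 8 (UDP) header bytes.
MAX_PAYLOAD = 1472

def build_datagrams(metrics, timestamp, max_payload=MAX_PAYLOAD):
    """Pack 'path value timestamp' lines into as few datagrams as possible."""
    datagrams, current = [], b""
    for path, value in metrics:
        line = ("%s %s %d\n" % (path, value, timestamp)).encode("ascii")
        # Start a new datagram if this line would overflow the current one.
        if current and len(current) + len(line) > max_payload:
            datagrams.append(current)
            current = b""
        current += line
    if current:
        datagrams.append(current)
    return datagrams

def send_metrics(metrics, timestamp, host="127.0.0.1", port=2003):
    """Fire and forget: no connection setup, no retransmission."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for datagram in build_datagrams(metrics, timestamp):
            sock.sendto(datagram, (host, port))
    finally:
        sock.close()

# Hypothetical metrics of one node; nothing is sent here.
node_metrics = [("node01.cpu.load", 0.75), ("node01.net.eth0.rx", 1234)]
packets = build_datagrams(node_metrics, 1357272858)
```

The same few lines are easy to reproduce in C or a shell script, which is
the point: no library and no connection handling on the sending side.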

Just my 2 cents. 
Many thanks for your comments and
 greetings
  Hermann

-- 
Netzwerkadministration/Zentrale Dienste, Interdiziplinaeres 
Zentrum fuer wissenschaftliches Rechnen der Universitaet Heidelberg
IWR; INF 368; 69120 Heidelberg; Tel: (06221)54-8236 Fax: -5224
Email: hermann.la...@iwr.uni-heidelberg.de

_______________________________________________
Shinken-devel mailing list
Shinken-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shinken-devel
