Hi Pierre, I can't answer your Collectd-specific questions, but I'm wondering why a block-based approach is needed. If Collectd outputs data every 10 seconds, for example, isn't the value written out every 10 seconds already aggregated in some way?
Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Sun, Nov 3, 2013 at 5:04 AM, Pierre-Yves Ritschard <[email protected]> wrote:
> Hi list,
>
> Right now in collectd we have read, write, notification and logging plugins
> which cover most of our use cases.
>
> I think the model falls short when implementing plugins like aggregation,
> chaining or threshold. It seems as though we are missing an intermediate
> endpoint to plug in metric manipulation when collection windows end.
>
> As some of you may know, I've been playing with a lib which implements
> generic metric manipulation, with a simple language (example syntax:
> https://gist.github.com/pyr/7070364)
>
> Now that the syntax is well implemented in a contained library, I'm looking
> for ways to implement it. I see two ways that "mangling" plugins might want
> to interact with collectd:
>
> - in a streaming fashion: processing metrics as they come in
> - in a block fashion: processing a full window of collected metrics
>
> Writing a streaming mangling plugin is an easy task. The "aggregation"
> plugin is such an example: it registers a read plugin, then marks the
> metrics it generates with an attribute to avoid looping. filter_chains also
> implement a similar mechanism, allowing simple streaming handling.
>
> Writing block handling plugins is much more difficult, as there doesn't
> seem to be any notion of a full metric window event. So writing such
> plugins now needs to be done in one of two ways:
>
> - accumulate metrics and trigger processing at regular intervals
> - accumulate metrics and trigger processing when enough events have been
>   input
>
> My current design expects a full window of metrics. It is a "pure" function
> which, for a specific window of metrics and configuration syntax, will
> output the same window of metrics augmented with a sink (a destination
> write plugin) and potentially a state.
>
> This approach has the drawback of forcing accumulation at some point, which
> might be a problem on aggregation instances but will be negligible on
> node-local instances (actually, given the in-memory size of metrics, it
> would take a very busy aggregation instance to make this noticeable or
> problematic).
>
> The simplest way of implementing this seems to be queuing up the metrics
> sent to the write plugin and scheduling processing when the read function
> is called (waiting for a small delay to leave time for other read plugins
> to submit their metrics).
>
> My current questions are:
>
> - are collectd users at large interested in an all-encompassing mangling
>   plugin (superseding the functionality found in the chains, thresholds and
>   aggregation plugins)?
> - would most people prefer a configuration that integrates into the main
>   collectd.conf? It seems a bit unwieldy to me but could be doable
> - is there a way I missed to accumulate metrics between poll intervals in a
>   sound way?
>
> Thanks for your help putting this together!
> - pyr
>
> _______________________________________________
> collectd mailing list
> [email protected]
> http://mailman.verplant.org/listinfo/collectd
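
For readers following the thread, the block approach described in the quoted message (accumulate metrics, then hand a full window to a "pure" function that returns the same window augmented with a sink) could look roughly like the sketch below. This is only an illustration under stated assumptions and does not use collectd's plugin API: `WindowAccumulator`, `tag_with_sink`, the dict-based metric shape, and the `"rrdtool"` sink value are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical metric shape: {"name": str, "value": float, "time": float}
Metric = Dict[str, object]


class WindowAccumulator:
    """Buffers incoming metrics and hands them over one full window at a time."""

    def __init__(self, interval: float,
                 process: Callable[[List[Metric]], List[Metric]]) -> None:
        self.interval = interval      # window length in seconds
        self.process = process        # pure function applied to a full window
        self.buffer: List[Metric] = []
        self.window_start: Optional[float] = None

    def write(self, metric: Metric) -> Optional[List[Metric]]:
        """Accept one metric; return the processed window if one just closed."""
        now = float(metric["time"])
        if self.window_start is None:
            self.window_start = now
        flushed = None
        if now - self.window_start >= self.interval:
            # The previous window is complete: process it, then start a new one.
            flushed = self.flush()
            self.window_start = now
        self.buffer.append(metric)
        return flushed

    def flush(self) -> List[Metric]:
        """Process and clear whatever is currently buffered."""
        window, self.buffer = self.buffer, []
        return self.process(window)


def tag_with_sink(window: List[Metric]) -> List[Metric]:
    """Pure function: returns copies of the metrics, each augmented with a sink."""
    return [dict(m, sink="rrdtool") for m in window]
```

With a 10-second interval, metrics stamped at 0 and 5 seconds are buffered, and the metric stamped at 10 seconds closes that window and starts the next one. The sketch covers only the first trigger from the quoted list (regular intervals); a count-based trigger would compare `len(self.buffer)` against a threshold instead.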
