I have been working recently on a branch of ganglia/monitor-core that
allows gmond to send metrics directly to an InfluxDB database.  It can be
found here:

  https://github.com/hawson/monitor-core/tree/influxdb


This is purely a change to the gmond agent.  Other programs (e.g. gmetad,
gmetric) and components (the web UI) are not changed.  However, a logical
next phase could be to rework the web UI and gmetad to use InfluxDB as a
backend.


I've tried to keep the changes as isolated as possible, creating a new
lib/influxdb.c file for the new functionality, and hooking into gmond as
part of the existing Ganglia_collection_group_send() function. Thus, when a
packet would normally be sent to another gmond, it can also be sent to an
InfluxDB channel at the same time.  There are, of course, various other
changes sprinkled about to other files, mostly to add new gmond.conf
options.

The gmond.conf documentation and the default configuration file (from
'gmond -t') have also been updated to cover the two new configuration options.

The first new option is an influxdb_send_channel stanza.  It is fairly
simple, with three options.

  influxdb_send_channel {
    host     = myinfluxdb.example.com
    port     = 8089
    // optional tags sent with each metric
    default_tags = zone=us-east,host_class=hpc
  }

The "host" and "port" attributes are required, and their purpose should be
obvious.  The "default_tags" attribute is optional.  InfluxDB permits tags
to be associated with each time/key/value tuple; this is how hostnames are
stored, for example.  This attribute allows default tags to be associated
with every metric sent, for example to identify an HPC cluster, an AWS
zone, or some other useful bit of metadata.
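To make the role of tags concrete, here is a rough Python sketch (not code
from the branch) of how a default_tags string could be folded into a line in
InfluxDB's line protocol, which has the general shape
"measurement,tag=value field=value timestamp".  The function names, the
"host" tag key, and the use of the metric name as the field key are
assumptions made for illustration; the actual C code in lib/influxdb.c may
format things differently.

```python
def parse_default_tags(raw):
    """Split a 'k1=v1,k2=v2' string into an ordered list of (key, value) pairs."""
    tags = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and trailing commas
        key, _, value = item.partition("=")
        tags.append((key.strip(), value.strip()))
    return tags


def format_line(measurement, field, value, host, default_tags="", timestamp_ns=None):
    """Build one line-protocol line (illustrative only).

    Assumptions: the hostname is stored under a tag named "host", and the
    metric name is used as the field key.  No escaping of spaces, commas,
    or equals signs is attempted here.
    """
    tags = [("host", host)] + parse_default_tags(default_tags)
    tag_part = "".join(",{}={}".format(k, v) for k, v in tags)
    line = "{}{} {}={}".format(measurement, tag_part, field, value)
    if timestamp_ns is not None:
        line += " {}".format(timestamp_ns)
    return line


if __name__ == "__main__":
    print(format_line("load", "load_one", 0.42, "node01",
                      default_tags="zone=us-east,host_class=hpc"))
```

Every point written this way carries the zone and host_class tags, so
queries can filter or group on them without any per-metric configuration.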

The other change to gmond.conf is also optional, but strongly recommended.
Every collection_group stanza may now have an optional "measurement"
attribute.  An example for some of the system load metrics:

  collection_group {
    collect_every = 20
    time_threshold = 90
    measurement = "load"  // <<<<<<<<<<<<<<<  new attribute
    metric {
      name = "load_one"
      title = "One Minute Load Average"
    }
    metric {
      name = "load_five"
      title = "Five Minute Load Average"
    }
    [...]
  }

This attribute helps InfluxDB organize metrics into groups called
"measurements".  Measurements are similar in function to an SQL table
(InfluxDB is not an SQL database, and the analogy is not perfect).  Since
most metrics in a collection group tend to be similar (all CPU stats are
collected at the same time, network stats at another, etc.), adding this
attribute at the collection_group level seems to make the most sense.  If a
collection_group does not have a measurement attribute, the metric name
(e.g. "load_one") is used instead; this is not recommended.  You may use the
same measurement name in different collection_group stanzas.  This is
appropriate if there are similar metrics with different collection and
send intervals.  The new example gmond.conf file has a few examples of this.
Note that adding this support did require some minor reorganization of the
default gmond.conf file.
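The fallback rule and the reuse of one measurement name across several
stanzas can be illustrated with a small Python sketch.  This is only a model
of the behaviour described above, not the gmond implementation; the
dictionaries standing in for collection_group stanzas, the interval values,
and the helper names are all hypothetical.

```python
def measurement_for(group, metric_name):
    """Return the group's measurement if set, otherwise fall back to the metric name."""
    return group.get("measurement") or metric_name


def metrics_by_measurement(groups):
    """Map each resulting measurement name to the metrics that land in it."""
    result = {}
    for group in groups:
        for name in group["metrics"]:
            result.setdefault(measurement_for(group, name), []).append(name)
    return result


# Hypothetical stand-ins for three collection_group stanzas.
GROUPS = [
    {"measurement": "load", "collect_every": 20, "metrics": ["load_one", "load_five"]},
    # Same measurement, different collection interval.
    {"measurement": "load", "collect_every": 80, "metrics": ["load_fifteen"]},
    # No measurement attribute: the metric name is used instead.
    {"collect_every": 40, "metrics": ["proc_total"]},
]

if __name__ == "__main__":
    for measurement, metrics in metrics_by_measurement(GROUPS).items():
        print(measurement, metrics)
```

In this model the two "load" stanzas feed a single measurement, while the
stanza without the attribute produces a one-metric measurement named after
the metric itself, which is the situation the recommendation above is meant
to avoid.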

I know of several improvements that could be made, but believe that the code
is fit for general review.

Comments, questions and corrections are all welcome.  The GitHub fork has
the "Issues" feature enabled, and things can be posted there.

-- 
Jesse Becker