Collectd is great, IMHO. I was using collectd+graphite to gather and display stats for a large collection of VMs, servers, routers, and switches. Collectd itself was pretty low overhead, easy to configure (I managed configs via puppet) and Just Worked.
Graphite and carbon-cache were a little trickier to set up. Carbon by default aggregates (averages) older data, so if retention is not set up correctly, you get unexpected results when you go back a few months and try to drill into a graph at a 5-minute interval.

I’d highly recommend looking at Graphite as well. Once you get used to it, being able to apply functions[1] to aggregate, manipulate, and quickly find patterns in data is super useful (e.g., look at all interfaces on this switch, but only display graphs for the top 5 with abnormal traffic). Jason Dixon has written some great blog posts about its use on obfuscurity.com.

John

1: https://graphite.readthedocs.org/en/latest/functions.html

> On Mar 16, 2016, at 11:45 AM, Eric Kuhnke <eric.kuh...@gmail.com> wrote:
>
> Would anyone care to share their experience using collectd as an
> alternative to rtg for high-resolution polling of interface traffic and
> long term storage?
>
> I am investigating the various options for large data set size, lossless
> long term traffic charting (not RRAs which lose precision over time). One
> possible use is precision 95th billing.
>
> https://collectd.org/
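
To make the aggregation caveat concrete for the 95th-percentile use case, here is a minimal Python sketch. It is not Graphite or collectd code: the traffic values are invented, and the helper names `rollup_average` and `percentile_95` are hypothetical. It only shows how averaging 5-minute samples into 1-hour points can change a 95th-percentile figure.

```python
import math


def rollup_average(samples, factor):
    """Average each consecutive group of `factor` samples (an averaging rollup)."""
    groups = [samples[i:i + factor] for i in range(0, len(samples), factor)]
    return [sum(group) / len(group) for group in groups]


def percentile_95(samples):
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Invented data: one day of 5-minute samples in Mbit/s (288 points),
# idling at 100 with a 10-minute burst to 900 at the top of every hour.
raw = [900 if i % 12 < 2 else 100 for i in range(288)]

# Twelve 5-minute samples averaged into each 1-hour point.
hourly = rollup_average(raw, 12)

print("95th percentile, 5-minute data:", percentile_95(raw))
print("95th percentile, 1-hour averages:", percentile_95(hourly))
```

With this made-up traffic pattern the bursts dominate the percentile computed on full-resolution data, but largely vanish once each hour is averaged into a single point. In Graphite, how long each resolution is kept and how older points are combined are set in carbon's `storage-schemas.conf` and `storage-aggregation.conf`, so those are the files to check before relying on old data for billing.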