Hello,

We make heavy use of munin, which is very easy to deploy and scales very well. It is very comparable to Ganglia/CACTI. It is already packaged, its deployment is very easy to automate, it has a large number of plugins, it uses RRD, and it has a very simple, no-fuss, sysadmin-like interface (http://munin-monitoring.org/). It supports various plugins for virtualization/contextualization (KVM, OpenVZ, etc.).
The virtualization plugins are very useful: they provide disk I/O, network, CPU and RAM per VM, which is a great help for identifying bottlenecks.

I have used ganglia a lot on numerous HPC clusters. I ran into some problems/issues, as mentioned, with DNS, NAT and this kind of funny real-life server stuff. I'm not sure ganglia supports any virtualization technology, so it is not so helpful for optimizing VM operation and finding bottlenecks caused by scarce shared resources.

Collectd is great because, compared to other tools, it has a smaller footprint and, as a consequence, a time resolution that is way better than comparable tools (10s by default for most of the C plugins, compared to 1-minute/5-minute resolution for others!). This is really a great help for capacity planning and bottleneck analysis: you can have very detailed time series of relevant data.

In terms of virtualization and contextualization, collectd is great: it supports interesting plugins like OpenVZ and Vserver, and it supports libvirt (I know, not necessarily the panacea) in order to gather Xen/Qemu/KVM statistics. It means that you only need to deploy collectd on your hosts and you will be able to graph the basic vitals of all your VMs. Simplify, simplify, simplify. It gives you I/O, network, CPU ... per VM. Very useful tool... It only works for selected data, however, and for more advanced plugins you still need to deploy collectd on every instance.

This being said, I would not recommend ganglia because, AFAIK, it does not support contextualization/virtualization.

Munin is very "sysadmin-like": it produces static HTML pages, needs only a very simple "presentation server", and provides great data with very little effort and minimal security risk. It is a somewhat Web 1.0 interface, but very usable and OK for most sysadmins ;-)

Collectd now has a more evolved interface (including iPhone support, I think!) using jQuery and other fancy Web 2.0 technology (http://kenny.belitzky.com/projects/collectd-web).
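
To put the resolution difference mentioned above (10s versus 1 minute or 5 minutes) in rough numbers, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and the 8 bytes per stored value is only an assumption (one double-precision number per sample, no headers, no consolidation); it is not how any particular tool is guaranteed to lay out its data.

```python
# Rough comparison of polling intervals: how many samples one metric
# produces, and how much raw space they would take if kept unconsolidated.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_VALUE = 8  # assumption: one double-precision value per sample


def samples_per_day(interval_s: int) -> int:
    """Number of data points a single metric produces per day."""
    return SECONDS_PER_DAY // interval_s


def raw_bytes_per_year(interval_s: int, metrics: int = 1) -> int:
    """Raw size of one year of samples, ignoring headers and compression."""
    return samples_per_day(interval_s) * 365 * BYTES_PER_VALUE * metrics


if __name__ == "__main__":
    for label, interval in (("10 s", 10), ("1 min", 60), ("5 min", 300)):
        megabytes = raw_bytes_per_year(interval) / 1e6
        print(f"{label:>5}: {samples_per_day(interval):>5} samples/day, "
              f"about {megabytes:.1f} MB/year per metric")
```

At 10s you get 8640 points per metric per day against 288 at 5 minutes, so a factor of 30 in detail and, if nothing is averaged away, roughly the same factor in storage.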
Collectd also has the advantage of providing more detailed statistical data than munin. You can also store the statistical data in something other than RRD and, as a consequence, keep the time precision intact (for instance, every 10s), at a cost (storage).

Ben

On Thu, Jun 16, 2011 at 11:02, Nicolas Barcet <[email protected]> wrote:
> I think it would be good to have the server community's opinion on what
> should be our preferred performance statistics aggregation solution in
> Ubuntu. The 2 main contenders would be ganglia [1] and collectd [2],
> but something even better might be out there that I do not know about.
>
> [1] http://ganglia.sourceforge.net/
> [2] http://collectd.org/
>
> Thoughts?
> Nick
>
> --
> ubuntu-server mailing list
> [email protected]
> https://lists.ubuntu.com/mailman/listinfo/ubuntu-server
> More info: https://wiki.ubuntu.com/ServerTeam

--
Benoit des Ligneris, Ph. D., CEO
http://www.revolutionlinux.com/
Blog : Open Source catalyst
http://openceo.blogspot.com/
Large Scale Thin Client - Open Source VDI
http://ltsp-cluster.org/
