> Is there interest in formalizing a hierarchical naming convention for
> metrics in Ganglia?

I agree that Ganglia's existing methods are very simplistic. On the positive side, they are very easy to understand, and they are both sufficient and effective for simple situations.

On the other hand, there are various issues:

a) multiple instances: net devices, filesystems, CPUs (maybe you've seen my recent release of ganglia-modules-solaris with per-core CPU support and per-disk IO stats?)
b) dynamic names: do you really want to see `net_bytes_out_eth1' if eth1 is a USB device and tomorrow it might appear as eth2 or eth3? Or does Ganglia need to have some mapping functionality, so the name would appear as `net_bytes_out_wan' no matter what physical device name was used? The same issue applies to filesystems.
c) use of an existing hierarchy: could we borrow from SNMP and use the OID, for example? Maybe a future version of Ganglia could just be a multicast transport for SNMP, and the gmetad would just poll the normal SNMP daemon to get the mappings of OID->real device names.
d) adding or removing devices (e.g. USB net or storage, virtual devices on a VM, provisioning a SAN filesystem over fibre channel) while Ganglia is running: at a very simplistic level, gmond could just restart itself when it notices a change, but if a system is very dynamic, it could appear that the daemon is flapping.
e) application-specific monitoring: e.g. you run two UAT environments, a demo environment and a production environment. Each application instance is a JVM. You move the UAT environments around between different servers, but you want to keep all the history from each JVM and associate it with the name of the environment rather than the name of the server.
f) excluding some things from aggregation: in the per-core CPU monitoring, it doesn't mean anything to look at an aggregation of core no. 3 from each of your 10 hosts, especially if 4 of the hosts only have 2 cores.
g) common solution with Nagios and other technologies: it may also be desirable to have some naming convention (with meta-data support) that can be shared, for example something that could be used by Nagios, preferably with enough meta-data to allow auto-configuration of things that should be monitored.

My feeling is that all these types of issues should go on a roadmap for Ganglia 4 or beyond. It is probably not possible to address them all in one go, but if they are factored into the next iteration of the protocol, then they can be added incrementally.
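
To make points (b) and (f) a little more concrete, here is a rough Python sketch of what a mapped, hierarchical metric name might look like. None of this exists in Ganglia today: the `MetricName` class, the `DEVICE_ALIASES` table, the `aggregate` flag and the helper functions are all hypothetical names invented for illustration, and a real design would have to fit the gmond protocol and configuration.

```python
from dataclasses import dataclass

# Hypothetical mapping from physical device names to stable logical roles.
# Assumed to come from site configuration; this is not a Ganglia feature.
DEVICE_ALIASES = {"eth1": "wan"}


@dataclass(frozen=True)
class MetricName:
    group: str              # e.g. "net" or "cpu"
    metric: str             # e.g. "bytes_out"
    instance: str           # logical instance, e.g. "wan" or "core3"
    aggregate: bool = True  # False where cross-host aggregation is meaningless

    def flat(self) -> str:
        # Flat name in the style used in the examples above.
        return f"{self.group}_{self.metric}_{self.instance}"

    def dotted(self) -> str:
        # One possible hierarchical rendering: group.instance.metric
        return f"{self.group}.{self.instance}.{self.metric}"


def name_for_device(group, metric, device, aliases=DEVICE_ALIASES):
    """Build a metric name, replacing the physical device with its alias."""
    instance = aliases.get(device, device)  # fall back to the physical name
    return MetricName(group, metric, instance)


def aggregatable(names):
    """Keep only the metrics that make sense to aggregate across hosts."""
    return [n for n in names if n.aggregate]
```

With this sketch, `name_for_device("net", "bytes_out", "eth1").flat()` gives `net_bytes_out_wan`, and the same name results if the USB device shows up as eth3 tomorrow and only the alias table is updated. A per-core metric such as `MetricName("cpu", "user", "core3", aggregate=False)` would be dropped by `aggregatable()`, so the history is kept per host but it never appears in a cluster-wide summary.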