On Fri, Aug 08, 2008 at 09:30:23AM +0100, [EMAIL PROTECTED] wrote:
> From: Carlo Marcelo Arenas Belon [mailto:[EMAIL PROTECTED]
> > [EMAIL PROTECTED] wrote:
> > >
> > > I've been looking at how we currently deploy Ganglia configuration
> > > files in our organisation, and whether the process can be improved.
> >
> > gmond by design is able to work without a configuration for
> > exactly this reason.

and I sadly found that with the current default configuration in 3.1.0,
if you also use modular metrics, we broke that design: gmond will
segfault when it tries to include other configurations and has no
backing file.

that is a bug (this one is from libconfuse though) and should be fixed.

> I've been giving this some more thought. There are a couple of issues
> that come to mind:
>
> - The default configuration might not always be appropriate - would
> people welcome a patch that disables this behaviour (through a configure
> option, such as --without-default-configuration)

no. if you don't want to use the default configuration, all you have to
do is deploy your own.

> - Tools like Puppet are nice solutions, but in a really large
> organisation with multiple business divisions and different generations
> of UNIX and Windows, deploying Puppet universally can't be assumed.

Puppet doesn't run on Windows. AD policies can be used, though, to
distribute files if all you are interested in is distributing a
configuration file to all Windows servers and they are all in a domain.

> For
> a new cluster, it is not so difficult - but for an organisation wanting
> Ganglia on existing infrastructure, it becomes quite a challenge.

Configuration management is always a challenge, but luckily for us
ganglia doesn't have to do any of that, because it is a cluster
monitoring tool that can run without a configuration.

> Therefore, it may be necessary to have some functionality for
> discovering the configuration built in to gmond

you are missing half the problem here, as there also has to be a way to
trigger gmond to reload its configuration (right now a restart is
required), and that means there also has to be a secure way to instruct
remote gmonds to restart themselves (ssh, puppet or cfengine are used
for this).

> - Looking at the format of gmond.conf, it seems that it could be
> partially or fully obtained from an LDAP server. Parameters could be
> arranged in some kind of hierarchy, e.g.
> some parameters may be
> specified once for the entire cluster, whereas others may be over-ridden
> on a per-host basis.

agree, but LDAP managed configurations get ugly very fast and introduce
a single point of failure (yes, I still have the scars from managing
Netscape's Mail and LDAP infrastructures, which for some reason thought
that storing LDAP's own configuration in LDAP was a good idea).

a web service that generates a configuration on demand is usually easier
to scale and maintain.

> Whether the configuration server uses LDAP or something else, how should
> it be found? Here are some ideas I had:
> - a configuration option for hard-wiring the configuration server
> hostname

then you have to handle two different configuration formats in the
application for the configuration file (or hopefully two configuration
files) and you are back to square one with "configuration management"
issues.

> - a command line parameter

then you have to hardcode it into your startup scripts, which will most
likely pull it from some configuration file, and you are back to square
one with "configuration management" issues.

> - a custom DHCP option
> - DNS discovery, looking for a server `ganglia-config.mydomain.com',
> similar to the HTTP proxy discover mechanism
> - a stub configuration file, with enough information for gmond to find
> the central configuration server

- an IRC server all nodes connect to for control and command (jabber as
  well)
- a news server all nodes are subscribed to (NNTP, RSS or ATOM)
- a multicast based protocol for notification to the nodes (better if
  secured)
- any way to propagate information to nodes which no central entity
  knows of by design and which will therefore need to delegate
  recursively.
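
To make the DNS discovery idea in the list above concrete, here is a minimal sketch in Python. It is not anything gmond does today. It assumes the lookup walks up the node's parent domains, the way HTTP proxy auto-discovery does; the function names and the injectable resolver are hypothetical, and only the `ganglia-config` prefix comes from the proposal.

```python
import socket
from typing import Callable, Iterator, Optional, Tuple


def candidate_hosts(fqdn: str, prefix: str = "ganglia-config") -> Iterator[str]:
    """Yield prefix.<domain> for each parent domain of fqdn, most specific first."""
    labels = fqdn.rstrip(".").split(".")
    # Skip the host label itself and stop before the bare top-level domain.
    for i in range(1, len(labels) - 1):
        yield prefix + "." + ".".join(labels[i:])


def system_resolver(name: str) -> Optional[str]:
    """Return the first address the system resolver gives for name, or None."""
    try:
        return socket.getaddrinfo(name, None)[0][4][0]
    except socket.gaierror:
        return None


def discover(
    fqdn: str,
    resolve: Callable[[str], Optional[str]] = system_resolver,
) -> Optional[Tuple[str, str]]:
    """Return (hostname, address) of the first candidate that resolves."""
    for name in candidate_hosts(fqdn):
        addr = resolve(name)
        if addr is not None:
            return name, addr
    return None
```

Calling `discover(socket.getfqdn())` on a node would try each parent domain in turn. Note that this only locates a server: it does nothing about triggering a reload, which is the other half of the problem.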

for any one of those it will be better if you cache the configuration
file locally anyway, to avoid single points of failure and scalability
issues, and as I said before you are only halfway done, as you also have
to manage all the nodes (or at least the gmond on those nodes) from some
central location for most of the proposed solutions to work, and that
will replicate some of the features that configuration management
solutions provide.

Carlo

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
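
The local caching suggested in the message above could look roughly like the following Python sketch. It assumes the configuration is served over HTTP by some central server; the URL, the function names and the cache location are hypothetical and not part of Ganglia.

```python
import os
import tempfile
import urllib.request
from typing import Callable


def http_fetch(url: str, timeout: float = 5.0) -> bytes:
    """Download the configuration from the (hypothetical) central server."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def load_config(
    url: str,
    cache_path: str,
    fetch: Callable[[str], bytes] = http_fetch,
) -> bytes:
    """Return the central configuration, falling back to the local cache."""
    try:
        data = fetch(url)
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        # If no cached copy exists either, FileNotFoundError propagates.
        with open(cache_path, "rb") as cached:
            return cached.read()
    # Write to a temporary file and rename it, so a crash in the middle
    # of the update never leaves a truncated cache behind.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, cache_path)
    return data
```

With this, a node keeps starting from its last known configuration while the central server is unreachable. gmond would still need a restart to pick up a changed file.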