On Fri, Jan 7, 2011 at 15:25, Bernard Li <[email protected]> wrote:
> Hi all:
>
> Since the release of Ganglia 3.1, we have introduced the new
> configuration option send_metadata_interval in gmond.conf.  This is
> set to 0 by default and the user must set this to a sane number if
> using unicast otherwise if gmonds are restarted, hosts may appear to
> be offline (this is documented in the release notes).  A bug has
> already been filed:
>
> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=242
>
> We recently have a lot of users having this issue and Vladimir
> recommend that we just set a sane number as the default and be done
> with it, since we end up spending a lot of time on IRC/mailing-list to
> solve the same problem over and over again.
>
> Since there have been some commits to the 3.1 branch since tagging
> 3.1.7, I propose we just copy 3.1.7 tag, update the send_meta_data
> interval in the configuration file and release that as 3.1.8.
>
> This is not the normal procedure for making a release, so I'd like to
> get some feedback from other developers.
>
> BTW I am thinking of setting send_metadata_interval to 30 seconds.
> Also, does anybody know if this setting affects multicast setups in
> any way?

I think that it's fine to set this to a non-zero value, but I wonder if
30 seconds is too frequent.

I did a quick check of the actual packets that are sent, specifically
the metadata packets.  I haven't been able to really delve into the code
to figure out exactly what's going on (this part of the code isn't
terribly transparent to me), but I *think* that they are really large:
on the order of several KB when fully assembled, as compared to roughly
100-120 bytes or less for a typical metric packet.  The reason for the
large size is that an entire XML description of the metrics appears to
be sent each time, so I'd expect the size to grow with the number of
metrics stored.  Metadata packets also appear to go over TCP, not UDP.

My testing was pretty simple:

1) Set up a gmond (from SVN, well after 3.1 came out) in unicast mode.
2) Set 'send_metadata_interval' to 1.
3) Disable all modules except 'mod_core'.
4) Remove all collection groups.
5) Start gmond and run tcpdump.

On a large cluster, with lots of metrics per host, I can see problems if
the metadata packets are sent too frequently.  I have hosts that send
well over 300 metrics (lots of CPU cores makes for lots of metrics...).
Each of these needs to be described in the metadata packets.

So I think that setting a non-zero default is fine, but I think that
something like 300 or 600 seconds would be preferable.

-- 
Jesse Becker

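To put rough numbers on the bandwidth concern above, here is a
back-of-the-envelope sketch in Python.  It is only arithmetic, not a
measurement: the host count (1000) and the per-metric description size
(200 bytes) are assumptions picked for illustration, and the function
name is made up for this example.  Only the figure of 300 metrics per
host comes from the message above.

```python
def metadata_bytes_per_second(hosts, metadata_bytes_per_host, interval_secs):
    """Aggregate metadata traffic (bytes/sec) arriving at a unicast collector.

    interval_secs mirrors send_metadata_interval; 0 is treated here as
    "no periodic resend", matching the default described in the quoted message.
    """
    if interval_secs < 0:
        raise ValueError("interval_secs must be >= 0")
    if interval_secs == 0:
        return 0.0
    return hosts * metadata_bytes_per_host / interval_secs


# Assumed figures, for illustration only (NOT measurements):
HOSTS = 1000                   # hypothetical cluster size
METRICS_PER_HOST = 300         # "well over 300 metrics" per host, from the message
BYTES_PER_METRIC_DESC = 200    # hypothetical size of one metric's description

per_host = METRICS_PER_HOST * BYTES_PER_METRIC_DESC

# Compare the proposed 30 s default with the suggested 300 s and 600 s.
for interval in (30, 300, 600):
    rate = metadata_bytes_per_second(HOSTS, per_host, interval)
    print(f"{interval:>4} s interval -> {rate / 1024:.1f} KiB/s of metadata")
```

Whatever the real per-host size turns out to be, the traffic scales
inversely with the interval, so going from 30 to 300 seconds cuts it by
a factor of ten.
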
_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
