>>> On 11/10/2008 at 6:11 PM, in message <[EMAIL PROTECTED]>,
Ofer Inbar <[EMAIL PROTECTED]> wrote:
> Brad Nicholes <[EMAIL PROTECTED]> wrote:
>> The reason why is because with the introduction of the modular
>> metric functionality, metric metadata is now passed between gmonds
>> rather than it being hardcoded into every gmond.  In multicast mode,
>> if you restart the master gmond, it has to request from and wait for
>> each sub-gmond that is listening on the same multicast channel, to
>> respond with its metadata for each metric it supports.  Depending on
>> the reporting interval for a collection group, this could take
>> anywhere from a few seconds to several minutes.  In unicast mode the
> 
> However, as we discovered a couple of months ago, requesting new
> metadata doesn't always work as designed, and it can take hours
> to get everything it needs.  Restarting the *other* gmonds in the
> cluster is sort of a workaround, because each one will send its
> metadata when it is restarted.  This way, you can at least ensure that
> the least recently restarted gmond knows everything.
>   -- Cos

That's interesting.  I would like to investigate this further, since I haven't
seen the same problem.  In my testing (granted, I don't have very large clusters
to test with), a complete metadata resync has taken only as long as the
longest collection_group interval (i.e. the time_threshold value).  If you
don't have any collection_groups with a time_threshold on the order of
hours, then there is something we need to investigate further.  It will just be
a little more difficult because I can't duplicate the delay in my testing
environment.
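
To make the time_threshold point concrete, here is a rough sketch of two
collection_groups in gmond.conf.  The metric names and numbers are only
illustrative assumptions, not taken from anyone's actual configuration.  With
groups like these I would expect a full resync to take up to about 1200
seconds, the larger of the two time_threshold values:

```
/* Illustrative only: names and values are assumptions. */
collection_group {
  collect_every = 20
  time_threshold = 90      /* sent at least every 90 seconds */
  metric {
    name = "cpu_user"
    value_threshold = "1.0"
  }
}

collection_group {
  collect_once = yes
  time_threshold = 1200    /* longest interval, so it bounds the resync time */
  metric {
    name = "cpu_num"
  }
}
```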

BTW, another workaround, if you don't mind the additional UDP traffic, is to set
the send_metadata_interval value anyway, even in multicast mode.  All it will
do is ensure that the metadata is sent on an interval rather than just on
request.  This might be a good idea if you are restarting gmonds very often.
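
As a minimal sketch, the setting goes in the globals section of gmond.conf.
The 30-second value below is just an assumption for illustration; pick
whatever interval balances the extra traffic against how quickly you need the
metadata back:

```
globals {
  /* ... your existing globals settings stay as they are ... */

  /* Resend metric metadata every 30 seconds (illustrative value).
     A value of 0 means metadata is only sent on request. */
  send_metadata_interval = 30
}
```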

Brad


_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
