Hi Brad:

Thanks for your reply.

On Mon, Jan 10, 2011 at 8:06 AM, Brad Nicholes <bnicho...@novell.com> wrote:

> The purpose of setting the send_metadata_interval to 0 by default was to 
> avoid unnecessary traffic for our default configuration of multicast.  
> Setting the directive to anything other than 0 will cause each gmond to start 
> sending all of its metric metadata on that interval.  If you are going to set 
> it by default, IMO 30 seconds is too low.  The problem is that people only 
> notice this in the first few minutes after restarting a gmond.  They expect 
> metrics to start showing up immediately.  After the gmond node finally does 
> send its metadata, rebroadcasting the metadata at any interval is just 
> consuming unnecessary bandwidth on the network.  Especially in a multicast 
> environment where it isn't needed at all.  Also consider that the more gmond 
> nodes you have the more traffic you are going to put on the network where 99% 
> of the time the extra traffic is totally unnecessary.

I have a perhaps naive question.  It sounds like
send_metadata_interval is only relevant to unicast configurations, so
why is multicast affected as well?  How difficult a code change
would it be to make the send_metadata_interval directive affect
only unicast?

Also, multicast is the default configuration for historical reasons,
not because it is more common.  It is, however, easier to set up if
your environment supports it.  Is it time for us to evaluate whether
we should switch to unicast as the default?  And if so how?  What is
the actual spread between unicast and multicast users?  If it turns
out that the majority of our (new) users are using unicast, should we
spend more time/effort making it easier for them to use Ganglia?

> 300 or 600 seconds is probably good enough for a default.  But no matter what 
> the default is, users still have to understand what that directive is for and 
> how to optimize it.  The value of send_metadata_interval will probably be 
> different for every installation when you take into consideration the number 
> of nodes, the number of metrics and any other network related variables.
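
To make the scaling with node and metric counts concrete, here is a
rough back-of-the-envelope sketch in Python.  The function name, the
100-byte packet size and the one-packet-per-metric model are all
assumptions for illustration, not measurements of gmond.

```python
def metadata_bytes_per_second(nodes, metrics_per_node, interval_s,
                              packet_bytes=100):
    """Estimate aggregate metadata traffic for a cluster.

    Assumes (hypothetically) one metadata packet of `packet_bytes`
    per metric per node, resent every `interval_s` seconds.
    An interval of 0 means metadata is never resent periodically.
    """
    if interval_s < 0:
        raise ValueError("interval_s must be >= 0")
    if interval_s == 0:
        return 0.0
    return nodes * metrics_per_node * packet_bytes / interval_s


if __name__ == "__main__":
    # Compare the intervals mentioned in this thread for one cluster size.
    for interval in (30, 300, 600):
        rate = metadata_bytes_per_second(500, 40, interval)
        print(f"interval={interval}s -> {rate:.0f} bytes/s")
```

Under this model the traffic grows linearly with the number of nodes
and metrics and shrinks linearly with the interval, so going from 30
to 300 seconds cuts it by a factor of ten.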

A couple more ideas came out of a brief brainstorming session on IRC
between Vladimir, Jesse and myself:

1) Collector gmond should request metadata from all gmonds when it has
been freshly (re)started
2) Add a configuration check for gmond so upon starting, if
configuration is unicast-based, and send_metadata_interval is 0, warn
the user to set it to a sane number
3) Find a middle ground of default send_metadata_interval which does
not hurt new users in HPC space wanting to use unicast

2) and 3) are workarounds which could be implemented relatively
quickly, 1) maybe not so much.
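
To illustrate what the check in 2) might look like, here is a minimal
sketch in Python rather than in gmond's own C code.  It is only a
heuristic: the function name is hypothetical, it uses regular
expressions instead of gmond's real config parser, and it assumes
that a udp_send_channel block containing "host = ..." means unicast
and that a missing send_metadata_interval means 0.

```python
import re


def check_metadata_interval(conf_text):
    """Return a warning string if the config looks unicast-based with
    send_metadata_interval at 0, otherwise None."""
    # Drop /* ... */ and # comments so commented-out settings are ignored.
    text = re.sub(r"/\*.*?\*/", "", conf_text, flags=re.S)
    text = re.sub(r"#[^\n]*", "", text)

    # Heuristic: a udp_send_channel block with "host = ..." is unicast.
    channels = re.findall(r"udp_send_channel\s*\{([^}]*)\}", text)
    unicast = any(re.search(r"\bhost\s*=", body) for body in channels)

    # Assumption: an absent directive means 0, the default discussed here.
    match = re.search(r"\bsend_metadata_interval\s*=\s*(\d+)", text)
    interval = int(match.group(1)) if match else 0

    if unicast and interval == 0:
        return ("warning: unicast udp_send_channel with "
                "send_metadata_interval = 0; set it to a non-zero value")
    return None


if __name__ == "__main__":
    sample = """
    globals { send_metadata_interval = 0 }
    udp_send_channel { host = 10.0.0.1
                       port = 8649 }
    """
    print(check_metadata_interval(sample))
```

A real implementation would live in gmond's startup path and use the
already-parsed configuration, but the condition being tested would be
the same.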

Thanks,

Bernard

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
