Re: [Ganglia-developers] Ganglia Gmond Memory

Mahendra Kutare Thu, 30 Jul 2009 13:32:28 -0700

On Thu, Jul 30, 2009 at 12:25 PM, Brad Nicholes <bnicho...@novell.com> wrote:


> >>> On 7/30/2009 at 9:08 AM, in message
> <669f1ab30907300808y67c403eev9a1653240c27c...@mail.gmail.com>, Mahendra
> Kutare
> <mahendra.kut...@gmail.com> wrote:
> > On Thu, Jul 30, 2009 at 10:31 AM, Brad Nicholes <bnicho...@novell.com> wrote:
> >
> >> >>> On 7/29/2009 at 11:23 PM, in message
> >> <669f1ab30907292223t2734f551lc8d9b98201d7f...@mail.gmail.com>, Mahendra
> >> Kutare
> >> <mahendra.kut...@gmail.com> wrote:
> >> > Hi All,
> >> >
> >> > If I have configured gmond.conf with a udp_recv_channel with just a port
> >> > number, will that configure ganglia gmond to listen on that particular
> >> > port for any incoming data, thus making it essentially a unicast
> >> > communication channel?
> >> >
> >>
> >> Yes, specifying just a port will configure gmond's recv channel in unicast
> >> mode.
> >>
> >> > What happens if the sending side sends data every 1 sec? Will it be
> >> > transferred immediately to gmond, or does it wait to collect some packets
> >> > of data and then deliver them to the listening gmond?
> >> >
> >> > I started sending some data from outside the gmond interface to a gmond
> >> > configured as mentioned above, with a udp_recv_channel on port 8108.
> >> >
> >> > Even though the sending side is pushing data every 1 sec, I do not see
> >> > gmond showing in debug mode on the console that it is processing a
> >> > Ganglia message from the sender every 1 sec.
> >> >
> >> > Is this just a display issue, or does ganglia do some sophisticated
> >> > processing of incoming data, i.e. waiting for a certain message size
> >> > before delivering it?
> >> >
> >>
> >> How did you configure gmond to send data every 1 sec.?  Gmond sends its
> >> data in collection groups and each collection group is configured with a
> >> send time threshold.  At the very worst, the collection group will send
> >> all of the metric values within that group once the group's time
> >> threshold has been exceeded.  In addition, each metric is assigned a value
> >> threshold, which is a percent of change differential.  If the differential
> >> change of any metric within the collection group exceeds its value
> >> threshold, the entire group of metrics is sent immediately.  So even
> >> though a collection group is set to collect every 1 second, that doesn't
> >> mean that the metrics are sent every 1 second.  Also, by default the rrd
> >> files are configured by gmetad to store metrics at an interval of every
> >> 15 seconds.  So even if the metrics were sent every 1 second, you would
> >> still only be seeing 15 second averages in the front end.
> >>
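The send rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as stated in this thread, not gmond's actual code: the names `Metric`, `percent_change` and `group_should_send` are made up for the example, and treating value_threshold as a percent change follows the description above (the real comparison inside gmond may differ).

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    value_threshold: float  # percent change that forces a send (assumption, per the thread)
    last_sent: float        # value most recently put on the wire
    current: float = 0.0    # value from the most recent collection


def percent_change(old: float, new: float) -> float:
    """Percent difference of `new` relative to `old`."""
    if old == 0:
        return 0.0 if new == 0 else float("inf")
    return abs(new - old) / abs(old) * 100.0


def group_should_send(metrics, seconds_since_send, time_threshold):
    """Return True when the whole collection group should be sent.

    The group is sent when its time threshold has elapsed, or as soon as
    any single metric has changed by more than its value threshold.
    """
    if seconds_since_send >= time_threshold:
        return True
    return any(
        percent_change(m.last_sent, m.current) > m.value_threshold
        for m in metrics
    )
```

With two metrics that have each moved by about 2 percent, `group_should_send(metrics, 20, 90)` is False; if either metric then doubles, the same call returns True and the whole group goes out, not just the metric that changed.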
> >
> > Thanks Brad.  I am trying to do this to understand the ganglia protocol,
> > and this helps.
> > Right now it is fine with me even if gmetad shows only 15 second averages
> > in the frontend, as you described.
> >
> > As I see it, there are other settings in collection groups, such as:
> >
> > 1. collect_once and collect_every
> >
> > I understand that collect_once will make the group's metrics be collected
> > only once and then just sent to the other gmonds every time_threshold.
> > Also, if I am not wrong, if I configure collect_every = 20 and
> > time_threshold = 90, gmond will collect every 20 sec and send every 90 sec
> > to the other gmonds.
> >
>
> Under normal circumstances it will send every 90 seconds but if one of the
> metric value_thresholds has been exceeded, the entire collection group will
> be sent immediately.  The purpose for this is to make sure that
> abnormalities or spikes are caught and reported.
>
> > Now the part I am not clear on: if I am collecting more frequently than I
> > am sending, does that mean we are keeping more in memory?  Say after the
> > first occurrence of a collect at 20 sec, if I am not sending it across to
> > the other gmonds, am I just keeping it in the memory hash?  If not, what
> > is the behaviour?
> >
>
> No, if you are collecting every 20 seconds but the collection group is only
> sending every 90 seconds, the only metric that is sent or reported is the
> last metric collected within the 90 second interval.  This is the purpose of
> the metric value_threshold.  If, for example, you collected a metric 4 times
> within a 90 second period and the delta between each collected metric value
> only varied by 5 percent, storing and reporting each of the metrics would
> just end up being noise on the wire because the percent of change between
> the values is insignificant.  So just sending the last metric collected in
> this case is good enough.  However, if the metric saw a spike within the 90
> second period but then immediately dropped back to normal, you want to make
> sure that the metric spike is sent and recorded, so gmond sends it
> immediately.
>
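To make the 20 second / 90 second example concrete, here is a small simulation. It is a sketch under stated assumptions, not gmond's implementation: `simulate` is a hypothetical helper, the default value_threshold of 10 percent is picked only for illustration, and the send check is done only at collection ticks, so a time-based send shows up at the first tick at or after 90 seconds.

```python
def percent_change(old, new):
    """Percent difference of `new` relative to `old`."""
    if old == 0:
        return 0.0 if new == 0 else float("inf")
    return abs(new - old) / abs(old) * 100.0


def simulate(initial, samples, collect_every=20, time_threshold=90,
             value_threshold=10.0):
    """Return the (time, value) pairs that would be sent.

    `initial` is assumed to have been sent at t=0; `samples` are the values
    collected at t = collect_every, 2 * collect_every, and so on. Only the
    latest collected value is ever considered for sending; earlier unsent
    samples are simply dropped, never queued.
    """
    sent = []
    last_value, last_time = initial, 0
    for i, value in enumerate(samples, start=1):
        now = i * collect_every
        timed_out = now - last_time >= time_threshold
        spiked = percent_change(last_value, value) > value_threshold
        if timed_out or spiked:
            sent.append((now, value))
            last_value, last_time = value, now
    return sent


# Quiet metric: only the latest value goes out once 90 seconds have passed.
print(simulate(100.0, [101.0, 103.0, 102.0, 104.0, 105.0]))
# -> [(100, 105.0)]

# Spike at t=40 that drops back at t=60: both changes are sent right away.
print(simulate(100.0, [101.0, 150.0, 102.0, 103.0]))
# -> [(40, 150.0), (60, 102.0)]
```

In the quiet case the four intermediate samples never reach the wire, which matches the point above that nothing beyond the latest value needs to be kept.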
> > 2. What does the setting cleanup_threshold = 300 /*secs */ do?
> >
> > Is it cleaning stuff in the memory hash?  If yes, does it happen
> > concurrently while gmond is trying to send data to other gmonds?  What
> > happens if the cleanup threshold is reached at the same time as a
> > collection group's time_threshold, or if they fire back to back, first
> > cleanup_threshold and then time_threshold?
> >
> > Will it just send all NULL?
>
> No, the cleanup_threshold is an interval at which gmond will analyze all of
> the hosts that have been reported and determine if they are still
> reporting.  If, for example, host1 was reporting metrics and then was removed
> from the grid, gmond will no longer receive data from host1 and that host
> should no longer be included in the XML data that is collected by gmetad.
>
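The cleanup pass described above amounts to dropping hosts that have gone silent. A minimal sketch, assuming a plain dict that maps each host name to the time it was last heard from; `cleanup_hosts` and `max_silence` are hypothetical names for this example and do not come from the gmond source.

```python
def cleanup_hosts(last_heard, now, max_silence):
    """Remove hosts that have not reported within `max_silence` seconds.

    `last_heard` maps host name -> timestamp of the last packet received.
    The dict is modified in place and the removed host names are returned.
    """
    stale = [host for host, seen in last_heard.items()
             if now - seen > max_silence]
    for host in stale:
        del last_heard[host]
    return stale


# Run once per cleanup interval (300 seconds in the setting quoted above).
hosts = {"host1": 100, "host2": 390}
removed = cleanup_hosts(hosts, now=400, max_silence=120)
print(removed)  # -> ['host1']
print(hosts)    # -> {'host2': 390}
```

A pass like this only removes entries for hosts that stopped reporting; values for hosts that are still reporting are left alone, so a send that happens right after a cleanup still carries real data.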

Brad, thanks for your responses. They really helped me understand the protocol.

I have a few more questions about the memory hash.

I understand Gmond has threads which listen to the multicast channel and
write the data collected to a fast, in-memory hash table.

What I am trying to understand is:

a) Is this in-memory hash table a bounded memory buffer?

b) When a gmond receives a packet on the multicast channel, does it just go
ahead and update the memory hash under some hash identifier, say one based on
node location or collection group?

What happens when we get the same collection group's metrics again on the
multicast channel, say after a 20 sec time_threshold expires or a
value_threshold of 10 is exceeded, which causes the collection group to be
multicast?

Does that update all the values for the collection group in the memory hash,
presumably with multiple threads updating the metrics of multiple collection
groups stored in the memory hash?

Thanks
Mahendra


_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
