Re: [Ganglia-developers] [Ganglia-general] revisiting bogus spikes

2011-04-28 Thread Martin Knoblauch
Hi Cameron, [adding the developers list]

 OK:

1) we write the unmodified data in line 233 to capture the raw counters. That 
is what we are using in line 227 for the comparison
2) ns is created and returned by hash_lookup
3) The ULONG_MAX logic in line 231 is there because we need to ensure that the 
result is always positive. Needed because the variables are unsigned.
4) update_ifdata is called once by metric_init and then every time one of 
the byte/pkts_in/out collectors fires
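
As a rough illustration of points 1) and 3), the wrap handling can be
sketched like this. It is a minimal sketch, not the code from metrics.c:
counter_delta is a hypothetical helper name, and it assumes the counter
wraps exactly once, at ULONG_MAX, between two samples.

```c
#include <limits.h>

/* Hypothetical helper, not part of libmetrics: how far a counter that
 * only ever increases has advanced between two raw samples. */
static unsigned long counter_delta(unsigned long prev, unsigned long cur)
{
    if (cur >= prev)
        return cur - prev;      /* normal case: no wrap */

    /* The counter went backwards: assume a single wrap at ULONG_MAX,
     * using the same expression as the one quoted for line 231. */
    return ULONG_MAX - prev + cur;
}
```

The caller would store the raw value of cur as the new prev afterwards, so
that the next comparison is again made against an unmodified counter.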

 Now this does not solve your problem ... Question: do you see any of the debug
messages that update_ifdata should emit when something unusual happens?
That should help to get an idea of what the interface counters on your
machine(s) look like. Look in /var/log/messages, or just start gmond
noninteractive.

 Hmm. Another question: do you compile gmond in 64-bit or 32-bit mode? The 
ULONG_MAX logic may/will fail in 32-bit mode, if the kernel is 64-bit. It could 
even be that the interface counters on 32-bit kernels are written as 64-bit 
values.
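
To show why the assumed counter width matters, here is a small sketch with
made-up numbers. The function names are hypothetical, and fixed-width types
stand in for unsigned long so that the result does not depend on how the
example is compiled.

```c
#include <stdint.h>

/* Wrap-aware delta when the counter is assumed to be 64 bits wide
 * (what ULONG_MAX amounts to in a typical 64-bit Linux build). */
static uint64_t delta_assuming_64bit(uint64_t prev, uint64_t cur)
{
    return (cur >= prev) ? cur - prev : UINT64_MAX - prev + cur;
}

/* Same logic, but the counter is assumed to wrap at 32 bits.
 * Only meaningful if prev and cur both fit in 32 bits. */
static uint64_t delta_assuming_32bit(uint64_t prev, uint64_t cur)
{
    return (cur >= prev) ? cur - prev : (uint64_t)UINT32_MAX - prev + cur;
}
```

With prev = 4294967000 and cur = 1000, the 32-bit variant yields 1295,
while the 64-bit variant yields a value close to 2^64, which would show up
as an enormous spike.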

Hope this helps

Martin 
--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de



From: Cameron L. Spitzer cspit...@nvidia.com
To: ganglia-gene...@lists.sourceforge.net 
ganglia-gene...@lists.sourceforge.net
Sent: Thu, April 28, 2011 3:21:04 AM
Subject: [Ganglia-general] revisiting bogus spikes


Once again I've been asked to make Ganglia usable on Linux hosts with the 
Broadcom NIC with the 32-bit byte counters.
E.g., HP Proliant 580 G5, a rather popular machine where Ganglia doesn't
work out of the box.

So I'm trying to understand ganglia-3.1.7/libmetrics/linux/metrics.c again.

In update_ifdata(), we parse /proc/net/dev for the current bytes and packets
in and out.
There's a structure ns (declared where?) of type net_dev_stats, representing 
the previous sample?
I'm not sure exactly what ns represents.
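
For reference, reading the four counters from one /proc/net/dev line could
look roughly like the sketch below. This is not the parser in metrics.c:
struct if_sample and parse_net_dev_line are hypothetical names, and the
field names rbo and rpo are assumed by analogy with rbi and rpi. It relies
on the usual layout of eight receive columns followed by the transmit
columns.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one raw sample of an interface. */
struct if_sample {
    unsigned long rbi, rpi;   /* received bytes, received packets */
    unsigned long rbo, rpo;   /* sent bytes, sent packets */
};

/* Parse a line such as "  eth0: 1000 10 0 0 0 0 0 0 2000 20 ...".
 * Returns 0 on success, -1 for header lines or malformed input. */
static int parse_net_dev_line(const char *line, char *name, size_t name_len,
                              struct if_sample *s)
{
    const char *colon;
    size_t len;

    while (*line == ' ' || *line == '\t')
        line++;                          /* skip leading padding */
    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;                       /* the two header lines have no ':' */
    len = (size_t)(colon - line);
    if (len == 0 || len >= name_len)
        return -1;
    memcpy(name, line, len);
    name[len] = '\0';

    /* Columns 1-2 are rx bytes/packets, 3-8 are skipped,
     * columns 9-10 are tx bytes/packets. */
    if (sscanf(colon + 1, "%lu %lu %*s %*s %*s %*s %*s %*s %lu %lu",
               &s->rbi, &s->rpi, &s->rbo, &s->rpo) != 4)
        return -1;
    return 0;
}
```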

There's a sanity check at line 227   if ( rbi >= ns->rbi )  for whether the
counter went up or down.  If it went down, we assume the counter rolled
around, and guess the value is negative, and invert it, line 231.
l_bytes_in += ULONG_MAX - ns->rbi + rbi;
(I don't understand how that is supposed to work.)
Then, regardless of whether the sample passed or failed the sanity check, it's
saved in the ns structure.
Line 233, ns->rpi = rpi;

After the parsing is all done, and the crazy value is in ns, an optional 
reasonableness test (REMOVE_BOGUS_SPIKES)
returns early if any of the numbers are extremely large.  Otherwise it updates 
the static running counts and then returns.
On our HP 580G5s, defining REMOVE_BOGUS_SPIKES had no effect.  The network 
traffic graphs become useless within a minute of starting gmond.
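
One way such a reasonableness test could be expressed is as a cap on the
implied rate rather than on the raw numbers. The sketch below is only an
illustration and not how REMOVE_BOGUS_SPIKES is implemented; the function
name and the 10 Gbit/s limit are assumptions chosen for the example.

```c
#include <stdbool.h>

/* Hypothetical cap: 10 Gbit/s expressed in bytes per second. */
#define MAX_PLAUSIBLE_BYTES_PER_SEC 1250000000.0

/* Returns true if a byte delta collected over `seconds` implies a rate
 * the link could actually carry; false means the sample should be
 * dropped instead of being added to the running totals. */
static bool delta_is_plausible(unsigned long delta, double seconds)
{
    if (seconds <= 0.0)
        return false;            /* no usable interval, reject */
    return (double)delta / seconds <= MAX_PLAUSIBLE_BYTES_PER_SEC;
}
```

A caller would skip the update of the running counts when this returns
false, while still remembering the raw counter for the next comparison.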

The part I don't understand is when the line 227 check fails, we put the 
known-bad data in ns anyway.

I'd appreciate it if someone familiar with update_ifdata() could explain its 
logic.  When is this routine called?
(I can see modules/network/mod_net.c calls it via bytes_in_func(), but I
haven't figured out when net_metric_handler() is called.  Maybe that would
explain how bogus data in ns doesn't matter.)
Is there any way to keep way out-of-scale data out of these graphs?
Thanks for any help.

-Cameron in Los Gatos






 


--
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network 
management toolset available today.  Delivers lowest initial 
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


[Ganglia-developers] Checked in THIRDPARTY file for monitor-web-2.0 branch

2011-04-28 Thread Bernard Li
Hi all:

Just checked in a THIRDPARTY file into the monitor-web-2.0 branch:

https://sourceforge.net/apps/trac/ganglia/changeset/2586

If I missed something, please let me know and I'll add it (or you can
add it yourself).

From now on, if you need to commit additional third-party software
into our tree, please also update this file.  Also, please ensure that
whatever you check in is licensed under a license compatible with the
New BSD License.

Thanks for your attention.

Cheers,

Bernard
