Hi Khrist:

I'm not sure whether Mr. Perzl is working off our main GitHub repo, but if
he is, you might want to patch it and submit a pull request.

Thanks for your support!

Cheers,

Bernard

On Thursday, 3 October 2013, Khrist Hansen wrote:

> This appears to be an issue with Mr. Perzl's updated libperfstat code
> borrowed from IBM's perfstat_cpu_total example.
>
> ftp://www.oss4aix.org/ganglia/RPMs-3.3.7/src/ganglia-3.3.7-aix.patch
>
>
> http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.prftools/doc/prftools/idprftools_perfstat_glob_cpu.htm
>
> When calculating wio and idle, the code divides by
> (dlt_lcpu_wait + dlt_lcpu_idle).  If the server is CPU bound, i.e.
> usr+sys=100%, then both dlt_lcpu_wait and dlt_lcpu_idle are zero, and the
> division is performed with zero as the divisor.
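
One way the guard described above could look, as a minimal sketch only. The names dlt_lcpu_wait and dlt_lcpu_idle come from the patch; the helper split_wait_idle, the wait_idle_t struct, the remainder_pct parameter, and the proportional-split formula are assumptions made for illustration and are not taken from the actual patch or from IBM's example.

```c
/* Hypothetical result type: percentages for the wio and idle metrics. */
typedef struct {
    double wio_pct;
    double idle_pct;
} wait_idle_t;

/*
 * Hypothetical helper: splits remainder_pct (the share of CPU time not
 * spent in usr+sys) between wio and idle in proportion to the two deltas.
 * The formula is an assumption; only the zero-divisor guard is the point.
 */
wait_idle_t split_wait_idle(unsigned long long dlt_lcpu_wait,
                            unsigned long long dlt_lcpu_idle,
                            double remainder_pct)
{
    wait_idle_t out = { 0.0, 0.0 };
    unsigned long long divisor = dlt_lcpu_wait + dlt_lcpu_idle;

    /* CPU-bound interval: both deltas are zero, so skip the division
     * and report 0% for both metrics. */
    if (divisor == 0)
        return out;

    out.wio_pct  = remainder_pct * (double)dlt_lcpu_wait / (double)divisor;
    out.idle_pct = remainder_pct * (double)dlt_lcpu_idle / (double)divisor;

    /* Same clamp the existing metric functions apply to negative values. */
    if (out.wio_pct < 0.0)  out.wio_pct = 0.0;
    if (out.idle_pct < 0.0) out.idle_pct = 0.0;
    return out;
}
```

With this sketch, split_wait_idle(0, 0, 0.0) returns 0 for both fields instead of dividing by zero, and split_wait_idle(25, 75, 40.0) splits the 40% remainder into 10% wio and 30% idle.
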
>
> This should be a fairly simple fix, and I am attempting to contact
> Mr. Perzl to that effect.
>
>
> -----Original Message-----
> From: Khrist Hansen [mailto:[email protected]]
> Sent: Wednesday, October 02, 2013 6:18 PM
> To: [email protected]
> Subject: RE: Insane negative values for cpu_idle and cpu_wio when node is CPU bound
>
> Here is another example from gstat:
>  CPUs (Procs/Total) [     1,     5, 15min] [  User,  Nice, System, Idle, Wio]
>     8 (    8/  122) [  4.59,  2.04,  1.35] [  99.8,   0.0,   0.2,-67062349824.0,-67062349824.0] OFF
>
> Looking at the source code for AIX metrics
> (https://github.com/ganglia/monitor-core/blob/master/libmetrics/aix/metrics.c),
> it appears that negative values should be converted to 0.  This is either
> not happening or the metrics are somehow being modified after the fact.
>
> g_val_t
> cpu_wio_func ( void )
> {
>    g_val_t val;
>
>    get_cpuinfo();
>    val.f = CALC_CPUINFO(wait);
>
>
>    if(val.f < 0) val.f = 0.0;
>    return val;
> }
>
> g_val_t
> cpu_idle_func ( void )
> {
>    g_val_t val;
>
>
>    get_cpuinfo();
>    val.f = CALC_CPUINFO(idle);
>
>
>    if(val.f < 0) val.f = 0.0;
>    return val;
> }
>
>
> From: K. Hansen
> Sent: Wednesday, October 02, 2013 4:50 PM
> To: [email protected]
> Subject: Insane negative values for cpu_idle and cpu_wio when node is CPU bound
>
> Environment:
> AIX 6.1 TL7 SP7
> gmond 3.6.0 (from http://www.perzl.org/ganglia/)
>
> I noticed that a particular node sends extremely large negative values
> for the cpu_idle and cpu_wio metrics when cpu_user + cpu_system is near
> 100%, i.e. the node is completely CPU bound.  This badly skews the node's
> cpu_idle and cpu_wio graphs so that no true positive values are visible,
> and the cpu_report graphs for the node, cluster, and grid become corrupted.
>
> Here is an example of what I am talking about:  http://imgur.com/a/aIzyU
>
> I am able to replicate this behavior on any AIX node by running the
> following command to generate CPU load:
>
> perl -e 'while (--$ARGV[0] and fork) {}; while () {}' 8
>
> The final argument is the number of threads (logical CPUs) available to
> the server.  For example, a server with 2 POWER7 vCPUs has 8 threads
> (logical CPUs) because of 4-way simultaneous multithreading (SMT).
>
> Has anyone else experienced this on AIX or Linux?
>
> Thanks!
>
>
>
>
>
> ------------------------------------------------------------------------------
> October Webinars: Code for Performance
> Free Intel webinars can help you accelerate application performance.
> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
> the latest Intel processors and coprocessors. See abstracts and register >
> http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>
