Hi Neil,

Including the extra metrics (e.g. swap in/out, context switches, interrupts) is 
a great idea. Our capacity planners would welcome these metrics.

Also, I'm very keen on the virtualized server metrics you get from libvirt. 
This would make dynamic performance monitoring of cloud-like server 
infrastructure a snap. I assume you can just spoof the hostnames of the virtual 
server instances as they spin up.

I must admit I'm not so keen on submitting zeros for unsupported metrics. This 
would lead to confusion ("why are the graphs there when there's no data?" type 
of questions) and also to the unnecessary creation of useless RRD files. The 
nature of RRD means that the files for all dummy metrics will be created at 
their maximum size the instant the first value is received. This adds up to a 
lot of wasted disk if you have hundreds or thousands of Windows servers. My 
preferred approach would be to modify the web front-end to intelligently handle 
Windows servers.

Lastly, the ability to define the host-sflow port should allow you to configure 
multiple gmonds to run on the same Linux server, each with a different Windows 
cluster name assigned to it. Was this the intention? If so, I like it.

Regards,
Nick

-----Original Message-----
From: ext Neil McKee [mailto:neil.mc...@inmon.com] 
Sent: Saturday, February 12, 2011 7:56 AM
To: ganglia-developers@lists.sourceforge.net
Subject: Re: [Ganglia-developers] hsflowd for Windows + Ganglia webfrontend

Here is the patch I was referring to.  It allows you to put something like this 
in your gmond.conf:

sflow {
  null_int = 0
  null_float = 0.0
}

and then if a field like cpu_nice is missing (as in the Windows hsflowd) we'll 
submit 0.0 instead of leaving it out.  This is a work-around for the problem 
where the RRD does not even appear when cpu_nice is missing.

You can also add another setting "accept_all_physical = yes",  like this:

sflow {
  null_int = 0
  null_float = 0.0
  accept_all_physical = yes
}

and now the extra metrics that are defined in host-sflow but not in libmetrics 
are accepted too.  These include some useful ones like the number of context 
switches,  the number of pages swapped in/out, network errors and drops, more 
info on disk reads and writes,  and so on.  The UI seems to do a good job of 
just adding these RRDs to the page (so perhaps it would even be safe to make 
"yes" the default here?)

I'm still skipping over the VM fields,  and don't have the option to ignore the 
sFlow hostname field yet,  but placeholder boolean options "accept_all_virtual" 
and "accept_hostname" are defined.  There is also "udp_port" in case you want 
to designate a non-standard port as the sFlow port (though it still has to 
appear in a udp_recv_channel section elsewhere).
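
As a rough sketch of how that pairing might look (the port number 6344 is only 
an illustrative assumption; 6343 is the standard sFlow port), the same port 
would appear in both sections:

# hypothetical non-standard port, chosen only for illustration
udp_recv_channel {
  port = 6344
}

sflow {
  udp_port = 6344
}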

I didn't edit gmond/conf.pod yet.  I figured that could happen once there is 
consensus on these options.

Thoughts?

Regards,
Neil

