If your batch runs regularly or consistently drive some virtual machines to 100%,
this may not signal a loop condition (which, I would guess, is why the ticket is
being raised).  Techs may grow conditioned to this and either take longer to
respond or eventually just outright 'ignore' the tickets, since the 'normal'
course of action is to page for a condition that cannot be resolved without a
larger share or a redistribution of the load.

If only the monitor could 'know' that the machine was running this batch load at
a certain time of day, had an absolute share, and was running at 100% for an
extended period of time.  It could be set up not to send out alerts based on all
of these criteria.  Wow!  That would be a very nice feature.
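
A rough sketch of what such a rule could look like, in Python.  This is not
something nagios or any other monitor provides as written; GuestPolicy,
should_alert, the 15-minute "extended period" and the batch window values are
all made up for illustration.  It reads the three criteria above literally: the
alert is held back only when all of them are true at once.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass
class GuestPolicy:
    """Hypothetical per-guest knowledge the monitor would need."""
    batch_start: time      # assumed start of the nightly batch window
    batch_end: time        # assumed end of the batch window
    absolute_share: bool   # guest runs with an absolute (hard-limited) share


def in_batch_window(now: time, start: time, end: time) -> bool:
    """True if 'now' falls inside the window, including windows past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def is_expected_batch_load(policy: GuestPolicy, busy_for: timedelta,
                           now: datetime,
                           sustained: timedelta = timedelta(minutes=15)) -> bool:
    """All three criteria hold: batch window, absolute share, long 100% run."""
    return (policy.absolute_share
            and in_batch_window(now.time(), policy.batch_start, policy.batch_end)
            and busy_for >= sustained)


def should_alert(policy: GuestPolicy, cpu_pct: float, busy_for: timedelta,
                 now: datetime, threshold: float = 100.0) -> bool:
    """Raise a ticket at the CPU threshold unless it is the known batch load."""
    if cpu_pct < threshold:
        return False
    return not is_expected_batch_load(policy, busy_for, now)


if __name__ == "__main__":
    policy = GuestPolicy(time(22, 0), time(4, 0), absolute_share=True)
    night = datetime(2010, 8, 19, 23, 30)
    noon = datetime(2010, 8, 19, 12, 0)
    print(should_alert(policy, 100.0, timedelta(hours=1), night))
    print(should_alert(policy, 100.0, timedelta(hours=1), noon))
```

The same 100% reading outside the batch window still raises a ticket, so a real
loop during the day is not hidden.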

When your monitoring department looks at top, vmstat and sar to detect problems,
don't forget the kernel numbers lie.  Even the new steal timer is a little off.
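
For reference, the steal figure those tools show comes from the aggregate "cpu"
line in /proc/stat (eighth value after the label).  Below is a small Python
sketch of how a steal percentage can be derived from two samples of that line.
The function names are my own, and per the caveat above the result is only as
good as the kernel's accounting, so treat it as an approximation.

```python
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Return the first eight counters of the aggregate 'cpu' line.

    Order in /proc/stat: user, nice, system, idle, iowait, irq, softirq, steal.
    The trailing guest counters are skipped because guest time is already
    included in user time.
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    values = [int(v) for v in fields[1:9]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (8 - len(values))
    return values


def steal_percent(before: str, after: str) -> float:
    """Percentage of elapsed CPU time accounted as steal between two samples."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat:
        return stat.readline()
```

To use it, call read_cpu_line() twice a few seconds apart and pass both lines to
steal_percent().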


On 08/19/2010 05:51 PM, Berry van Sleeuwen wrote:
True, it isn't. It's the replacement of an operator. The main issue here
is that it needs to raise tickets and get reporting stats. For instance,
raise a ticket at 100% CPU (and indeed, our ABS limithard machines do
raise tickets when they are running their batch..<sigh>.) or when a
filesystem is at 100%. The reporting is for instance on CPU and
filesystem usage.

But indeed it can't provide insight into the performance of a guest, other
than detect thresholds. And it doesn't have to either, the monitoring
department can look at top, vmstat or sar to detect that kind of
problem should they need to (yeah right, then they know all about the
entire environment).

Still, as for a case, this is a good point. We need to be able to
address performance related monitoring and nagios can't do that. Or at
least not within the scope of an entire LPAR.

Thanks, Berry.


--
Rich Smrcina
Phone: 414-491-6001
http://www.linkedin.com/in/richsmrcina

Catch the WAVV! http://www.wavv.org
WAVV 2011 - April 15-19, 2011 Colorado Springs, CO

