You can try to get more control of the environment. We don't install all the standard 'Unix/Linux' packages on zLinux, because they don't fit in or they give inaccurate data. CPU load, for example, we get from z/VM instead, and the organisation here has accepted our arguments. We select things to monitor that are valid and that work without loading the CPU too much. Yes, that is a balance, and we always try to minimize things. Just as was said in this forum: we really need to think differently. It is also true that we are now starting to get company from other virtual environments that run into problems with resources.
So time is working for us :)

___________________________________________
Tore Agblad
Volvo Information Technology
Infrastructure Mainframe Design & Development, Linux servers
Dept 4352 DA1S
SE-405 08, Gothenburg
Sweden
Telephone: +46-31-3233569
E-mail: [email protected]
http://www.volvo.com/volvoit/global/en-gb/

-----Original Message-----
From: Linux on 390 Port [mailto:[email protected]] On Behalf Of Rob van der Heij
Sent: 20 August 2010 23:08
To: [email protected]
Subject: Re: How to convince others. Was: Re: mono keep guest active - ban the blips.

On Fri, Aug 20, 2010 at 12:40 AM, Berry van Sleeuwen <[email protected]> wrote:

> Nagios is in use at the server side. Each client (our servers) has the
> nagios client, with scripting instead of the nagios plugins, and sec.

While parts of the Nagios user interface are pretty slick, it just does not scale. While the rather simple architecture does not help, the real problem appears to be in the admins who keep adding additional checks. You can do a lot of silly things on discrete servers with 5% avg utilization, but that does not mean it is a smart thing to do in a shared resource environment.

> Sec is in use for monitoring the /var/log/messages, it makes the server
> go into Q3 and stay there and has quite some CPU load as well. Useful?
> I don't know, perhaps, but why burn so many cycles and keep busy all the
> time? I mean, how many messages can you write and consequently read? At
> least when we monitor the linux console with PROP we won't have that
> much overhead.

It's probably polling with a very short delay while reading the open file. Obviously it could have used a much longer delay. Which still is pretty silly when nothing is happening in the system that writes data into the log file. You could be worse off.
We ran into a commercial product that used this to start a new log file at midnight:
- sleep until 23:59:59
- while time() <> 00:00 do ;
You can probably figure out why this process went into a busy wait for 24 hours ...

We have used SCIF to route the Linux console logging into a PROP-like service that checked for bad things and also allowed trusted processes to issue privileged commands on the Linux guests. That's cheaper and does not keep the Linux guest awake.

> The other part is scripting scheduled in cron to monitor the filesystem
> and processes. They tend to run at the same time for all servers and
> have some CPU load as well. I did notice the mon_fsstat and such, that
> only have minor impact on the Linux system and they even write records
> every minute. So in this case, useful yes, but at a cost.

So if you have monitor data telling you almost nothing was written to disk, does it still make sense to frequently run commands to check whether the file systems filled up? Similar reasoning for checking installed software levels - if you know nobody issued privileged commands since last time, why check again?

Some of this really requires a different way of thinking. Not all the teams that currently deploy a few Linux servers can make that change. If they can't, it really hurts to let them dictate how one should manage an order of magnitude more servers...
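
For illustration only, here is a minimal Python sketch of how the midnight log switch described above could wait without spinning. It is not the code of the commercial product, which I have not seen; the function names next_midnight and wait_until are hypothetical, and it assumes naive local time with no daylight-saving handling. The point is to compare against a fixed target with "has it passed yet" rather than testing for an exact clock value, and to sleep for the whole remaining interval so the guest can go idle.

```python
import datetime as dt
import time


def next_midnight(now):
    """Return 00:00 of the day after `now` (naive local time, an assumption)."""
    return dt.datetime.combine(now.date() + dt.timedelta(days=1), dt.time())


def wait_until(target, sleep=time.sleep, clock=dt.datetime.now):
    """Block until `target` has passed, without busy waiting.

    The test is "has the target passed", not "is the clock exactly equal
    to the target", so a late wakeup cannot miss the moment and spin
    until the next day.
    """
    while True:
        remaining = (target - clock()).total_seconds()
        if remaining <= 0:
            return
        # One long sleep; the loop only repeats if the sleep returned early.
        sleep(remaining)


# Typical use (not executed here):
#   wait_until(next_midnight(dt.datetime.now()))
#   ... then start the new log file ...
```

The sleep and clock arguments are only there so the logic can be checked without actually waiting; in normal use the defaults apply.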
--
Rob van der Heij
Velocity Software
http://www.velocitysoftware.com/

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------
For more information on Linux on System z, visit
http://wiki.linuxvm.org/
