Many thanks to Loren, Adam and Dave. Yes, monitoring is an issue. This has been the basis of my SHARE and other user group presentations (UK CMG for example) for about 3 years. So to summarize my current presentations:
All Linux monitoring by any vendor is based on /proc, so all instrumentation gets its data from exactly the same place. The issue is that the Linux kernel does its accounting based on time of day. Changing this to use the CPU timer, which is what would be needed, is not trivial, so no one should hold their breath thinking IBM can just "fix this" for "z". As you and Adam pointed out, numbers based on time-of-day accounting for a guest under VM are not useful information.

Another issue is that in the "server" environment, installations seldom asked for the kind of instrumentation we take for granted in a "z" environment. What you have asked for has not been a focus for these installations, and therefore the vendors have not needed to step up to these requirements. We've been working on addressing these issues since Linux on "z" was announced...

The only chance you have of getting useful information is to take VM data and Linux data at a fine granularity and "correct" the Linux data based on the very accurate VM data. This means that any agent running under Linux under z/VM absolutely MUST be packaged with a VM performance monitor to correctly identify resource utilization. ESALPS (Linux Performance Suite, renamed to Linux Power Suite by "marketing") uses the NETSNMP agent from sourceforge.net as a data source, and packages this data with our standard VM performance monitoring tools. Because of our "z" requirements, we've been forced to add significantly to the data source, using NETSNMP as the base technology. This allows us to improve both performance and instrumentation.

Performance of the agents under Linux under z/VM is an extreme issue that you would not see unless you looked for it. Imagine your POC going well with 1 or 2 guests, with the agent taking 5% of a processor for each one. Then say you go up to 100 Linux guests. Turning off your performance monitoring agent will solve your performance problems.
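To put numbers on the agent overhead example above, here is a minimal arithmetic sketch. It assumes the per-guest cost stays constant as guests are added (a simplification), uses the hypothetical 5% figure from the example, and the function name is made up for illustration.

```python
def agent_overhead_processors(guests, pct_per_guest):
    """Total agent CPU cost, expressed in processor equivalents.

    Assumes (for illustration only) that each guest's agent costs the
    same fixed percentage of one processor regardless of guest count.
    """
    return guests * pct_per_guest / 100.0


# Hypothetical figures from the example: 5% of a processor per guest.
for guests in (2, 100):
    total = agent_overhead_processors(guests, 5.0)
    print(f"{guests} guests -> {total} processors spent on monitoring")
```

Under these assumptions, 2 guests cost a tenth of a processor, which is easy to miss in a POC, while 100 guests cost five full processors just to run the agents.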
We address this issue in several ways with NETSNMP; the details are not necessary here. But you WOULD want to validate whether your agent technology will work in your targeted environment before becoming too wedded to it.

One other point: the technology we use is appropriate to measure any Linux box, under VM or not, as well as NT boxes, Sun boxes, and even Apples and printers, etc. In fact, we are currently monitoring some 40-50 nodes from subnets from which we have received spam - it was very nice of them to offer their systems up as test sites for our monitoring. Most platforms have implemented the SNMP agent, providing a very useful cross-platform monitoring infrastructure.

So to answer your questions:

1) Accurate data: The technology ESALPS uses is to take the VM monitor data, accurate to microseconds, take the Linux data, and "correct" the Linux data in real time. Not 100% accurate, but within a few percent unless you create some pathological test cases. This shows data by process, corrected. The Linux system utilization is very accurate within some T/V ratio parameters. I think I presented this technology at the IBM tech conference in Europe at least 2 years ago, so we've been at it for a while.

2) Throughput and response times require additional instrumentation for each application. The applications we're working with are currently Oracle, Domino, WAS and SAP. Each of these has very unique instrumentation issues. What application are you interested in? If you are just interested in something like Web hits, the rates are already provided; response times are another issue that we're trying to address.

3) You are not doing anything wrong, just falling into the same trap as many POCs before you: looking for mainframe-type information from a system new to the mainframe.

I've seen nothing close to the ESALPS technology from any other vendor - if you look at the IBM redbooks, you will see many examples of how ESALPS is being used in researching performance, and there are more to follow.
- IBM Lotus Domino 6.5 for Linux on zSeries Implementation
- Linux on IBM zSeries and S/390: Performance Measurement and Tuning
- Linux on IBM zSeries and S/390: ISP/ASP Solutions

We are putting up some new demonstrations on our web site, click the "demo" button on "http://velocitysoftware.com" to get an idea on what data we show and how. Also, click on the "presentations" button to see some of what Velocity Software has been presenting.

If we can help with your POC, please call. Oh, and our prices are NOT a secret, and are readily found on our website as well.

>Date: Tue, 2 Mar 2004 10:54:34 -0000
>Sender: Linux on 390 Port <[EMAIL PROTECTED]>
>From: Mike Fry <[EMAIL PROTECTED]>
>
>All,
>
>I have recently embarked upon a 'proof-of-concept' study to look
>at running LINUX on a zSeries platform under VM. My part of the
>project is to size and monitor a 'lifted' Unix application that
>will run on a zSeries platform (2064-2C8 + 1 IFL engine for the
>testing). Early testing results have left me feeling a little
>concerned by the lack of credible stats coming back from the
>internal LINUX monitoring tools (SAR, PS, TOP etc...) and data
>coming out of the z/VM Performance Toolkit.
>
>Running most applications on the IFL engine makes it run at 100%
>utilisation, so I would like to monitor the application response
>times and transaction volumes to get a feeling of what the
>engine can handle and then compare the results against the Unix
>figures.
>
>I have switched on the z/VM Toolkit 'event and user' stats, and
>expected to see the LINUX guest transaction and response times
>stats, but they do not show up on the monitor.
>
>The question(s) I would like some info on are,
>
>1] Is there any way of getting accurate SAR, Accounting stats
>etc... out of z/VM-LINUX guest.
>
>2] How can I measure the throughput and response times for
>transactions at guest level.
>
>3] Am I doing something wrong?
>
>If anyone has experience of any of the above, or knows ways of
>getting 'good stats', then I would be very grateful to hear from
>you.
>
>Mike Fry
>Capacity Planning & SAS Consultant
>
>Performance & Capacity Support
>ISS/Mainframe Infrastructure, Enable
>
>Tel c/w: 2000 x4813
>Tel ext: 01565 614813

"If you can't measure it, I'm Just NOT interested!"(tm)

/************************************************************/
Barton Robinson - CBW     Internet: [EMAIL PROTECTED]
Velocity Software, Inc    Mailing Address:
 196-D Castro Street       P.O. Box 390640
 Mountain View, CA 94041   Mountain View, CA 94039-0640

VM Performance Hotline:   650-964-8867
Fax: 650-964-9012         Web Page:  WWW.VELOCITY-SOFTWARE.COM
/************************************************************/
