On Thu, 2 Apr 2009 17:11:11 -0400, Alan Altmark <alan_altm...@us.ibm.com> wrote:
>On Thursday, 04/02/2009 at 03:16 EDT, Alan Ackerman
><alan.acker...@earthlink.net> wrote:
>> Does it use monitor data or accounting data? People at my shop would not
>> like it using monitor data. Other shops can decide for themselves.
>
>You can't just drop that here and walk away! :-) Please explain. Why
>would someone like or dislike how the resource consumption factoids
>generated by the system are used?
>
>Alan Altmark
>z/VM Development
>IBM Endicott
========================================================================

Creating a new thread.

1. The folks that receive the data at my shop are z/OS folks. Historically
the capture ratio of MVS was really poor. The notion was that you should
use SMF data and never RMF data. I don't know whether z/OS has cleaned up
its act or not, but I have heard the same thing from VM folks. (I've said
it myself.)

As Barton says, the capture ratio in VM has always been quite high, due to
the way the data is captured in the VMDBK. However, Barton computes this
(I think) by comparing different record types in the monitor data, not by
comparing monitor data to accounting data. There is system overhead, but
it is captured in the SYSTEM VMDBK block.

Accounting data and monitor data draw on the same underlying data, so they
should give the same results. Of course, some time gets charged to the
wrong user (for example, between the time an interrupt comes in and the
new user is identified), but it shows up the same way in both the monitor
and the accounting data. (User CPU time is more reproducible than total
CPU time, for this reason.)

2. Monitor sample data is taken at one-minute intervals. It used to be
that data for users who logged on or off between samples was dropped for
the partial minutes. Is this still true? Was it ever true? Or is it urban
folklore?

3. On our systems, we sometimes see messages from CP saying that monitor
data has been thrown away because the user connected to *MONITOR did not
respond in time.
This happens when the system is overloaded, either in CPU or storage. So
we lose some minutes of monitor data, but not, I think, accounting data.
Often you can fix this by increasing the segment sizes or by giving
MONWRITE/ESAWRITE a bigger SHARE. Not always, though; in some cases the
monitor segments get paged out. (We reported it to Velocity, who said it
was a CP problem.) I think IBM could do things to make collecting monitor
data easier and more reliable in the extreme cases. Unfortunately, I'm not
responsible for this, and it is "only performance data". This can be dealt
with, but it takes diligence and work to keep your monitor data accurate;
you don't have to do that work for accounting data.

4. On our systems, we switch monitor files (hourly, I think) to keep them
from getting too big. We lose a minute or two of data each time.

5. The default for ESAWRITE is to collect user history records only for
userids using more than 0.5% of a CPU. So when we go back and compute CPU
utilization for users, we get smaller totals from monitor data than from
accounting data. I assume this could be fixed by setting the threshold to
zero.

I don't know which of these, if any, affect the ESALPS data collection
that Barton mentioned. We have tested ESALPS, but are not yet licensed.

Alan Ackerman
Alan (dot) Ackerman (at) Bank of America (dot) com
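P.S. A toy sketch of the effect in point 5, with entirely made-up numbers
(the userids, CPU figures, and interval are hypothetical, not from any real
system): if user records below a 0.5%-of-a-CPU threshold are dropped, the
per-user totals rebuilt from monitor data must come up short of the
accounting totals.

```python
# Toy illustration of point 5: a reporting threshold that drops user
# records for userids under 0.5% of a CPU makes monitor-derived CPU
# totals undercount accounting totals. All numbers are invented.

# Per-user CPU seconds over the interval, as accounting data sees them.
accounting_cpu = {
    "LINUX01": 120.0,
    "LINUX02": 45.0,
    "TCPIP":   30.0,
    "IDLE01":  0.3,   # below threshold, record dropped
    "IDLE02":  0.1,   # below threshold, record dropped
}

interval_seconds = 3600.0               # one hour of wall-clock time
threshold = 0.005 * interval_seconds    # 0.5% of one CPU = 18 CPU-seconds

# Monitor data keeps user records only for userids over the threshold.
monitor_cpu = {u: t for u, t in accounting_cpu.items() if t > threshold}

print(round(sum(accounting_cpu.values()), 1))  # 195.4 (accounting total)
print(sum(monitor_cpu.values()))               # 195.0 (monitor total)
```

Setting the threshold to zero (so no users are filtered) would make the two
totals agree, which is what I assume fixes it.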