On Thu, 2 Apr 2009 17:11:11 -0400, Alan Altmark <alan_altm...@us.ibm.com>
wrote:

>On Thursday, 04/02/2009 at 03:16 EDT, Alan Ackerman
><alan.acker...@earthlink.net> wrote:
>> Does it use monitor data or accounting data? People at my shop would not
>> like it using monitor data. Other shops can decide for themselves.
>
>You can't just drop that here and walk away!  :-)  Please explain.  Why
>would someone like or dislike how the resource consumption factoids
>generated by the system are used?
>
>Alan Altmark
>z/VM Development
>IBM Endicott

Creating new thread.

1. The folks that receive the data at my shop are z/OS folks. Historically
the capture ratio of MVS was really poor. The notion was that you should
use SMF data and never RMF data. I don't know if z/OS has cleaned up its
act or not.

But I have heard the same thing from VM folks. (I've said it myself.)

As Barton says, the capture ratio in VM has always been quite high, due to
the way the data is captured in the VMDBK. However, Barton computes this
(I think) by comparing different record types in the monitor data, not by
comparing monitor to accounting data.

There is system overhead, but it is captured in the SYSTEM VMDBK block.
Accounting data and monitor data draw on the same underlying fields, so
they should give the same results. Of course, some time gets charged to
the wrong user, for example between the time an interrupt comes in and
the time the new user is identified, but it shows up the same way in both
the monitor data and the accounting data. (User CPU time is more
reproducible than total CPU time, for this reason.)
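
To make the comparison concrete, here is a rough sketch of the kind of
capture-ratio calculation I mean. Everything in it is made up for
illustration; real inputs would be parsed out of MONWRITE monitor files
or CP accounting records, and this is not how Barton's code works, just
the idea.

# Illustrative only: what fraction of total CPU time ends up attributed
# to individual users.  The inputs are hypothetical stand-ins for values
# taken from monitor user records (or accounting records) and from the
# system-wide records.

def capture_ratio(per_user_cpu_seconds, system_cpu_seconds):
    """Fraction of total CPU time that shows up charged to some user."""
    captured = sum(per_user_cpu_seconds.values())
    return captured / system_cpu_seconds

# Toy numbers: 57 of 60 CPU-seconds attributed to users gives 0.95.
users = {"LINUX01": 30.0, "TCPIP": 25.0, "MONWRITE": 2.0}
print(capture_ratio(users, 60.0))   # 0.95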

2. Monitor sample data is taken at one-minute intervals. It used to be
that data for users who logged on or off between samples was dropped for
the partial minutes. Is this still true? Was it ever true? Or is it urban
folklore?

3. On our systems, we sometimes see messages from CP saying that monitor
data has been thrown away because the user connected to *MONITOR did not
respond in time. This happens when the system is overloaded, either in
CPU or in storage. So we lose some minutes of monitor data, but not, I
think, accounting data.

Often you can fix this by increasing the segment sizes or giving
MONWRITE/ESAWRITE a bigger SHARE. Not always, though. In some cases the
monitor segments get paged out. (We reported that to Velocity, who said
it was a CP problem.) I think IBM could do things to make collection of
monitor data more reliable in the extreme cases.
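
For what it's worth, the kind of tuning I mean looks roughly like this.
The numbers are placeholders and the syntax is from memory, so check it
against the CP command reference for your level of z/VM before using it:

CP SET SHARE MONWRITE RELATIVE 1500
CP QUERY MONITOR

Making the monitor DCSS itself bigger is a DEFSEG/SAVESEG exercise (with
a matching MONITOR SAMPLE CONFIG change), which I won't try to reproduce
from memory here.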

Unfortunately, I'm not responsible for this, and it is "only performance
data". I think this can be dealt with, but it does take diligence and
work to keep your monitor data accurate. You don't have to do this work
for accounting data.

I think IBM could do things to make collection of monitor data easier.


4. On our systems, we switch files (I think hourly) to keep them from 
getting too big. We lose a minute or two of data each time.

5. The default for ESAWRITE is to collect user history records only for
userids using more than 0.5% CPU. So when we go back to process CPU
utilization for users, we get smaller totals from the monitor data than
from the accounting data. I assume this could be fixed by setting the
threshold to zero.
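
Here is a toy illustration of why the monitor-side totals come up short
under the default threshold. The userids and numbers are invented; real
data would come from the accounting records and the ESAWRITE user history
records.

# Made-up per-user CPU seconds, as if summed from accounting records.
accounting = {"LINUX01": 30.0, "LINUX02": 0.2, "LINUX03": 0.1, "TCPIP": 25.0}

threshold_pct = 0.5        # the ESAWRITE default described above
total = sum(accounting.values())

# Users below the threshold produce no user history records, so their
# CPU time is simply missing from the monitor-side totals.
monitor = {user: cpu for user, cpu in accounting.items()
           if 100.0 * cpu / total >= threshold_pct}

print(round(sum(accounting.values()), 1))   # 55.3
print(round(sum(monitor.values()), 1))      # 55.0, short by the small users

Setting the threshold to zero should make the two totals agree, up to the
other issues above.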

I don't know which of these, if any, affect the ESALPS data collection
that Barton mentioned. We have tested ESALPS, but are not yet licensed.

Alan Ackerman                    
                        
Alan (dot) Ackerman (at) Bank of America (dot) com       
