Hey, wait a minute, guys.
I think there is a bit of confusion about DUR and how TSO
transactions transition to Period 2.
1. Duration is the amount of service that a transaction should consume
in a period before going on to the next period. This is NOT service units per
second, but is total service units consumed. Thus, your 750 service
units do not equate to clock seconds in any regard. The 750 service
units are composed of CPU (SRB and TCB) service units, plus I/O
service units, plus (potentially) MSO service units. These basic
categories of service are adjusted by the service coefficients (CPU,
IOC, MSO, SRB). Those resulting service unit measures are basically
unrelated to elapsed clock time.
2. There is not a direct relationship between service units consumed
and elapsed time of the transaction (consider a CPU burner versus
someone scrolling a PDS). The RMF 14 buckets are buckets of response
times. They do not represent service units consumed. You cannot
legitimately say that the transactions ending in bucket 14 consume
any more service than a transaction ending in any other bucket. All
you can say is that if they ended in Period 1, then they probably
consumed less than 750 service units in your case (actually, 750 plus
the amount that transactions consumed before the SRM noticed and
booted them into Period 2...which can be a huge amount of service
units for CPU burners). For that matter, the delays to TSO
transactions often are more a function of other workloads (especially
workloads running at a higher Goal Importance) than anything inherent
in the TSO transactions themselves.
3. Depending on how you have set RMPTTOM, you might find that
significantly more service units were consumed in TSO Period 1 than
you might have specified. This is because setting RMPTTOM to large
values means that the SRM will check less frequently to see whether a
DUR value was exceeded. In data sent to me by some CPExpert users,
I see the AVERAGE service units consumed per transaction in TSO
Period 1 to be several times higher than the DUR value for TSO Period 1!
4. The design of multiple service class periods mostly focused on
service consumption. The idea is that heavy users of service should
not be in a position to interfere unreasonably with low users of
service. If the heavy users of service get migrated to Period 2, the
result is that the low users of service would not be unreasonably
delayed in their response time. From a practical view, the concept
of service mostly revolves around CPU service.
To a large extent, this idea is a carry-over from pre-SP5.2 days,
when a dispatchable unit of work could monopolize the dispatching
queue ahead of other work at the same dispatching priority. Since
the "fair access" algorithm introduced with SP5.2 eliminated this
dispatching problem, a lot of the technical need for multiple periods
went away. Only in the case of serious resource consumption by lots
of dispatchable units executing concurrently in Period 1 should this
become a problem.
5. In many cases, you will not see any better or worse response to
trivial transactions in TSO Period 1 by introducing a TSO Period 2
(there are exceptions, of course). Mostly, TSO transactions should
migrate to TSO Period 2 based on management decisions rather than
technical decisions (for example, "get those heavy resource
transactions into TSO Period 2 Importance 3 or 4, where they will
compete with other resource consumers at Importance 3 or 4, and the
competition at lower Importance will discourage users from submitting
that type of transaction under TSO", or some such management scheme).
6. The percent of transactions that end in TSO Period 1 is not a
universal objective. There is nothing whatsoever "magic" about 75%
ending in Period 1. The percent ending in Period 1 should be a
function of your management objectives versus the resources consumed
by various kinds of transactions (that is, 90% or 95% or 100% ending
in Period 1 can be a valid objective in the right
environment). Other than management objectives, the overriding
technical concern should be how many transactions execute
concurrently in Period 1 (and thus have the potential for interfering
with each other for access to a CPU). In an LPAR with multiple
logical processors, this potential for interference decreases
substantially (think queuing model effects).
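To make points 1 through 3 concrete, here is a minimal Python sketch.
It is a toy model under stated assumptions, not how the SRM is
actually implemented. The function names, the coefficient values, the
50 SU/sec rate for the PDS scroller, and the one-second check interval
are all made up for illustration. The check_interval_sec parameter is
only a stand-in for "how often the duration test runs", which is the
effect that RMPTTOM influences. The 750 and 8084 figures come from the
original question.

```python
import math


def weighted_service(cpu, srb, ioc, mso, coeff):
    """Total service units: each raw category scaled by its coefficient."""
    return (coeff["CPU"] * cpu + coeff["SRB"] * srb
            + coeff["IOC"] * ioc + coeff["MSO"] * mso)


def period1_service(rate_su_per_sec, total_su, dur_su, check_interval_sec):
    """Service charged to Period 1 in this toy model.

    The transaction accumulates service at a constant rate per elapsed
    second. The duration test runs only once per check interval, so the
    switch to Period 2 happens at the first check at or after DUR.
    """
    su_per_check = rate_su_per_sec * check_interval_sec
    checks_until_switch = math.ceil(dur_su / su_per_check)
    su_at_switch = checks_until_switch * su_per_check
    # A transaction that ends before that check never leaves Period 1.
    return min(total_su, su_at_switch)


# Illustrative coefficients only, not a recommendation.
coeff = {"CPU": 1.0, "SRB": 1.0, "IOC": 0.5, "MSO": 0.0}
print(weighted_service(cpu=600, srb=50, ioc=200, mso=0, coeff=coeff))

DUR = 750  # Period 1 duration from the original question
# CPU burner: 8084 SU per elapsed second (the z890 figure, assuming it
# gets a full CPU), needing 50,000 SU in total.
burner = period1_service(8084, 50_000, DUR, check_interval_sec=1.0)
# PDS scroller: an assumed 50 SU per elapsed second, 200 SU in total.
scroller = period1_service(50, 200, DUR, check_interval_sec=1.0)
print(burner, scroller)
```

With these made-up inputs the burner is charged 8084 service units in
Period 1, more than ten times DUR, even though it passed 750 in under
a tenth of a second. The scroller spends four elapsed seconds in
Period 1 and uses only 200. Same period, very different elapsed times,
which is why DUR cannot be read as clock time.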
Regards,
Don
******
Don Deese, Computer Management Sciences, Inc.
Voice: (703) 922-7027 Fax: (703) 922-7305
http://www.cpexpert.org
******
At 07:11 PM 3/28/2007, you wrote:
I'm reviewing our Workload Manager policy which hasn't really changed
since we implemented Goal Mode with OS/390 and an S/390 2003-237.
Granted, we haven't had many *real* problems over the years but ...
I'm trying to confirm my (lack of) understanding regarding the
duration value for a Performance period and the Service Units/second
I find in the RMF WLM report. From what I read in the Planning: WLM
manual, "Duration: Specifies the length of the period in service
units." Does that imply that the 750 specified for Period 1 duration
equates to approx. 0.44 clock seconds with the 2003 (1724.7 SU/sec),
exclusive of wait times, natch? And, by extension, does that mean
the period is now down to about 0.09 clock seconds on our latest z890
(8084 SU/sec)? The manual is not helpful, nor can I find anything
in the Systems Programmer's redbook. Anybody with a better
reference/guide they can point me to?
My concerns centre 'round two cases in our local environment. 1) TSO
period 1 is 75% completion in 0.5 sec with a duration of 750 SU and
period 2 has Velocity>15. I'm worried that 750 SU is not really long
enough for a half-second duration, i.e. that tasks are dropping
(almost) straight away into period 2. I'll be researching that in
the RMF report(s) later.
2) I'm looking to split a service class into two periods because it's
exhibiting the classic 'valley' graph of response times, i.e. 90% of
transactions are roughly split between bottom (0.5) and top (>4.0)
buckets. Its current definition is 50% complete in 1 second and I've
got *one* 4hr sample of 65% at 0.5 sec and 27% at >4 sec. It was
suggested, during a course, that this implies multiple periods. My
direct problem is how to determine the duration value?