Obviously the effect of capping depends on your workload peaks, how long
they last, how close the cap is to the physical hardware limits, etc.,
but since capping is based on the 4-hour rolling average MSUs, the speed
limit analogy is not really accurate. It's more like the required system
throughput has grown to the point that traffic must move at 125 m/h
bumper to bumper just to keep even, and after you have been doing this
long enough to raise the average speed over four hours to 100 m/h, the
100 m/h speed limit is suddenly enforced, which can cause significant
disruption and increasingly long "traffic" delays. In some situations
there may be enough "discretionary" traffic that can be held back after
capping to minimize the disruption to important traffic, but this is
definitely not always the case. You may also find in retrospect (too
late) that if you had been able to restrict discretionary loads more
over the previous four hours, the current capping could have been
avoided, but this requires a workload scheduler with the ability to
predict the future.
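
To put rough numbers on that analogy, here is a minimal Python sketch of
a rolling 4-hour average catching up with a cap. It is a toy model, not
how PR/SM or WLM actually computes anything: the one-minute samples, the
80 MSU baseline, and the function name minutes_until_capped are my own
assumptions for illustration. Only the 125 and 100 figures come from the
analogy above.

```python
from collections import deque

WINDOW_MINUTES = 4 * 60  # assumed: one sample per minute over 4 hours


def minutes_until_capped(demand, cap, history):
    """Return the first minute at which the rolling average exceeds cap.

    demand  -- future per-minute MSU consumption (hypothetical samples)
    cap     -- the cap value, in the same units
    history -- per-minute consumption already in the window
    Returns None if the average never exceeds the cap.
    """
    window = deque(history, maxlen=WINDOW_MINUTES)
    total = sum(window)
    for minute, msu in enumerate(demand, start=1):
        if len(window) == WINDOW_MINUTES:
            total -= window[0]  # oldest sample is about to roll out
        window.append(msu)
        total += msu
        if total / len(window) > cap:
            return minute
    return None


# Four quiet hours at 80 MSU, then sustained demand of 125 against a cap of 100.
print(minutes_until_capped([125] * 240, cap=100, history=[80] * 240))  # 107
```

In this made-up case the system runs at 125 for 107 minutes with no
restriction at all, and only then does the limit begin to apply.
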
I would have to disagree that capping only has "a slight relation with
performance". If you are dealing with LPARs where the principal load is
transactional processing systems, capping suddenly cuts the server
processing capacity when it has been using more than the cap value for
an extended period, immediately placing the server at 100% saturation
with insufficient resources to process the current transaction rate.
Unless the average transaction arrival rate decreases soon, queue
lengths and response times keep growing for as long as arrivals exceed
the capped capacity.
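
A minimal sketch of that saturation effect, in Python. This is a
deterministic approximation that ignores random arrival and service
times; the rates (125 transactions/second arriving, 100/second of capped
capacity) and the function name backlog_over_time are hypothetical
values chosen only to illustrate the point.

```python
def backlog_over_time(arrival_rate, service_rate, seconds):
    """Return the queued backlog after each second.

    Each second, arrival_rate transactions arrive and at most
    service_rate of them (plus any backlog) can be completed.
    """
    backlog = 0.0
    history = []
    for _ in range(seconds):
        backlog = max(0.0, backlog + arrival_rate - service_rate)
        history.append(backlog)
    return history


history = backlog_over_time(arrival_rate=125, service_rate=100, seconds=60)
queued = history[-1]
# A transaction arriving now waits for everything ahead of it to drain.
wait_seconds = queued / 100
print(queued, wait_seconds)  # 1500.0 15.0
```

Under these assumptions, one minute at a 25% overload leaves a
transaction arriving at the end of that minute waiting about 15 seconds
before it is even started, and the wait keeps rising until the arrival
rate falls back below the capped capacity.
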
If your business typically has short transaction peaks throughout the
course of an hour that require more than the cap value for a short time,
and an unusual extended load causes the cap to be enforced, then even if
the unusual load is eliminated the system will continue to be capped for
many minutes until the average MSU drops sufficiently. Until then, all
those normal brief peaks that could previously be handled
"transparently" will drive the server briefly to 100% saturation, with
noticeable response time increases until each peak passes. System
response will become much more erratic from the user's perspective.
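
To illustrate how long the cap can linger, here is a minimal Python
sketch that counts the minutes until a rolling 4-hour average falls back
to the cap once the unusual load is gone. It is a simplified model that
assumes capping stays in force while the average is above the cap; the
per-minute samples, the MSU figures, and the function name
minutes_until_uncapped are hypothetical and not taken from any real
system.

```python
from collections import deque

WINDOW_MINUTES = 4 * 60  # assumed: one sample per minute over 4 hours


def minutes_until_uncapped(history, future_msu, cap, limit=24 * 60):
    """Return minutes until the rolling average is at or below cap.

    history    -- per-minute consumption currently in the window
    future_msu -- constant per-minute consumption from now on
    Returns 0 if already at or below the cap, None if not reached
    within limit minutes.
    """
    window = deque(history, maxlen=WINDOW_MINUTES)
    total = sum(window)
    if window and total / len(window) <= cap:
        return 0
    for minute in range(1, limit + 1):
        if len(window) == WINDOW_MINUTES:
            total -= window[0]  # oldest sample is about to roll out
        window.append(future_msu)
        total += future_msu
        if total / len(window) <= cap:
            return minute
    return None


# 103 quiet minutes at 80, 107 minutes of unusual load at 125, then
# 30 minutes held at the cap of 100 while demand was still high.
past = [80] * 103 + [125] * 107 + [100] * 30
print(minutes_until_uncapped(past, future_msu=80, cap=100))  # 117
```

In this made-up case the average does not start to fall until the old
quiet samples have rolled out of the window, so the cap stays in force
for almost two hours after the unusual load has gone.
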
Joel C. Ewing
On 06/20/2014 08:14 AM, Vernooij, CP (SPLXM) - KLM wrote:
> John,
>
> I usually hate replies that don't answer the question but instead state:
> why don't you try it this way. However, this time I would like to ask some
> 'what are you doing' questions, in spite of your last remark.
>
> 1. The product that produces the CMFCPU13/14/15 messages also produces your
> RMF 72 records. From those I produce all my statistics on quarterly or hourly
> intervals, be it through CA MICS, but you can do it also via SAS / MXG (or
> some RMF reporting tool I believe). You can even download the SMF records to
> your Linux or Windows system and process them with SAS or a similar product.
> Did you try this?
>
> 2. What conclusions do you want to draw from the figures? You know that
> these 'LPAR is capped' figures have only a slight relation with the
> performance of those LPARs. If you have a road sign stating there is a speed
> limit of 100 m/h, that road is 'capped', but the capping will hardly create
> performance problems.
>
> Kees.
>
>
> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:[email protected]] On
> Behalf Of John McKown
> Sent: Friday, June 20, 2014 14:42
> To: [email protected]
> Subject: How to? Designing a "graph" of information
>
> We use PR/SM Group Capacity to regulate our aggregate MSUs from two LPARs on
> a single CEC. This is for cost containment. We have a product which runs on
> both z/OS images on the LPARs and produces messages similar to:
>
> N 4020000 LIH1 14169 22:07:51.36 STC16813 00000090 CMFCPU15 LPAR NO
> LONGER SOFT CAPPED BY WLM; CAPPED DURATION WAS
> S 00.02.00
>
> I have a program, on Linux, which takes this and produces lines like:
>
> LPAR LIH1 was capped starting at Mon 2014-06-16 21:52:41 until Mon
> 2014-06-16 21:55:31 for a duration of 00.02.50
>
> I can then process this information in another program which puts the
> fields: LPAR (LIH1 above) and the start and end date & time (2014-06-16
> 21:52:41 & 2014-06-16 21:55:31) into a relational table. From this I can
> generate another table which has a row for each minute within the interval.
> Each row contains the date/time column & a column for each z/OS image. The
> z/OS image column contains either a " " or a "*" depending on whether that
> z/OS is WLM capped any time during that minute. Think of the columns like:
> date/time @ minute resolution; is LPAR#1 capped?; is LPAR#2 capped?. Now what
> I want to do is create a "time graph". The X axis is the date / time. Each
> point on the Y axis is for a given LPAR. The intersection (plot) is either
> "*" if that LPAR is capped at that time or a blank. This is to show, along a
> time sequence, how each LPAR is being "capped" and "uncapped".
>
> Ex:
>
> LIH1 capped|        *         |        *         |
> DEV1 capped|                  |        *         |
> Date/Time  | yyyy-mm-dd hh:mm | yyyy-mm-dd hh:mm |
>
>
> Hopefully you get the idea. And see at least one problem. There are 1440
> minutes in a single day. Way too many to plot even a single day. So I thought,
> why not summarize, perhaps on an hourly basis, where each "point"
> in the plot is the sum of the number of minutes in which the LPAR was capped.
> This would be easy to do with SQL if I changed the " " & "*" for not
> capped/capped to 0 and 1 instead. Which I can easily do. Then use SQL to
> consolidate each hour. Again, easy. But what I'd like is something more
> "visual" than just putting out what would look like a spread sheet with
> numbers. What I would like is a true graph where for each DateTime / LPAR
> "point", I would plot a "bar" whose thickness is relative to the number.
> I.e. if a particular LPAR, during a particular hour had been capped 60 times
> (max # of minutes), then I'd have a 100% full vertical "bar" at that point.
> If it had been capped 30 times, then a 50% full bar. This way, the eye can
> easily scan along the X axis getting an "intuitive" grasp of how the LPARs
> are being impacted by the WLM capping.
>
> First, does the above information sound useful to others? I mean what I'm
> trying to convey (how WLM capping is possibly affecting turnaround).
> Secondly, is the method (the "bars" varying in height) a good "intuitive"
> way to display the information to management (who simply adore graphs, with
> colors!)? Third, most difficult, how do I create that graph from the
> information? My original data source is the SYSLOG that we unload to a disk
> data set. I was going to go into a lot of what I have done, but have decided
> that it likely isn't necessary.
>
> Thanks for thoughts. The ones about my lack of sanity are already well known!
> <grin/>
>
> --
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
>
> Maranatha! <><
> John McKown
>
> ...
>
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO IBM-MAIN
--
Joel C. Ewing, Bentonville, AR [email protected]