Caveat: as a daily digester, responses are implicitly delayed...
Tracy: among other good advice you got, I'll emphasize that the Importance for
your Databases (DB2, etc) must be higher than your Applications (Cics, etc) to
avoid [some of] these time-out/deadlock scenarios. I strongly suggest reading
the WLM RedBook. [1] It has specific chapters on Cics, DB2, etc.
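For illustration only (the DB2PROD class and its numbers are made up, not from our policy; ONLINEHI is from our samples below [5]), the shape of it is:

Service Class DB2PROD - DB2 subsystem address spaces (hypothetical)
Base goal:
# Duration Imp Goal description
- --------- - --------------------------------
1 1 Execution velocity of 60

Service Class ONLINEHI - High priority production
Base goal:
# Duration Imp Goal description
- --------- - --------------------------------
1 2 75% complete within 00:00:01.000

ie. the DB2 address spaces carry a better (lower-numbered) Importance than the Cics work they serve.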
Secondly, I'd avoid strangling WLM and, rather, suggest loosening the
rules. If WLM has this leeway, it is more able to balance the workload and,
after all, that's the whole point of a WorkLoad Manager. I use the concept
where the rules are, "what can you tolerate when things go south?" vs. "how do
I want things to perform normally?" [2] When there are sufficient resources, all
your classes will over-perform. By bumping up the Cics minimum you're forcing
WLM to degrade others of the same Importance (or less; such as DB2 from your
message). Rather, by loosening the restrictions, DB2 is allowed to breathe
some. In fact, you'll see below [5] that our Online-Hi is 75% in 1 second but
our typical Cics response is 0.3 seconds and 0.8 on bad days.
Third, you might consider moving your long-running Cics transactions to a
different Transaction group because they can skew the accumulated WLM results.
Below [5], you'll see I have a group LONGRUN that encompasses monitoring tasks
which, essentially, never end; meaning *bad* response times. So, because
we cycle our production Cics each workday, they're shunted to the ONLINELG
service class with 50% in 1 second so they don't pollute the ONLINEHI stats.
Lastly, tho' I believe it is the default, make sure you have I/O Priority
management [3] set to YES. It will encourage WLM to promote lower-classed work
such as Batch to a higher DP (temporarily) to clear the blockage. It will
repeat the process if necessary and results can be seen in the RMF reporting
[4] under LCK. (LCK or ENQ?) The Dynamic alias tuning management will let WLM
manage your hyper-volSer UCB allocations as well. (can't remember the real name
at the moment.)
A zIIP was suggested but, unless you're doing Java in Cics, it won't *directly*
help your Cics/DB2 problems. However, depending on your z/OS & DB2 levels, more
things are becoming zIIP-able, e.g. tcp/ip, system XML services, DRDA, etc.
Plus, it's not included in your 4hr cap or licensing.
ps. the DB2 velocity goal can be a small red herring. It applies to
activities that are not assigned to specific enclaves such as Dasd I/O & lock
management. Your Batch work will be in a Batch class enclave (SRB) within DB2
and be dispatched as such. This is one of the places where you will see
promotion by WLM occur due to enqueues/locks.
[1] System Programmer’s Guide to: Workload Manager SG24-6472-03
[2] The latter is from the old Dispatching Priority mentality that needs to be
dropped. Instead, DP is employed by WLM to achieve the minimum goals you have
defined.
[3] WLM samples:
Service Coefficient/Service Definition Options:
I/O priority management . . . . . . . . YES
Dynamic alias tuning management . . . . YES
[4] RMF reporting
--PROMOTED--
BLK 0.062
ENQ 52.084
CRM 21.455
LCK 654.084
SUP 0.000
[5] WLM samples:
Transaction Name Group LONGRUN - Long running CICS transactions
Qualifier Starting
name position Description
--------- -------- --------------------------------
B11R BETA93
C* CICS supplied transactions
OSEC Omegamon
OSRV Omegamon
-from the Cics monitor: CSSY, CSTP, CSNC, CSZI, CEX2, CSHQ, CSNE, OSRV,
& OSEC all have elapsed/response times in days
Subsystem Type CICS - CICS transactions
Classification:
Default service class is ONLINELO
Default report class is CICS
Qualifier Qualifier Starting Service
# type name position Class
- ---------- -------------- --------- --------
1 SIG CICSPRD1 ONLINEHI
2 . TNG . LONGRUN ONLINELG
Service Class ONLINELG - Long running transactions
Base goal:
CPU Critical = NO I/O Priority Group = NORMAL
# Duration Imp Goal description
- --------- - --------------------------------
1 3 50% complete within 00:00:01.000
Service Class ONLINEHI - High priority production
Base goal:
CPU Critical = NO I/O Priority Group = NORMAL
# Duration Imp Goal description
- --------- - --------------------------------
1 2 75% complete within 00:00:01.000
--------> signature = 8 lines follows <--------
Neil Duffee, Joe Sysprog, uOttawa, Ottawa, Ont, Canada
telephone:1 613 562 5800 x4585 fax:1 613 562 5161
mailto:NDuffee of uOttawa.ca http://aix1.uOttawa.ca/~nduffee
“How *do* you plan for something like that?” Guardian Bob, Reboot
“For every action, there is an equal and opposite criticism.”
“Systems Programming: Guilty, until proven innocent” John Norgauer 2004
"Schrodinger's backup: The condition of any backup is unknown until a restore
is attempted." John McKown 2015
-----Original Message-----
From: Tracy Adams [mailto:[email protected]]
Sent: April 29, 2016 08:55
Subject: Re: WLM issue with a proposed solution
Thank you all for chiming in! Yeah, the bottom line... figure out why those
sub-second transactions get stalled! Hard to tune your way out of a locking
condition :-)
I will check out the SYSSTC actual velocity... that is a good benchmark for
roughly what my max achievable would be.
Happy Friday Martin, sounds like you have written the book on this!
Gotta go read about resource groups.
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf
Of Scott Chapman
Sent: Friday, April 29, 2016 6:40 AM
To: [email protected]
Subject: Re: WLM issue with a proposed solution
>If your batch jobs are running Discretionary at a DP lower than CICS, it
>is very unlikely that they are causing significant CICS delays.
True from a CPU perspective. But the batch jobs could be locking resources in
DB2 that are delaying the CICS transactions. And if the batch jobs holding
those locks are progressing very slowly due to running in discretionary when
there's little CPU available, the locks may persist for an extended period of
time, elongating CICS transaction response time.
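A contrived sketch of the kind of batch SQL that does this (table, column & predicate all hypothetical):

UPDATE PROD.ACCOUNT
SET BALANCE = BALANCE * 1.01
WHERE REGION = 'EAST';
-- many rows X-locked as it goes, no intermediate COMMITs;
-- running discretionary & starved for CPU, so...
COMMIT;
-- ...the locks are held until this COMMIT finally happens

Any CICS transaction needing one of those rows sits in lock wait the whole time, regardless of its own dispatching priority.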
Or I saw a similar situation once where some batch queries exhausted the RID
pool, which caused sub-second CICS transactions to start taking over 60
seconds. That's fortunately harder to do on the later versions of DB2.
In short, while adjusting the goals very well may be in order, I'd be inclined
to first look into the apparently unusually long running CICS transactions to
identify why those particular transactions are taking a long time.
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN