OA22716 WLM INITIATORS DO NOT START WHEN THERE IS NO CPU USING OR DELAY
APAR status OPEN

Error description
Problem is the same as described in OA19711. OA19711 changed the algorithm in IRAPAQUE, but the change also needs to be made in IRAPABQD.

When a service class is missing its goal and the main reason is queue delay, WLM assesses whether additional initiators would help the service class meet its goals. WLM bases these decisions on using and delay samples of jobs currently running in the service class. If there are no CPU using or delay samples in the first period (possibly due to an unforeseen bottleneck in the job), and the period only has samples for queue delay and OTHER, then WLM does not conclude that additional initiators will be beneficial. WLM assumes that new jobs entering the system will behave the same way and not do any productive work, so no new initiators are started. This causes jobs to back up on the JES queue.

PE Information:
Users Affected:
  Users at JBB772S with UA34899
  Users at HBB7720 with UA34897
  Users at HBB7730 with UA34898
User Impact: PTFs UA34899, UA34897, and UA34898 did not completely eliminate the problem. The algorithm needed to be changed in module IRAPABQD as well as IRAPAQUE.

##################################################

OA23048 ABEND0C9 IN IRARMCPM UA34694 +46E FOLLOWING WLM VARYING A CP ONLINE.
APAR status OPEN

Error description
ABEND0C9 in IRARMCPM calculating WaitPercentage. The abend can occur when WLM varies a CP online that was previously varied offline by WLM. When WLM varies a CP offline, the CPU LCCA is not freed, as it is when an operator command varies a CP offline. When WLM varies a CP offline and then back online, the accumulated wait time in LCCAWTIM will be the time since IPL and not since the last Vary Online.

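The effect of a stale LCCAWTIM value can be illustrated with a small Python sketch. This is a toy model only: the APAR text does not show how IRARMCPM computes WaitPercentage, so the formula (wait time divided by elapsed interval), the function name, and all numbers below are assumptions chosen for illustration. It shows how a wait accumulator that covers the time since IPL, divided by an interval that covers only the time since the last Vary Online, yields a percentage far outside the 0 to 100 range. An out-of-range quotient of that kind would be consistent with an 0C9 (fixed-point divide exception), though the APAR does not state the exact mechanism.

```python
# Toy model only: the real IRARMCPM calculation is not shown in the APAR text.
# The function name and the formula are hypothetical.
def wait_percentage(accumulated_wait, elapsed):
    """Return wait time as a percentage of the elapsed interval."""
    if elapsed <= 0:
        raise ValueError("elapsed interval must be positive")
    return 100.0 * accumulated_wait / elapsed


# Hypothetical numbers, in seconds.
wait_since_ipl = 80_000           # stale value: LCCA was not freed at vary offline
wait_since_vary_online = 45       # value the calculation would expect
elapsed_since_vary_online = 60    # interval the percentage is measured over

expected = wait_percentage(wait_since_vary_online, elapsed_since_vary_online)
stale = wait_percentage(wait_since_ipl, elapsed_since_vary_online)

print(f"expected wait percentage: {expected:.1f}%")
print(f"stale wait percentage:    {stale:.1f}%")
```

With the stale accumulator the result is no longer a meaningful percentage, which is why the accumulated wait time has to be measured from the same starting point as the interval.
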
##################################################

OA22726 WORK NOT BEING DISTRIBUTED TO SYSTEMS WITH UA31230 APPLIED
APAR status OPEN

Error description
With the fix for OA18531 applied, there is an error calculating the size of the data buffer that WLM sends to other systems in the sysplex. The buffer could be too small, and this may result in the SRRU data not being sent. This will impact users of the IWMSRSRS macro. If a system is not sending its routing information to other systems in the sysplex, then WLM will not include that system in the data returned on the IWMSRSRS macro, and this could result in transactions not being routed to that system.

Verification steps:
On the system with the fix for OA31230 applied, CTRACE COMP(SYSWLM) FULL will show exception records in IWMDMGET with a return code 8 reason code 802.

PE Information:
Users Affected:
  HBB7709 with UA31712 applied
  JBB77S9 with UA31713 applied
  HBB7720 with UA31215 applied
  JBB772S with UA31231 applied
  HBB7730 with UA31230 applied
User Impact: Applications that use IWMSRSRS to obtain routing recommendations may be impacted.

##################################################

OA19400 IXC426D SYSTEM HANG UNRESPONSIVE ABEND00C RSN0F08006C ABEND03C I/O PAGING BACKUP
APAR status OPEN

Error description
Customer experienced system hang with MSGIXC426D SYSTEM xxxx IS SENDING XCF SIGNALS BUT NOT UPDATING STATUS being issued. This was preceded by ABEND00C RC0F08006C errors due to XCF not being able to get SQA storage because of insufficient frames, and ABEND03C RSN50000D10 due to insufficient frames to back DREF storage. There was a backup of paging I/O as indicated in the IPCS ASMK report.

Additional symptoms as viewable under IPCS include the following:

1) Presence of ASM SRBs that initiate paging I/O on the system dispatching queue. SRB dispatch point is in ILRCPBLD for these SRBs.

The system hang occurs because these ASM SRBs are not getting dispatched. They were scheduled to be dispatched, but due to disabled or SRB mode processing occurring on all CPs over a significant interval of time, the ASM SRBs could not get a processor to run on. Also significantly, the unit of work active on one of the CPs was requesting a lot of SQA storage, and therefore using lots of frames.

Failure to dispatch these ASM SRBs means that paging I/O is not being initiated. Without the paging I/O, the system cannot free up frames to hand out to storage requesters. The condition of all CPs tied up with disabled or SRB mode work persisted until no frames were available at all, then worsened as recovery and retry logic was driven in response to the abends generated due to lack of frames.

There is an exposure in the ASM paging I/O design whereby ASM must be able to obtain its own processor in order to initiate I/O. This exposure was introduced by new function APAR OA14248.

Verification Steps:
It is recommended that ASM Level 2 support be contacted to verify that external symptoms are due to this ASM paging issue. L2 will review the system trace table content, the system dispatching queue, and the ASM report to confirm.

Local fix
Application of WLM APAR OA18531 may offer relief for the observed instance of this problem by reducing the amount of storage they getmain, which will reduce the length of time they stay disabled on a processor.

Problem summary

Problem conclusion

Temporary fix

Comments

APAR information
  APAR number: OA19400
  Reported component name: 5752 AUX STOR M
  Reported component ID: 5752SC1CW
  Reported release: 709
  Status: OPEN
  PE: NoPE
  HIPER: YesHIPER
  Special Attention: NoSpecatt
  Submitted date: 2006-12-22
  Closed date:
  Last modified date: 2007-10-30

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:
  OA19925

##################################################

Regards from Barcelona

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html

