OMVS terminated abruptly caused an outage
Dear Group, One of our systems suffered an outage when OMVS terminated abruptly with message BPXI019E OMVS DETECTED A SEVERE INTERNAL ERROR THAT WILL REQUIRE A RE-IPL TO CORRECT. A dump was taken, and we had to quiesce the workload and re-IPL the system immediately. We have sent the dump to IBM L2 Support and are waiting to hear from them. In the meantime, I would appreciate any expert advice on what could cause this issue. I have reviewed the log for errors and can't see anything unusual, except that there was a syntax error and one of the files named in BPXPRMxx was not physically present on the system. Could this be a potential cause? Or is anything else known? Any suggestions or answers are highly appreciated. Many thanks! Regards, -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: OMVS terminated abruptly caused an outage
Have you looked at LOGREC? It might have something. John Eatherly -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of RCG Sent: Thursday, November 15, 2012 5:47 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: OMVS terminated abruptly caused an outage [snip]
Re: OMVS terminated abruptly caused an outage
Was the dump an ABEND EC6 with REASON=0E0F0407? If so, have a look at OA39768 or OA22587. Also, it's helpful if you provide your z/OS level and whether or not you are in a shared file system sysplex. To your question about a missing mount, IMHO I wouldn't think that would cause this kind of issue. MA On Thu, 15 Nov 2012 17:16:58 +0530, RCG rkcgowda1...@gmail.com wrote: [snip]
Re: OMVS terminated abruptly caused an outage
Ravi, Sift (now there's a word) through your Syslog just afore the event and you might discover nuggets of enlightenment. Andre In reply to Mary Anne Matyaz maryanne4...@gmail.com: [snip]
Re: OMVS terminated abruptly caused an outage
Hi Mary, Thanks for the response. We are running z/OS 1.12 and we are on a shared file system. I will take a look at the APARs you suggested. Many thanks. On Nov 15, 2012 6:13 PM, Mary Anne Matyaz maryanne4...@gmail.com wrote: [snip]
Re: New way to do UCB lookups
On Wed, 14 Nov 2012 14:39:35 -0500, Bonaduce, Frank wrote: Are dynamic IODF changes actually so prevalent in most environments (especially in Production) that the condition warrants that much
Why wouldn't IODF changes be implemented dynamically in production environments? Anyway, how many shops have CECs dedicated to non-production LPARs? Many years ago, I had a channel switch that had a few lightly used unit-record devices attached. IIRC, there were four devices and four LPARs connected to the switch. One of the LPARs was production. We still wanted to be able to connect any of the devices to any LPAR, but wanted to eliminate the channel switch. The solution was to connect each device to a dedicated channel path and, when there was a need to switch a device, to take the channel path off of one LPAR and put it on to another. We made multiple dynamic IODF changes, removing one CHPID at a time and butting the cables together to connect one device to that CHPID, then another IODF change to bring the device back online to the production LPAR. The entire reconfiguration was done during the day with no disruption, just the unavailability of one device at a time for a relatively short time. -- Tom Marchant
SDSF DA command
At a previous job, when I did a DA the results were equivalent to a DA OSTC. Now when I do a DA it is equivalent to DA OJOB. I'm at a class this week where DA is equivalent to DA OSTC. How do I set DA to default to OSTC? Google and the SDSF manual haven't been much help. Thanks Alan Field Technical Engineer Principal BCBS Minnesota Phone: 651.662.3546 Mobile: 651.428.8826
What happens when an LPAR gets interrupted.
Hi all, I'm looking at what I think is a timing-related issue with a piece of work that is running in another AS in XMS mode on z/OS 1.12. In the dump I can see two STCK values, taken about 100 assembler instructions apart, that show a time difference of over a second. I can't see anywhere in the code path that we would spin or wait, so why the gap?
Looking at the MVS trace I see a couple of TIME-GAP entries for a total of 14 seconds for one of the 4 CPUs in this LPAR. The last thing before the first entry, and the only thing before and after the 2nd TIME-GAP entry, are EMS (emergency signal) entries followed by a PC to SysTrace Processor Snap and a wait. Now it could be that this CPU just has nothing to do, but I'm not sure. The registers in the TCB are old, so I don't think we've been preempted by z/OS on this LPAR, and I was wondering if perhaps the virtual CPU had been pulled from under us by another LPAR, and whether that sort of interrupt behaves differently or just the same as any other.
My questions are:
1) What are the signs of a CPU being preempted and given to another LPAR?
2) Presumably this can happen on any instruction boundary? If it happens, where do the registers get stored? Is it the same place as for a z/OS interrupt, i.e. in the TCB?
3) Does the preempted piece of work have to wait for that CPU to come back, or can it be dispatched on another CPU in the LPAR?
4) I get a dump via an IF slip on the instruction AFTER the STCK, i.e. we ended up with the PSW 2 instructions after the STCK. Would this impact the output from the STCK?
Thanks in advance, Ron.
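For anyone who wants to check a gap like this by hand: bit 51 of the 64-bit TOD clock is incremented once per microsecond, so shifting a STCK value right by 12 bits gives microseconds since the TOD epoch (1900-01-01). The following is only a rough Python sketch of that arithmetic. The function names and the two sample values are made up for illustration and are not taken from the dump discussed here, and leap seconds are ignored.

```python
from datetime import datetime, timedelta

# TOD clock epoch; leap seconds are ignored in this sketch.
TOD_EPOCH = datetime(1900, 1, 1)


def stck_to_microseconds(tod):
    """Drop the 12 low-order bits: bit 51 of the TOD clock is one microsecond."""
    return tod >> 12


def stck_to_datetime(tod):
    """Convert a 64-bit STCK value to an approximate timestamp."""
    return TOD_EPOCH + timedelta(microseconds=stck_to_microseconds(tod))


def stck_gap_seconds(first, second):
    """Elapsed seconds between two STCK values (4096 TOD units per microsecond)."""
    return (second - first) / 4096 / 1_000_000


# Hypothetical values: the second STCK is taken 1.25 seconds after the first.
first = 0xCA5B3C2A10000000
second = first + (1_250_000 << 12)
print(stck_gap_seconds(first, second))
```

Feeding in the two doublewords from the dump (as hex integers) gives the elapsed time directly, which can then be compared with the TIME-GAP entries in the system trace.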
Re: SDSF DA command
Try a command table entry like this: SELECT PGM(ISFISP) NOCHECK NEWAPPL(ISF) PARM(DA OJOB) You choose the entry name and truncation value. I use SSJ. :-) -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Alan Field Sent: Thursday, November 15, 2012 8:46 AM To: IBM-MAIN@LISTSERV.UA.EDU Subject: SDSF DA command [snip]
Re: Updates to the IBM Service Request application coming Nov 14, 2012
At least you can 'reclose' an SR. When I closed one, it remained closed and I could no longer update it until tech support reset it. Under the old, MUCH, MUCH BETTER PMR system, you could reopen for 30 days or so or update after the SR was closed. If the ability to attach files instead of having to ftp them is the only improvement, then they can keep the attachments and bring back the old PMR system. -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Booth, Gerri - DOA Sent: Wednesday, November 14, 2012 4:34 PM To: IBM-MAIN@LISTSERV.UA.EDU Subject: Re: Updates to the IBM Service Request application coming Nov 14, 2012 I had to re-open the SR to add some additional info and that update does show it was made by me. But when I reclosed it, the comment I added when I did that shows that it was made by my co-worker. So it appears that when an SR is closed the application assumes it was closed by the owner.
SRB Again
What I don't understand fills volumes, and when I think I understand something I am often wrong. One of the things I don't understand is SRBs. I know this because what I am doing is not working! Note: When I use the option to turn off SRB processing and call (BALR) the SRB routine, instead of scheduling it, it works great.

I schedule my SRB with the following macro:

         IEAMSCHD EPADDR=SRBRTN,PRIORITY=ENCLAVE,ENCLAVETOKEN=WKETKN,  X
               PARM=ISELPARM,SYNCH=YES,                                X
               PURGESTOKEN=WKTKN,PTCBADDR=WKTOLD, (SAME AS PSATOLD)    X
               SYNCHCOMPADDR=ISELCOMP,SYNCHCODEADDR=ISELCODE,          X
               SYNCHRSNADDR=ISELRSN

In all my reading, SRBs are work that runs in parallel to the scheduling program, but the SYNCH documentation states: SYNCH=YES The SRB is to be scheduled and synchronized with the caller’s work unit; the caller’s work unit is suspended until the SRB completes, is purged, or ends abnormally.

My interpretation of SYNCH=YES is that the scheduling program waits for the SRB to complete. In my reading it also says that I need to use wait and post. I reason that wait and post is an alternate way to do what SYNCH does. My SRB terminates by branching to register 14. When it terminates I expect my code to resume. Correct?

I read somewhere that storage referenced by an SRB had to be in common storage. I reasoned that storage referenced by the SRB had to be in common storage if the SRB runs in another address space. The parm I pass points to storage in my address space. I believe this should work because this SRB runs in my address space (ENV=HOME). Am I correct?

I also read that storage obtained by an SRB had to be in SQA. The Authorized Assembler Services Guide does not say this, so I do not believe it. Am I correct?

When I first started testing everything looked great. My jobs were scheduling the SRB and running to completion. But as it turned out, the SRB was abending and disappearing, so I added the following options:

               PURGESTOKEN=WKTKN,PTCBADDR=WKTOLD, (SAME AS PSATOLD)    X
               SYNCHCOMPADDR=ISELCOMP,SYNCHCODEADDR=ISELCODE,          X
               SYNCHRSNADDR=ISELRSN

Now when my SRB fails I know it, because I display a message like the following:

SRB SCHEDULING RC=28 COMP=08 CODE=000C4000

I am not sure why, but I still did not get a dump (I thought the task recovery routines would create a dump but they didn’t), so I added the following code to my SRB:

         SETFRR A,FRRAD=FRRA,EUT=YES,MODE=FULLXM,WRKREGS=(R1,R2)
FRR      DS    0H
         USING FRR,R15
         ST    R14,FRRSAVE
         LR    R3,R15
         DROP  R15
         LR    R4,R1
*C TAKE A DUMP
         USING FRR,R3
         SDUMPX HDR='SRB ERROR',BRANCH=YES,                            X
               SDATA=(NOSQA,RGN,CSA)
*C RETURN TO FRRRETRY
         L     R2,FRRRETRY
         SETRP RC=4,REMREC=YES,RETREGS=YES,FRESDWA=YES,                X
               DUMP=YES,WKAREA=(R4),RETADDR=(R2)
         L     R14,FRRSAVE
         BR    R14

Now I get a dump but I don’t know how to read it. I do not see any RTM2 information or any control blocks that give me the registers and PSW at the time of error. I looked in the Diagnostic reference manual but I did not see what I needed. Can anyone direct me to the proper documentation, or at least tell me where to look in the dump for the registers and PSW at time of error in an SRB?
Re: SDSF DA command
SDSF Server Alan Field Technical Engineer Principal BCBS Minnesota Phone: 651.662.3546 Mobile: 651.428.8826 From: Lizette Koehler stars...@mindspring.com To: IBM-MAIN@LISTSERV.UA.EDU Date: 11/15/2012 07:50 Subject: Re: SDSF DA command Sent by: IBM Mainframe Discussion List IBM-MAIN@LISTSERV.UA.EDU Are you using ISFPARMS or SDSF Server? Lizette
Re: SDSF DA command
It looks like you have filters active. They are not reset when exiting and reentering SDSF. Try FILTER OFF to see if you get everything. Kees. -Original Message- From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Alan Field Sent: Thursday, November 15, 2012 14:46 To: IBM-MAIN@LISTSERV.UA.EDU Subject: SDSF DA command [snip]
Re: What happens when an LPAR gets interrupted.
I'll take a shot at bits of this:
1) What are the signs of a CPU being preempted and given to another LPAR?
Look at the logical CPU (PR, first column) in the system trace table and the physical CP (CP, last column). When they change (assuming more than one LP and more than one CP), it means the unit of work ended up on another logical/physical CP. Look at CLKC, EXT entries.
3) Does the preempted piece of work have to wait for that CPU to come back or can it be dispatched on another CPU in the LPAR?
Following a CLKC and an EXT entry, you should see a DSP entry with the same PSW address. It can come back both on another logical CP and/or on another physical CP.
Barbara
Re: New way to do UCB lookups
On Thu, 15 Nov 2012 07:26:24 -0600, Tom Marchant m42tom-ibmm...@yahoo.com wrote: On Wed, 14 Nov 2012 14:39:35 -0500, Bonaduce, Frank wrote: Are dynamic IODF changes actually so prevalent in most environments (especially in Production) that the condition warrants that much Why wouldn't IODF changes be implemented dynamically in production environments? Anyway, how many shops have CECs dedicated to non-production LPARs? snip
I think the point was about how often they are done. Quarterly? Monthly? Weekly? Daily? I have never been at a production shop that had a need to do them more often than monthly. Maybe some hardware vendors do them daily on a given system, I don't know. So even if you did it once a day, you would still have to be using one of these programs at the same time as the IODF change to have a problem. The person using the program (a sysprog) is likely the one activating the IODF change, which makes it even less likely to be a problem. -- Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS mailto:m...@mzelden.com Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html Systems Programming expert at http://expertanswercenter.techtarget.com/
Re: What happens when an LPAR gets interrupted.
On Thu, 15 Nov 2012 07:48:32 -0600, Ron MacRae wrote:
1) What are the signs of a CPU being preempted and given to another LPAR? 2) Presumably this can happen on any instruction boundary? If it happens where do the registers get stored? Is it the same place as for a z/OS interrupt, i.e. in the TCB?
AFAIK, PR/SM doesn't know or care about operating system structures. If it did, it would also have to know about VM, GNU/Linux and any other operating systems that could run in the LPAR. When the processor is assigned to another LPAR, PR/SM has to save the state of the processor in storage that PR/SM controls. Presumably, this would be in HSA. Also, if that information was stored in the memory that is assigned to the LPAR, it would be possible for another processor in the LPAR to modify that information, causing errors.
3) Does the preempted piece of work have to wait for that CPU to come back or can it be dispatched on another CPU in the LPAR?
It can be dispatched on another CPU. HiperDispatch tries not to do that.
4) I get a dump via an IF slip on the instruction AFTER the STCK, i.e. we ended up with PSW 2 instructions after STCK.
I think that is normal.
would this impact the output from the STCK?
What do you mean by that? The STCK is finished.
-- Tom Marchant
Re: SDSF DA command
This is controlled by the DADFLT parameter in ISFPRMxx. The explanation of DADFLT from the SDSF manual:

Indicates the default address space types and positions to be shown on the DA panel when members of this group enter a DA command without any parameters. If the list contains more than one item, separate the items with a comma. If this parameter is not coded with at least one value for address space position (IN, OUT, TRANS, READY) and at least one value for address space type (STC, INIT, TSU, JOB), then no address spaces are displayed when the DA command is entered with no parameters. The possible values for the parameter follow. When RMF is installed, SDSF uses RMF as the source of data for the panel.

IN     Displays swapped-in address spaces
OUT    Displays swapped-out address spaces
TRANS  Displays address spaces that are in transition
READY  Displays address spaces that are ready for execution
STC    Displays started tasks
INIT   Displays initiators
TSU    Displays TSO users
JOB    Displays batch jobs
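The rule in that description (at least one position value and at least one type value, otherwise a bare DA shows nothing) is easy to get wrong, so here is a small Python sketch of it. This is only an illustration of the rule as quoted above, not anything SDSF provides: the function name is made up, and the keyword sets contain only the values listed in the manual excerpt.

```python
# Keyword sets copied from the DADFLT description quoted above.
POSITIONS = {"IN", "OUT", "TRANS", "READY"}
TYPES = {"STC", "INIT", "TSU", "JOB"}


def dadflt_shows_anything(values):
    """Return True if the list names at least one position and one type.

    Hypothetical helper: it only models the rule from the manual text,
    it does not read or validate a real ISFPRMxx member.
    """
    chosen = {value.strip().upper() for value in values}
    unknown = chosen - POSITIONS - TYPES
    if unknown:
        raise ValueError(f"unrecognized DADFLT value(s): {sorted(unknown)}")
    return bool(chosen & POSITIONS) and bool(chosen & TYPES)


# A list with positions plus STC, as someone wanting a started-task default might code.
print(dadflt_shows_anything(["IN", "OUT", "TRANS", "STC"]))
# Types only, no position: per the manual text, a bare DA would display nothing.
print(dadflt_shows_anything(["STC", "JOB"]))
```

So a DADFLT list that names only STC, with no position value, would leave the DA panel empty rather than showing started tasks.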
Re: What happens when an LPAR gets interrupted.
Barbara/Tom, Thanks for the replies. I'll try to answer them both together. 1) I'm not seeing any CLKC/EXT/DSP entries for the logical processor within the trace. 2) The physical CPU changes on every trace entry for the 15 second period. There are only 11 trace entries. So it looks like the LCP is jumping all over the place onto different PCPs without any trace indication, other than the CP number. From what Tom says, z/OS doesn't know about the interrupt, so the LCP info is stored somewhere off z/OS and anything on that LCP would remain undispatched until that LCP is put back on a PCP, which may be the same one or a different one. I'm beginning to suspect this LCP was just idle and it's not my problem. Regarding STCK, I was wondering if pipelining of instructions meant that the slip might fire before the STCK was complete. I guess not. Is there any entry put in the MVS trace table when a slip fires? Or is there any other way to see which LCP we were running on at the time? I had a look in the PSA and the control registers but didn't spot anything. Thanks for the good info. Ron.
Re: What happens when an LPAR gets interrupted.
On Thu, 15 Nov 2012 09:45:36 -0600, Ron MacRae wrote:
I'm beginning to suspect this LCP was just idle and it's not my problem.
Your previous post said that you didn't think that it went into a wait. In that case, the CP is not idle, but executing instructions in your program. Perhaps it is just that an LPAR with higher priority is stealing the CP from your LPAR.
Regarding STCK I was wondering if pipelining of instructions meant that the slip might fire before the STCK was complete. I guess not.
Perhaps Jim will have a more definitive answer, but I would think that the IF event would be detected before the completion of the STCK. However, long before the dump was taken, the STCK would have completed.
-- Tom Marchant
Re: SRB Again
Donald Likens wrote: [snip] Can anyone direct me to the proper documentation or at least tell me where to look in the dump for the registers and PSW at time of error in a SRB?
Everything you need to know and the answers to all your questions are in the Authorized Services Guide and Reference manuals. Also the MVS data area manuals are useful if you don't know the format of the SDWA. Henry
Re: Drowning in service units on z/os 1.13 after migrating from v1.11
I have been meaning to post IBM's explanation of this problem. I hope I explain this correctly. I will paraphrase.

From a conference call about 2 weeks ago: IBM fixed a bug in 'base code' at 1.13. The bug was that whenever a batch job performed VSAM I/O, it was switched to the highest priority (this is normal), but then, when control was given back to the application task after the I/O, it continued to run at the highest priority INSTEAD OF the assigned WLM service class, which in our case was called BATPRDMED. Our temporary workaround, in order to get the service we were used to on v1.11, has been to set these jobs to a 'hotter' service class, BATHOT.

IBM told us there was no tech doc or APAR to document this change. What we observed (and possibly you?) when we brought up v1.13 was their 'fix' to this problem. The batch jobs were now running in the correct service class (WAD). They did not tell us exactly when the problem was introduced.

A few days after this explanation 'soaked in', I suddenly remembered several instances, when we were on z/OS v1.11, where batch jobs had preempted our online regions, which are supposed to ALWAYS run with higher priority than batch. I remember thinking that we had a problem at that time, though we never pursued it with IBM. I have now convinced myself that this was the manifestation of the bug they fixed at 1.13.

While the lack of doc and background info was upsetting, I think IBM made an honorable attempt to 'make up for it' by providing some very helpful VSAM tuning advice. I have to say their advice has improved our batch performance significantly. Their advice was to implement SMB in our long-running batch VSAM jobs. We are a VSAM-centric shop with only a tiny bit of DB2, so we were probably impacted more than the typical shop.

HTH,
-Jim
Re: What happens when an LPAR gets interupted.
On 15 November 2012 08:48, Ron MacRae ronmac...@hotmail.co.uk wrote:

4) I get a dump via an IF slip on the instruction AFTER the STCK, i.e. we ended up with PSW 2 instructions after STCK. would this impact the output from the STCK?

I'm not clear on what you mean by impact. There is no external way of telling anything about the real time, so your program (and for that matter the operating system) cannot know what value an STCK should have stored. But once observed (let's say your program copies the stored value somewhere), the general rules for conceptual order of execution, and the specific ones for TOD clock behaviour (strictly monotonically increasing values) apply.

Unless they've allowed a huge deviation from the Principles of Operation, which is most unlikely, notwithstanding whatever nifty pipelining and OOE and such they may do under the covers, the conceptual order of execution as observed by programs is preserved. And that means, among other things, that the results stored by a conceptually earlier instruction are not affected by things like an interrupt taken on a conceptually later one.

Tony H.
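For anyone lining up a stored STCK value against other timestamps in a dump, here is a minimal sketch of the conversion. It is not from the thread; it assumes the standard 64-bit TOD format, in which bit 51 increments once per microsecond and the epoch is 1900-01-01 00:00:00, and it ignores leap seconds. The function names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Standard TOD clock epoch (leap-second adjustments are ignored here)
TOD_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def tod_to_datetime(tod: int) -> datetime:
    """Convert a 64-bit STCK value to a UTC datetime."""
    microseconds = tod >> 12          # bit 51 ticks once per microsecond
    return TOD_EPOCH + timedelta(microseconds=microseconds)


def stored_in_order(earlier: int, later: int) -> bool:
    """Two STCK results from the same clock should be strictly increasing."""
    return earlier < later


if __name__ == "__main__":
    first = (36524 * 86400 * 1_000_000) << 12   # midnight, 1 Jan 2000
    second = first + (1 << 12)                  # one microsecond later
    print(tod_to_datetime(first), tod_to_datetime(second))
    print(stored_in_order(first, second))
```

The comparison helper only restates the point made above: once two values have been stored, the later one compares higher, whatever happened to the logical processor in between.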
Re: Blades versus z was Re: Turn Off Another Light - Univ. of Tennessee
jwgli...@gmail.com (John Gilmore) writes:

Lynn's most recent response is unsatisfactory, in substance evasive. Let us for the sake of the argument stipulate, though this is not usually the case, that some non-mainframe server can perform some single I/O operation faster than some mainframe. It turns out that this stipulation does not much help Lynn's argument. Mainframes handle aggregate I/O workloads, comprised of many single I/O operations, faster and with much less CP involvement than any non-mainframe server. CPU involvement is much lower, and many I/O operations are handled concurrently. The channels, which for some reason Lynn seems to want to disparage, do most of the work.

Mark Post's point nevertheless remains crucial. Every case is indeed different. There are single applications that, particularly when they are considered in isolation, are easy enough to accomplish on a non-mainframe server. It is when many such applications are aggregated together that the mainframe comes into its own as an alternative, a highly attractive one, to server farms.

recent discussion in a.f.c. about fibre-channel standard (work started 1988) ... in the 90s, some POK mainframe channel engineers started to participate ... working on layering mainframe channel conventions on top of fibre-channel ... which drastically cuts the throughput (compared to underlying fibre-channel) ... and eventually turns into FICON.
http://www.garlic.com/~lynn/2012o.html#24 Assembler vs. COBOL--processing time, space needed

in the past couple years ... there has been some work on FICON with introduction of TCW zHPF to coming closer to approx. the underlying fibre-channel throughput (looks to give FICON about factor of three times improvement).

recent posts mentioning TCW enhancement to FICON
http://www.garlic.com/~lynn/2012m.html#4 Blades versus z was Re: Turn Off Another Light - Univ. of Tennessee
http://www.garlic.com/~lynn/2012m.html#5 Blades versus z was Re: Turn Off Another Light - Univ. of Tennessee
http://www.garlic.com/~lynn/2012m.html#11 Blades versus z was Re: Turn Off Another Light - Univ. of Tennessee
http://www.garlic.com/~lynn/2012m.html#13 Intel Confirms Decline of Server Giants HP, Dell, and IBM
http://www.garlic.com/~lynn/2012m.html#28 I.B.M. Mainframe Evolves to Serve the Digital World
http://www.garlic.com/~lynn/2012m.html#43 Blades versus z was Re: Turn Off Another Light - Univ. of Tennessee
http://www.garlic.com/~lynn/2012n.html#19 How to get a tape's DSCB
http://www.garlic.com/~lynn/2012n.html#44 Under what circumstances would it be a mistake to migrate applications/workload off the mainframe?
http://www.garlic.com/~lynn/2012n.html#51 history of Programming language and CPU in relation to each
http://www.garlic.com/~lynn/2012n.html#70 Under what circumstances would it be a mistake to migrate applications/workload off the mainframe?
http://www.garlic.com/~lynn/2012n.html#72 Mainframes are still the best platform for high volume transaction processing

IBM has z196 benchmark with peak of 2m IOPS with 104 FICON channels, 14 storage subsystems, and 14 system assist processors. It mentions that the 14 SAPs are capable of peak 2.2m SSCH/sec running at 100% cpu busy, but recommends SAPs run at 70% or less (i.e. 1.5m SSCH/sec).

there is also a recent emulex announcement single fibre-channel for e5-2600 capable of over one million IOPS (compared to z196 peak of 2m IOPS using 104 FICON channels)

other aside, lots of past posts getting to play (IBM) disk engineer in bldgs. 14&15 ... and working on mainframe channel disk thruput
http://www.garlic.com/~lynn/subtopic.html#disk

--
virtualization experience starting Jan1968, online at home since Mar1970
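The benchmark figures quoted in the post above can be checked with a few lines of arithmetic. This is only a sketch using the numbers as given in the post (2.2m SSCH/sec peak, 70% recommended SAP utilization, 2m IOPS over 104 FICON channels); the function names are made up for the example and nothing here is a new measurement.

```python
def rate_at_utilization(peak_per_sec: int, utilization_pct: int) -> int:
    """Rate sustained when running at a given percent of the peak."""
    return peak_per_sec * utilization_pct // 100


def average_per_channel(total_iops: int, channels: int) -> float:
    """Average I/O operations per second carried by each channel."""
    return total_iops / channels


if __name__ == "__main__":
    # 14 SAPs: 2.2m SSCH/sec at 100% busy, recommended 70% or less
    print(rate_at_utilization(2_200_000, 70))          # 1540000
    # z196 benchmark: 2m IOPS spread over 104 FICON channels
    print(round(average_per_channel(2_000_000, 104)))  # 19231
```

So the "1.5m SSCH/sec" in the post is 1.54m rounded down, and the z196 peak works out to roughly 19 thousand IOPS per FICON channel on average, against the single-adapter figure of over one million quoted for the Emulex announcement.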
Re: What happens when an LPAR gets interupted.
Is there any trace put in the MVS trace table when a slip fires? Or is there any other way to see which LCP we were running on at the time? I had a look in the PSA and the control registers but didn't spot anything.

SYSTRACE will have an SPER entry when a SLIP PER trap is matched.

Jim Mulder z/OS System Test IBM Corp. Poughkeepsie, NY
Re: SRB Again
Add SUMDUMP to your dump options - that will get you all the register areas.

I would expect IPCS option 2.2 would give you the error information.

The trace table should have a *RCVY entry, and the abend PGM entry before it.

You may wish to make a SLIP for the SRB abend instead of your FRR to make sure that you get the standard information.

--
Binyamin Dissen bdis...@dissensoftware.com
http://www.dissensoftware.com

Director, Dissen Software, Bar Grill - Israel

Should you use the mailblocks package and expect a response from me, you should preauthorize the dissensoftware.com domain.
I very rarely bother responding to challenge/response systems, especially those
Re: CA Common services ENF Monitor reporting high CPU time
We observed, while using STROBE, apparent high CPU use in module CAS9C66. From CA we found that they had the following on file:

APAR #: RO43562  Product: ENFCIC  Release: 14.0  Solution #: 7
Type: OS: OS Group: GCCOMC ISL SUP 2
Title: PERFORMANCE ENHANCEMENT ON Z/196 PROCESSORS.
** VERSION 0 EFFECTIVE: MAR 31 2012 2:09 **
***NOTE*** PE: YES CORRECTED BY: RO45646

PROBLEM DESCRIPTION: After a processor upgrade to z/196, some performance monitors may show increase cpu usage in various csects in the CAS9Cxx module. The flagged area is usually in a very tight range and will contain a SPKA instruction. This APAR will have a greater affect on regions running STGPROT=NO.

SYMPTOMS: Performance monitors show increased activity in CAS9Cxx modules.

We had just moved to a z196 and STROBE was being used to compare performance against the previous processor (no longer available for direct comparison). Since the z196 and z114 are from the same design cycle and zEC12 is similar to a z196 we have asked whether this situation could exist on a z114 or a zEC12 and they said no. Has anyone any idea why this might occur, and has anyone seen it on a z114 or zEC12? Is the set up code for a SPKA instruction something that would be very different on a z196 from any other processor?

One of the CPU designers gave me the following explanation:

System z processor development has identified an aspect of the z196 processor that performs worse than the equivalent instruction on a z10 processor. When an SPKA instruction is executed in problem state, the new out-of-order design of the z196 processor requires more pipeline stalls to give functionally correct results than in prior generations of processors. Therefore, on workloads (i.e. CICS running with STGPROT=YES) that have an intense amount of SPKAs in problem state, this can show up as the z196 spending more time executing the SPKA instruction. Some vendor performance tools or single instruction benchmarks may uncover this additional time spent on the SPKA instruction. This change in SPKA behavior does not offset the benefits the z196 provides for the CICS environment. This aspect of the longer SPKA execution time can be exacerbated by running on a subcapacity machine.

Jim Mulder z/OS System Test IBM Corp. Poughkeepsie, NY
Re: What happens when an LPAR gets interupted.
1) I'm not seeing any CLCK/EXT/DSP entries for the logical processor within the trace.
2) The Physical CPU changes on every trace entry for the 15 second period. There are only 11 trace entries.

It may be unusual to have 15 seconds of trace from all processors unless the system is mostly idle. In your analysis, are you considering trace entries which precede this message?

TRACE DATA IS NOT AVAILABLE FROM ALL PROCESSORS BEFORE THIS TIME.

You should generally not consider entries which precede this message unless you are aware of the implications.

So it looks like the LCP is jumping all over the place onto different PCPs without any trace indication, other than the CP number. From what Tom says z/OS doesn't know about the interrupt so the LCP info is stored somewhere off z/OS and anything on that LCP would remain undispatched until that LCP is put back on a PCP, which may be the same one or a different one.

I would like to add that I believe there is interaction between z/OS and PR/SM in state saving. Somehow somewhere. I believe I read somewhere that with Hiperdispatch the MVS dispatcher works with PR/SM and has one WUQ (work unit q) per physical (or was that logical?) cp.

That change was made generally, regardless if Hiperdispatch is used or not.

z/OS is not directly involved in the undispatching of a logical processor on which it is running, at least, not until the new EC12 processor. The EC12 can present a new Warning Track external interruption when it wants to undispatch a logical processor. This gives z/OS the opportunity to first undispatch the z/OS workunit so that z/OS can dispatch it on another logical processor.

Jim Mulder z/OS System Test IBM Corp. Poughkeepsie, NY
SPKA on z196 (was Re: CA Common services ENF Monitor reporting high CPU time)
On Thu, 15 Nov 2012 13:56:43 -0500, Jim Mulder d10j...@us.ibm.com wrote:

One of the CPU designers gave me the following explanation:

System z processor development has identified an aspect of the z196 processor that performs worse than the equivalent instruction on a z10 processor. When an SPKA instruction is executed in problem state, the new out-of-order design of the z196 processor requires more pipeline stalls to give functionally correct results than in prior generations of processors. Therefore, on workloads (i.e. CICS running with STGPROT=YES) that have an intense amount of SPKAs in problem state, this can show up as the z196 spending more time executing the SPKA instruction. Some vendor performance tools or single instruction benchmarks may uncover this additional time spent on the SPKA instruction. This change in SPKA behavior does not offset the benefits the z196 provides for the CICS environment. This aspect of the longer SPKA execution time can be exacerbated by running on a subcapacity machine.

My client was able to see the difference, especially in a particular vendor's CICS monitor. Although according to the vendor, what was seen via Strobe was not completely accurate. There were changes made to CICS to remove unnecessary SPKAs but the monitor's code had none that could be removed.

--
Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS
mailto:m...@mzelden.com
Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html
Systems Programming expert at http://expertanswercenter.techtarget.com/
Re: New way to do UCB lookups
yeah s

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of Ed Gould
Sent: Wednesday, November 14, 2012 5:56 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: New way to do UCB lookups

Frank,

In one environment it was at least a weekly occurrence. In others once a month and in others every 6 months. In the weekly environment it was a lively change management adventure as we did have several outages (sometimes the entire sysplex). It was fraught with issues and maybe if we had been more aggressive with maintenance the outage might not have happened. But the boss bought off on the chances, so...

We did not have the manpower to stay current like we should have been in that type of environment, but the boss yelled a few times and chasing down the cause was not a witch hunt per se but we just tried to tell them about issues and that is all we could do.

At times we would get DASD that had data on it from a previous installation. The DASD people just wiped it out without looking or caring. As for CPUs it was at times scary (at least for me) as we had so many issues with OEM vendors that we had so many serial numbers floating around it was a PITA. That was scarier (to me) than the upgrades. I won't even go into tape drive types.

Ed

On Nov 14, 2012, at 1:39 PM, Bonaduce, Frank wrote:

This ongoing discussion prompts a question: Are dynamic IODF changes actually so prevalent in most environments (especially in Production) that the condition warrants that much consideration? I, for one, would tend to doubt it. If it is the case in a 'sandbox' or development type environment, it's likely a tolerable condition.

The advantage of using established facilities like UCBSCAN is that you can exploit parameters, like IOCTOKEN, to indicate if there is something of this nature happening, which allows you the option of whether or not to react to it. In the case of DASD, the recommendation for some time has been to PIN the UCB if exclusivity is required and subsequently UNPIN it when it is no longer needed. These operations, of course, require authorization.

Frank.
GSG Systems.

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-m...@listserv.ua.edu] On Behalf Of Sam Golob
Sent: Wednesday, November 14, 2012 2:34 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: New way to do UCB lookups

Hi Folks,

Somebody please enlighten me. If you're trying to scratch a dataset on a pack, and somebody else is in the middle of doing an IODF change at the time, what is the difference if you are obtaining a copy of the UCB (to determine what's on the disk pack), or the real UCB itself?

I'm not expecting a complete answer from somebody, but I'd at least like a reference to a manual or manuals where the perspective and advantages/disadvantages of copied (and captured) UCBs are explained, as opposed to the real UCBs. I want to read about it. Please show me where.

Thanks. All the best
Sam
Re: What happens when an LPAR gets interupted.
On Thu, 15 Nov 2012 14:23:56 -0500, Jim Mulder wrote:

The EC12 can present a new Warning Track external interruption when it wants to undispatch a logical processor. This gives z/OS the opportunity to first undispatch the z/OS workunit so that z/OS can dispatch it on another logical processor.

Cool! Thanks for that information, Jim. Does z/OS have support for that yet?

--
Tom Marchant
FW: What happens when an LPAR gets interupted.
In MXG 30.08 (and in CHANGES at www.mxg.com/changes):

Change 30.208  Support for APAR OA37803 which adds Warning Track
VMAC70         Interrupt Facility.
Oct 6, 2012    -TYPE70EC and TYPE70PR new variables:
                 SMF70WTI='DURATION*LP WAS YIELDED*DUE TO WTI'
                 SMF70WTS='WTI-S*RETURNED*WITHIN*GRACE*PERIOD'
                 SMF70WTU='WTI-S*UNABLE*TO RETURN*IN GRACE'
Re: CA Common services ENF Monitor reporting high CPU time
There's a conflict here.

The CA APAR says "This APAR will have a greater affect on regions running STGPROT=NO."

Jim said "Therefore, on workloads (i.e. CICS running with STGPROT=YES) ..., this can show up as the z196 spending more time"

Can't be both.

Alan Schwartz
ITO Global Services Operations and Engineering
Xerox Business Services, LLC
1500 Towerview Rd. Eagan, MN. 55121-1346
p. 612.266.3150 m. 651.274.5819 f. 612.266.3196
Re: SRB Again
Donald,

First, SRB's can and do reside in the private area, and can freely access storage in the private area, or in a dataspace (subject to access list requirements of course).

Second, when you specify SYNCH=YES on the IEAMSCHD, the system will handle the synchronization process between the scheduling TCB and the SRB; your application doesn't have to issue any WAIT or POST calls for this.

As for your environment at the time of abend, try issuing IP STATUS FAILDATA; it should show the PSW and register related data at the time of error.

Good Luck

===
Wayne Driscoll
OMEGAMON DB2 L3 Support/Development
wdrisco(AT)us.ibm.com
===
Re: SRB Again
Just to clarify, in the first sentence when I said "SRB's can ..." I meant "SRB routines can ...".

===
Wayne Driscoll
OMEGAMON DB2 L3 Support/Development
wdrisco(AT)us.ibm.com
===

From: Wayne Driscoll/Chicago/IBM@IBMUS
To: IBM-MAIN@listserv.ua.edu
Date: 11/15/2012 03:47 PM
Subject: Re: [IBM-MAIN] SRB Again
Sent by: IBM Mainframe Discussion List IBM-MAIN@listserv.ua.edu

Donald,

First, SRB's can and do reside in the private area, and can freely access storage in the private area, or in a dataspace (subject to access list requirements of course).

Second, when you specify SYNCH=YES on the IEAMSCHD, the system will handle the synchronization between the scheduling TCB and the SRB; your application doesn't have to issue any WAIT or POST calls for this.

As for your environment at the time of the abend, try issuing

IP STATUS FAILDATA

It should show the PSW and register related data at the time of error.

Good Luck

===
Wayne Driscoll
OMEGAMON DB2 L3 Support/Development
wdrisco(AT)us.ibm.com
===

From: Donald Likens dlik...@infosecinc.com
To: IBM-MAIN@listserv.ua.edu
Date: 11/15/2012 07:54 AM
Subject: [IBM-MAIN] SRB Again
Sent by: IBM Mainframe Discussion List IBM-MAIN@listserv.ua.edu

What I don't understand fills volumes, and when I think I understand something I am often wrong. One of the things I don't understand is SRBs. I know this because what I am doing is not working!

Note: When I use the option to turn off SRB processing and call (BALR) the SRB routine, instead of scheduling it, it works great.

I schedule my SRB with the following macro:

         IEAMSCHD EPADDR=SRBRTN,PRIORITY=ENCLAVE,ENCLAVETOKEN=WKETKN,  X
               PARM=ISELPARM,SYNCH=YES,                                X
               PURGESTOKEN=WKTKN,PTCBADDR=WKTOLD, (SAME AS PSATOLD)    X
               SYNCHCOMPADDR=ISELCOMP,SYNCHCODEADDR=ISELCODE,          X
               SYNCHRSNADDR=ISELRSN

In all my reading, SRBs are work that runs in parallel to the scheduling program, but the SYNCH documentation states:

    SYNCH=YES
    The SRB is to be scheduled and synchronized with the caller’s work unit; the caller’s work unit is suspended until the SRB completes, is purged, or ends abnormally.

My interpretation of SYNCH=YES is that the scheduling program waits for the SRB to complete. In my reading it also says that I need to use wait and post. I reason that wait and post is an alternate way to do what SYNCH does. My SRB terminates by branching to register 14. When it terminates I expect my code to resume. Correct?

I read somewhere that storage referenced by an SRB had to be in common storage. I reasoned that storage referenced by the SRB had to be in common storage if the SRB runs in another address space. The parm I pass points to storage in my address space. I believe this should work because this SRB runs in my address space (ENV=HOME). Am I correct?

I also read that storage obtained by an SRB had to be in SQA. The Authorized Assembler Services Guide does not say this, so I do not believe it. Am I correct?

When I first started testing everything looked great. My jobs were scheduling the SRB and running to completion. But as it turned out the SRB was abending and disappearing, so I added the following options:

               PURGESTOKEN=WKTKN,PTCBADDR=WKTOLD, (SAME AS PSATOLD)    X
               SYNCHCOMPADDR=ISELCOMP,SYNCHCODEADDR=ISELCODE,          X
               SYNCHRSNADDR=ISELRSN

Now when my SRB fails I know it because I display a message like the following:

SRB SCHEDULING RC=28 COMP=08 CODE=000C4000

I am not sure why, but I still did not get a dump (I thought the task recovery routines would create a dump but they didn’t), so I added the following code to my SRB:

         SETFRR A,FRRAD=FRRA,EUT=YES,MODE=FULLXM,WRKREGS=(R1,R2)

FRR      DS    0H
         USING FRR,R15
         ST    R14,FRRSAVE
         LR    R3,R15
         DROP  R15
         LR    R4,R1
*C TAKE A DUMP
         USING FRR,R3
         SDUMPX HDR='SRB ERROR',BRANCH=YES,                            X
               SDATA=(NOSQA,RGN,CSA)
*C RETURN TO FRRRETRY
         L     R2,FRRRETRY
         SETRP RC=4,REMREC=YES,RETREGS=YES,FRESDWA=YES,                X
               DUMP=YES,WKAREA=(R4),RETADDR=(R2)
         L     R14,FRRSAVE
         BR    R14

Now I get a dump but I don’t know how to read it. I do not see any RTM2 information or any control blocks that give me the registers and PSW at the time of error. I looked in the Diagnosis Reference manual but I did not see what I needed. Can anyone direct me to the proper documentation, or at least tell me where to look in the dump for the registers and PSW at the time of error in an SRB?
Nostalgia time
For those who love the sights and sounds of a card reader in action, here are a couple of badly made videos I took at the National Computer Museum in Bletchley Park, England (the same Bletchley Park as Alan Turing and the Enigma code breakers). They show a 1442 connected to an 1130 loading its deck of cards. One video is from the front, the other looking inside from the rear.

http://youtu.be/w62NC1R6WLs
http://youtu.be/viGAZwDB57A
Re: New way to do UCB lookups
I had judged that there was little more to be said (or at any rate little more for me to say) about this issue, but Mark Zelden has changed my mind. He wrote:

begin extract
I think the point was about how often they are done. Quarterly? Monthly? Weekly? Daily? I have never been at a production shop that had a need to do them more often than monthly. Maybe some hardware vendors do them daily on a given system, I don't know.
end extract

because this does not seem to me to be the point at all. To use a lawyer's word, we have stipulated/agreed that disaster can/will ensue without serialization. Mark's view is that, this conceded, the important question is not whether it will occur but how often. He and I disagree about this. My view is that if the unfortunate can occur it will occur, and at a disagreeably inappropriate time.

Moreover, the substantive question at issue is a curiously trivial one. There are z/OS macros available for implementing the serializations required here. Any sysprog competent to chase control-block pointers at all is presumably also competent to use these macros. Why then court the unfortunate by refusing to do so?

Russian roulette is a diseased pastime, all but independently of whether the probability that a trial will have an unfortunate outcome is 1/8 or 1/4096.

John Gilmore, Ashland, MA 01721 - USA
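The Russian-roulette remark can be put in numbers. Assuming each unserialized lookup fails independently with the same probability p (an illustrative assumption; nobody in the thread measured such a figure), the chance of at least one failure in n lookups is 1 - (1 - p)^n. A small Python sketch, with a hypothetical function name:

```python
def prob_at_least_one_failure(p, trials):
    """Chance that at least one of `trials` independent attempts fails.

    Assumes every attempt fails with the same probability `p` and that
    attempts are independent, which is a simplification.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if trials < 0:
        raise ValueError("trials must be non-negative")
    # Probability that every attempt succeeds, subtracted from 1.
    return 1.0 - (1.0 - p) ** trials


if __name__ == "__main__":
    for p in (1 / 8, 1 / 4096):
        for trials in (1, 100, 4096):
            chance = prob_at_least_one_failure(p, trials)
            print("p=%.6f trials=%d chance=%.4f" % (p, trials, chance))
```

Even at odds of 1/4096 per trial, 4096 trials give roughly a 63% chance of at least one failure under these assumptions, so a small per-run risk does not stay small across many runs.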
Re: Z196 and z/OS 1.4
We are running z/OS 1.5 at two client sites under VM on two separate z196s. Admittedly not 1.4, but then again, not much difference either. Both are in the process of upgrading to z/OS 1.13.

The APAR for the z/OS bug was OA30777. No fix was ever created in our code database for z/OS 1.4 or z/OS 1.5.

My recollection is that during initial z/OS bringup on the z196 on the hardware test floor, this problem was a pretty solid occurrence IPLing z/OS as SIE Guest-1 (i.e., not under VM). There are TLB-related differences when running as SIE Guest-2 (i.e., under VM), and there may be more frequent TLB purges when running under VM. Some of these things might contribute to making the problem less frequent, or possibly unlikely, under VM.

Also, I am thinking that we did not find an actual storage reference which was installing a TLB entry without the common bit for the segments in question, and that we speculated that it may have been due to speculative execution. But that was three years ago, and my memory isn't as sharp as it once was. Well, unless we are talking about dialog from Star Trek (The Original Series), or TV and radio commercial jingles from the 1960s.

Jim Mulder
z/OS System Test
IBM Corp.
Poughkeepsie, NY
Re: New way to do UCB lookups
OTOH I'm curious about the purported risk of a 'moving device'. In order to make a significant change to a device--especially UCB address--the volume has to be offline or else the dynamic ACTIVATE fails.

How about a dynamic activate that succeeded, but a component (catalog) that doesn't listen to the ENF signal that a UCB was removed, that had stored the old address in a common control block, and that *after* the activate abends CAS with very frequent 0C4s trying to touch the (now freemained) storage?

While IBM took an APAR on that 4 or 5 years ago, when we were hit with this problem and had an unscheduled installation-wide outage because of it, the APAR was closed SUG or some such, and as far as I know it is still NOT fixed. During the escalation I was assured that IBM would fix it (in 1.13 at the latest), but the APAR still doesn't say that the requirement was fulfilled. And that's IBM code.

And don't tell me how to move a catalog. The documentation on how to do it if you delete and then re-add the UCB to your config under a different name was written due to our outage. I think the HCD book now has a footnote that says 'don't do it'.

Barbara
New way to do UCB lookups
Hi Folks,

I'm answering John Gilmore's position mostly, although I've read the entire thread. John presumes that you have to be safe at all costs.

My answer is: What's the whole point of this lookup method? Gilbert himself told it to me. IT'S AUTHORIZATION. You want to be able to do a complete UCB scan of all real UCB's to get the real-time information without having to be authorized. That's IT. Beginning and end. I don't think that IBM provides another way to do it. And STILL not to be authorized.

Therefore I've tried it for myself, and have somewhat succeeded in doing some pretty telling demonstrations. Since I'm in the middle of the project, and I don't know how far it'll go, I don't want to say anything now. But it's pretty amazing what information you can show, TO ANY TSO USER, for example.

So that's my point. But I'm a LOT wiser for all of the stuff that you folks brought up. Thanks much. And if you have any more to say, I'm pleased to hear it.

All the best of everything to you and yours...

Sincerely,
Sam