Re: $HASP311 jobname re-queued at end of memory and held
On Tue, 24 Mar 2015 09:02:49 +0100, Peter Hunkeler wrote:

So, the action taken at next warmstart will be determined by the setting of JOURNAL and RESTART on JOBCLASS. However, you could decide to release the job ($AJ) or cancel it ($PJ) immediately. ... And an additional hint: The job is requeued back for execution (per the $HASP311 message) since JOURNALing is active. You can confirm this by issuing a $DJOBCLASS(y),LONG where y is the respective jobclass of DB2A7417 and I would expect you find JOURNAL=YES. This is working as expected.

Thanks for the update Peter. Sounds like a perfect opportunity for JES2 support to get organised and maybe set up a Health Checker to accommodate this scenario. Or do they really expect us to schedule warmstarts regularly just for them ... ? Too mundane a situation to bother JES2MON I guess.

Shane ...

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: $HASP311 jobname re-queued at end of memory and held
On Tue, 24 Mar 2015 09:02:49 +0100, Peter Hunkeler wrote:

an End-of-Memory condition

The way I used to understand it is that a "Memory" was once a common term for an address space, and that the End of memory message simply meant that the address space (the initiator) terminated, taking the job with it. I'm not sure if my understanding is correct.

-- Tom Marchant
Re: $HASP311 jobname re-queued at end of memory and held
Note also the DIAGxx parameter VSM CHECKREGIONLOSS(above,below), which looks at an initiator address space at the end of each job and automatically restarts the initiator if a loss of virtual storage exceeding the specified values is detected. This parameter is intended to avoid or minimize job errors caused by initiator storage becoming fragmented or by storage creep. I think CHECKREGIONLOSS was introduced in z/OS 1.11? Not sure.

Ant.
Re: $HASP311 jobname re-queued at end of memory and held
Oh... of course the proper syntax is VSM CHECKREGIONLOSS(below,above) not (above,below), if anyone cares.
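The check described for CHECKREGIONLOSS can be pictured with a small Python sketch. This is only an illustration of the idea as stated in the thread, not how the initiator implements it. It assumes that "below" and "above" refer to storage below and above the 16 MB line, and that exceeding either threshold is enough to trigger the restart. The class name, function name, and threshold values are all hypothetical.

```python
from dataclasses import dataclass

KB, MB = 1024, 1024 * 1024


@dataclass(frozen=True)
class RegionLossLimits:
    """Hypothetical thresholds in bytes, kept in CHECKREGIONLOSS order (below, above)."""
    below: int  # tolerated loss of region below the 16 MB line (assumption)
    above: int  # tolerated loss of region above the 16 MB line (assumption)


def initiator_should_restart(lost_below: int, lost_above: int,
                             limits: RegionLossLimits) -> bool:
    """Return True when either measured loss exceeds its threshold.

    Treating "either threshold exceeded" as the trigger is an assumption.
    """
    return lost_below > limits.below or lost_above > limits.above


# Illustrative values only; choose real thresholds from your own measurements.
limits = RegionLossLimits(below=256 * KB, above=30 * MB)

print(initiator_should_restart(64 * KB, 2 * MB, limits))   # False: both losses within limits
print(initiator_should_restart(512 * KB, 2 * MB, limits))  # True: loss below the line too large
```

Keeping the two values in a named structure rather than a bare pair avoids exactly the (above,below) versus (below,above) mix-up corrected in this message.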
Re: $HASP311 jobname re-queued at end of memory and held
Cross-posted to IBM-MAIN and JES2-L.

A while ago I was seeking help regarding jobs being re-queued in an EOM situation. I did not get the desired insight here, so we asked IBM for details. We got the explanation below, which IBM support found only in internal documents, not in any FM. I thought I'd post it here so it will be available in the archives.

Q: We have a DB2 batch job that suffers from several S878-10 and finally the initiator fails completely. At the end of that a JES2 message $HASP311 appears and the job sits in the hold queue from then. Why is the job re-queued and where is that documented?

Answer from IBM support (19/02/15 13:25):

$HASP311 with the associated message text is always issued by JES2 in response to a request by the Initiator to requeue the job that has gone through EOM processing. This is normal behavior and is not new and not a defect. From a JES2 perspective this has always worked this way. It is the Initiator that is requesting the requeue of the job AND if there was an EOM involved before job completion $HASP311 is issued with the corresponding text.

... and a bit more and better explanation:

There are a few reasons why a job could be marked for re-execution, but the most common cause is that the initiator (where the job executes in), AND NOT THE JOB ITSELF, experiences an error, like an End-of-Memory condition which appears to be your case (the 40D indicates an out-of-storage condition during RTM processing, usually following ABEND80A or ABEND878). In that case, we turn on a flag in the control block structure to indicate that the job is eligible for restart (this is the same flag that would be set if $EJOB command was used to restart a job). JES2 requeues the job for warmstart and puts it in operator hold to prevent the job from failing recursively. During warmstart processing, JES2 will decide whether this job needs to be automatically restarted based on the setting of JOURNAL and RESTART option.

The JES2 Init and Tuning Guide has more details about this under topic 1.13.2.6 titled 'Warm start considerations'. Here is an excerpt:

...If a job in execution was journaled, it is updated to indicate warmstart, and the job is queued for re-execution. If a job in execution has no journal, it is tested to determine whether restart was indicated for the job. If restart was indicated, the job is updated to remove any warmstart indications, and the job is queued for re-execution. If restart was not indicated, the job is queued for output processing.

So, the action taken at next warmstart will be determined by the setting of JOURNAL and RESTART on JOBCLASS. However, you could decide to release the job ($AJ) or cancel it ($PJ) immediately.

... And an additional hint:

The job is requeued back for execution (per the $HASP311 message) since JOURNALing is active. You can confirm this by issuing a $DJOBCLASS(y),LONG where y is the respective jobclass of DB2A7417 and I would expect you find JOURNAL=YES. This is working as expected.
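The excerpt from the Init and Tuning Guide amounts to a small decision table. The Python sketch below restates only what the excerpt says about a job that was in execution at warm start. The function and enum names are hypothetical, and the two booleans stand in for the JOURNAL and RESTART settings without modelling how JES2 actually stores or evaluates them.

```python
from enum import Enum


class Disposition(Enum):
    RE_EXECUTION = "queued for re-execution"
    OUTPUT = "queued for output processing"


def warm_start_disposition(journaled: bool, restart_indicated: bool) -> Disposition:
    """Restate the quoted excerpt for a job that was in execution.

    journaled          -- the job had a journal (JOURNAL=YES on its job class)
    restart_indicated  -- restart was indicated for the job
    """
    if journaled:
        # "it is updated to indicate warmstart, and the job is queued for re-execution"
        return Disposition.RE_EXECUTION
    if restart_indicated:
        # "updated to remove any warmstart indications ... queued for re-execution"
        return Disposition.RE_EXECUTION
    # "If restart was not indicated, the job is queued for output processing."
    return Disposition.OUTPUT


# Print the full decision table for the four combinations.
for journaled in (True, False):
    for restart in (True, False):
        result = warm_start_disposition(journaled, restart)
        print(f"journal={journaled!s:5} restart={restart!s:5} -> {result.value}")
```

Per the excerpt, only a job with neither a journal nor a restart indication goes to output processing; in the case discussed here JOURNAL=YES is expected, so the held job would be re-executed at the next warm start unless it is released or cancelled first.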
Re: $HASP311 jobname re-queued at end of memory and held
It seems I've seen this kind of behavior when the initiator the job was running on would ABEND. That may be something to look at. To address that symptom, I put some automation around $PI/$SI once a week. That was circa OS/390 2.10 time frame.
$HASP311 jobname re-queued at end of memory and held
Crossposted. I had posted this on the JES2 list but got no insight so far. Hopefully someone here can shed some light on this.

I stumbled across a job that had ended and was in status HELD in the JES2 execution queue. At first I thought someone must have issued a $HJnnn while the job was running. I then saw that the job had had some problems (878-10 abends), which finally led to the job and the initiator being terminated at end of memory (initiated by DB2 - step ends with S0F4 reason F30905).

Now the interesting thing, which I didn't know (and don't understand either), and for which I have not yet been able to find documentation: JES2 writes message

$HASP311 jobname RE-QUEUED AT END OF MEMORY AND HELD

Well, I've found the doc on message $HASP311 but it doesn't give much detail.

Questions I have:
- When are jobs eligible for this kind of JES2 automatic restart?
- What are the factors being part of this decision?
- Not sure all jobs are restartable per se.
- What would have happened to the job at next JES2 warm start?
- How to correctly handle jobs in this state?
- Can this be configured?

Can someone shed some light on this and/or point me to the document where I can read how this works?
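Since a job in this state sits quietly in the hold queue, one way to notice it is to scan captured console or syslog text for the message. The Python sketch below assumes the message appears exactly as quoted in this post, with the job name immediately after the message ID. Real log lines usually carry extra prefixes (timestamps, system names, job IDs), which the search tolerates but does not parse. The function name and the sample lines are made up for illustration.

```python
import re

# Message text as quoted in the post; the layout of the rest of the line is an assumption.
HASP311 = re.compile(
    r"\$HASP311\s+(?P<jobname>\S+)\s+RE-QUEUED AT END OF MEMORY AND HELD"
)


def requeued_jobs(lines):
    """Return the job names from every line that contains the $HASP311 message."""
    return [m.group("jobname") for line in lines if (m := HASP311.search(line))]


# Made-up sample lines; only the $HASP311 wording comes from the thread.
sample = [
    "$HASP311 DB2A7417 RE-QUEUED AT END OF MEMORY AND HELD",
    "some unrelated console line",
]

print(requeued_jobs(sample))  # ['DB2A7417']
```

A job found this way could then be displayed, released ($AJ), or cancelled ($PJ) by the operator, as IBM's reply elsewhere in this thread describes.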
Re: $HASP311 jobname re-queued at end of memory and held
So APAR OA21984 states you would need JOURNAL=YES on JOB. Without a complete understanding of your JES2 INIT deck and your initiators, I am not sure we can answer that question. You might open an SR with IBM to get in-depth knowledge for your question. I am not sure that IBM provided full disclosure on this condition.

To me the message indicates that you had a job that ran into an end of memory condition that did not allow JES2 to complete it. So JES2 placed it in the HELD queue based on your JES2 environment. You might try the JES2 Init and Tuning Guide/Reference for the parms related to restart, like JOURNAL.

Lizette