Re: $HASP311 jobname re-queued at end of memory and held

2015-03-24 Thread Shane Ginnane
On Tue, 24 Mar 2015 09:02:49 +0100, Peter Hunkeler wrote:

So, the action taken at next warmstart will be determined by the setting of 
JOURNAL and RESTART on JOBCLASS. However, you could decide to release the job 
($AJ) or cancel it ($PJ) immediately.


... And an additional hint:

The job is requeued back for execution (per the $HASP311 message) since 
JOURNALing is active. You can confirm this by issuing a $DJOBCLASS(y),LONG 
where y is the respective jobclass of DB2A7417 and I would expect you find 
JOURNAL=YES. This is working as expected.

Thanks for the update Peter.
Sounds like a perfect opportunity for JES2 support to get organised and maybe 
set up a Health Checker to accommodate this scenario.
Or do they really expect us to schedule warmstarts regularly just for them ... ?
Too mundane a situation to bother JES2MON I guess.

Shane ...

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: $HASP311 jobname re-queued at end of memory and held

2015-03-24 Thread Tom Marchant
On Tue, 24 Mar 2015 09:02:49 +0100, Peter Hunkeler wrote:

an End-of-Memory condition

The way I used to understand it is that a Memory was once a common 
term for an address space, and that the End of Memory message simply 
meant that the address space (the initiator) terminated, taking the job with it.

I'm not sure if my understanding is correct.

-- 
Tom Marchant



Re: $HASP311 jobname re-queued at end of memory and held

2015-03-24 Thread Anthony Thompson
Note also the DIAGxx parameter VSM CHECKREGIONLOSS(above,below) which looks at 
an initiator address space at the end of each job and automatically restarts 
the initiator if a loss of virtual storage exceeding the specified values is 
detected. This parameter is intended to avoid or minimize job errors caused by 
initiator storage fragmentation or storage creep. I think 
CHECKREGIONLOSS was introduced in z/OS 1.11? Not sure.

Ant.



Re: $HASP311 jobname re-queued at end of memory and held

2015-03-24 Thread Anthony Thompson
Oh... of course the proper syntax is VSM CHECKREGIONLOSS(below,above) not 
(above,below), if anyone cares.  
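
For anyone wanting to reason about the check itself, here is a minimal Python 
sketch of the decision as I understand it. It is not IBM's implementation. The 
names (RegionLossThresholds, initiator_needs_restart) are made up for 
illustration, and I am assuming the restart is triggered when either the 
"below" or the "above" loss exceeds its threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionLossThresholds:
    """Hypothetical holder for the two CHECKREGIONLOSS values, in bytes."""
    below: int  # tolerated loss in the region below the 16 MB line
    above: int  # tolerated loss in the region above the 16 MB line


def initiator_needs_restart(loss_below: int, loss_above: int,
                            limits: RegionLossThresholds) -> bool:
    """Return True if the storage lost since the initiator started exceeds
    either threshold (assumption: one exceeded value is enough)."""
    return loss_below > limits.below or loss_above > limits.above


# Example thresholds: 256 KB below the line, 10 MB above (illustrative only).
limits = RegionLossThresholds(below=256 * 1024, above=10 * 1024 * 1024)
print(initiator_needs_restart(300 * 1024, 0, limits))
print(initiator_needs_restart(100 * 1024, 1024 * 1024, limits))
```

The first call reports a loss below the line larger than the threshold, so the 
sketch asks for a restart. The second stays under both limits.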



Re: $HASP311 jobname re-queued at end of memory and held

2015-03-24 Thread Peter Hunkeler
Cross-posted to IBM-MAIN and JES2-L

A while ago I was seeking help regarding jobs being re-queued in an EOM 
situation. I did not get the desired insight here, so we asked IBM for details.

We got the explanation below, which IBM support found only in internal 
documents, not in any FM. I thought I'd post it here so it will be available 
in the archives.

Q: We have a DB2 batch job that suffers several S878-10 abends until finally 
the initiator fails completely. At the end of that, JES2 message $HASP311 
appears and the job sits in the hold queue from then on. Why is the job 
re-queued and where is that documented?


Answer from IBM support (19/02/15 13:25): $HASP311 with the associated message 
text is always issued by JES2 in response to a request by the Initiator to 
requeue the job that has gone through EOM processing. This is normal behavior 
and is not new and not a defect.

From a JES2 perspective this has always worked this way. It is the Initiator 
that is requesting the requeue of the job AND if there was an EOM involved 
before job completion $HASP311 is issued with the corresponding text.


... and a bit more and better explanation:
There are a few reasons why a job could be marked for re-execution, but the 
most common cause is that the initiator (in which the job executes), AND NOT 
THE JOB ITSELF, experiences an error, like an End-of-Memory condition, which 
appears to be your case (the 40D indicates an out-of-storage condition during 
RTM processing, usually following ABEND80A or ABEND878).


In that case, we turn on a flag in the control block structure to indicate that 
the job is eligible for restart (this is the same flag that would be set if 
the $EJOB command were used to restart a job). JES2 requeues the job for 
warmstart and puts it in operator hold to prevent the job from failing 
recursively.


During warmstart processing, JES2 will decide whether this job needs to be 
automatically restarted based on the settings of the JOURNAL and RESTART options.

The JES2 Init and Tuning Guide has more details about this under topic 1.13.2.6 
titled 'Warm start considerations'.

Here is an excerpt:
...If a job in execution was journaled, it is updated to indicate warmstart, 
and the job is queued for re-execution. If a job in execution has no journal, 
it is tested to determine whether restart was indicated for the job.
If restart was indicated, the job is updated to remove any warmstart 
indications, and the job is queued for re-execution. If restart was not 
indicated, the job is queued for output processing.
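
Read as a decision procedure, the excerpt above boils down to two tests. The 
following Python sketch only restates the quoted text. The function and enum 
names are invented for illustration and do not correspond to any JES2 
control block or exit.

```python
from enum import Enum


class Disposition(Enum):
    RE_EXECUTION = "queued for re-execution"
    OUTPUT = "queued for output processing"


def warm_start_disposition(journaled: bool, restart_indicated: bool) -> Disposition:
    """Mirror the manual excerpt: a journaled job is always re-queued for
    execution; a non-journaled job is re-queued only if restart was indicated;
    anything else goes to output processing."""
    if journaled:
        return Disposition.RE_EXECUTION
    if restart_indicated:
        return Disposition.RE_EXECUTION
    return Disposition.OUTPUT


# Walk through all four combinations of the two settings.
for journaled in (True, False):
    for restart in (True, False):
        result = warm_start_disposition(journaled, restart)
        print(f"journaled={journaled} restart={restart}: {result.value}")
```

Only the combination "no journal, restart not indicated" ends in output 
processing; every other combination leads to re-execution.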


So, the action taken at next warmstart will be determined by the setting of 
JOURNAL and RESTART on JOBCLASS. However, you could decide to release the job 
($AJ) or cancel it ($PJ) immediately.


... And an additional hint:

The job is requeued back for execution (per the $HASP311 message) since 
JOURNALing is active. You can confirm this by issuing a $DJOBCLASS(y),LONG 
where y is the respective jobclass of DB2A7417 and I would expect you find 
JOURNAL=YES. This is working as expected.
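
For automation, the three commands mentioned above can be built as plain 
strings. This is a small sketch, assuming the job is addressed by its JES2 job 
number (the $AJnnn form, like the $HJnnn mentioned elsewhere in this thread). 
The helper names and the job number 12345 are hypothetical. Note that $PJ, as 
far as I know, purges the job together with its output, so treat it with care.

```python
def release_job(job_number: int) -> str:
    """Build the JES2 command that releases a held job ($AJnnn)."""
    return f"$AJ{job_number}"


def purge_job(job_number: int) -> str:
    """Build the JES2 command that cancels and purges a job ($PJnnn)."""
    return f"$PJ{job_number}"


def display_jobclass(jobclass: str) -> str:
    """Build the display command used to check JOURNAL= and RESTART=."""
    return f"$DJOBCLASS({jobclass}),LONG"


# Hypothetical job number and job class, for illustration only.
print(display_jobclass("A"))
print(release_job(12345))
print(purge_job(12345))
```

The sketch only formats the command text. How the command is issued (console, 
SDSF, an automation product) is left open.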






Re: $HASP311 jobname re-queued at end of memory and held

2015-02-21 Thread Steve Horein
I believe I've seen this kind of behavior when the initiator the job was
running in ABENDed. That may be something to look at.
To address that symptom, I put some automation around $PI/$SI once a week.
That was circa the OS/390 2.10 time frame.
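
A weekly recycle of that kind might look like the sketch below. It is not the 
automation actually used back then. The function name and the initiator range 
are assumptions, and the scheduling itself is left out. $PI drains the 
initiators once their current job ends and $SI starts them again.

```python
from typing import List


def recycle_commands(first: int, last: int) -> List[str]:
    """Return the JES2 commands to drain and then restart a range of
    initiators, e.g. $PI1-5 followed by $SI1-5."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    # A single initiator is written without a range, e.g. $PI3.
    rng = str(first) if first == last else f"{first}-{last}"
    return [f"$PI{rng}", f"$SI{rng}"]


# Illustrative range; real initiator numbers depend on the installation.
for command in recycle_commands(1, 5):
    print(command)
```

An initiator does not drain until its running job has ended, so real 
automation would have to wait for that before issuing the $SI.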



$HASP311 jobname re-queued at end of memory and held

2015-02-21 Thread Peter Hunkeler
Crossposted. I had posted this on the JES2 list but got no insight so far. 
Hopefully someone here can shed some light on this.


I stumbled across a job that had ended and was in status HELD in the JES2 
execution queue. At first I thought someone must have issued a $HJnnn while 
the job was running.


I then saw that the job had had some problems (878-10 abends), which finally 
led to the job and the initiator being terminated at end of memory (initiated 
by DB2; the step ends with S0F4 reason F30905).

Now the interesting thing, which I didn't know (and don't understand either) and 
for which I have not yet been able to find any documentation:

JES2 writes message
$HASP311 jobname RE-QUEUED AT END OF MEMORY AND HELD

Well, I've found the doc on message $HASP311, but it doesn't give much detail.
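
To catch such jobs without stumbling across them, the message text can be 
matched in a log. This Python sketch assumes the line contains the message 
exactly as quoted above. Real SYSLOG lines carry additional prefix fields, 
which the search() call tolerates. The function name is made up.

```python
import re
from typing import Optional

# Matches the message id, the job name, and the fixed message text.
HASP311 = re.compile(
    r"\$HASP311\s+(?P<jobname>\S+)\s+RE-QUEUED AT END OF MEMORY AND HELD"
)


def requeued_jobname(line: str) -> Optional[str]:
    """Return the job name from a $HASP311 line, or None if the line
    is some other message."""
    match = HASP311.search(line)
    return match.group("jobname") if match else None


print(requeued_jobname("$HASP311 DB2A7417 RE-QUEUED AT END OF MEMORY AND HELD"))
print(requeued_jobname("$HASP395 DB2A7417 ENDED"))
```

The first call yields the job name, the second yields None because the line 
is a different message.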


Questions I have:
- When are jobs eligible for this kind of JES2 automatic restart?
- What factors go into this decision?
- I am not sure all jobs are restartable per se.
- What would have happened to the job at the next JES2 warm start?
- How should jobs in this state be handled correctly?
- Can this be configured?


Can someone shed some light on this and/or point me to the document where I can 
read how this works?







Re: $HASP311 jobname re-queued at end of memory and held

2015-02-21 Thread Lizette Koehler
So APAR OA21984 states you would need JOURNAL=YES on JOB.

Without a complete understanding of your JES2 init deck and your initiators, I
am not sure we can answer that question.

You might open an SR with IBM to get in-depth knowledge for your question.

I am not sure that IBM provided full disclosure on this condition.

To me the message indicates that you had a job that ran into an end of
memory condition that did not allow JES2 to complete it.  So JES2 placed it
in the HELD queue based on your JES2 environment.

Might try the JES2 INIT and TUNING Guide/Reference on the parms related to
restart, like Journal.

Lizette

