Re: CEEDUMP possible following 'new' failure

2016-10-23 Thread Charles Mills
Lots of questions ...

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Bill Woodger
Sent: Sunday, October 23, 2016 10:08 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

> Do you need to reacquire the storage, or does the LE dump routine hang around 
> for the second-time-through?

My logic is such that I terminate (neatly, with CSA and exit cleanup) at that 
point so I never have a second time through. One would have to experiment.

> Would it be possible to load the LE dump routine instead of doing the initial 
> GETMAIN? And your own routine?

That occurred to me but I did not try it. That would potentially be a better 
solution because one would not be guessing at the amount of storage required. 
I'm a commercial product developer, not a z/OS researcher, and so "works 
perfectly" is good enough for me. Also, this is just an interim solution until 
Mr. R beats the LE people into putting their stuff into LPA. There is no "my 
routine" -- my recovery code is all resident in the main load module.

> Do you have issues with something else looking for storage while you are 
> processing the first one?

Nope. There quite intentionally are no new's, GETMAINs or malloc()'s in the 
recovery code. And very quickly it starts doing delete's like there was no 
tomorrow (which there isn't, at least not for that instance).
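
A minimal sketch of what allocation-free recovery code of this kind can look like, assuming a fixed static message buffer and a hypothetical table of owned buffers to delete. None of these names come from the product described here, and the code makes no explicit allocation of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Static storage: it exists before the failure, so formatting the message
// needs no new, malloc() or GETMAIN at recovery time.
static char g_recovery_msg[128];

// Hypothetical fixed-size table of buffers the recovery path must delete.
struct OwnedBuffers {
    static const std::size_t kMax = 8;
    char* slots[kMax];
    std::size_t count;
};

// Format a message into the static buffer and delete every owned buffer.
// Returns the number of buffers released.
inline std::size_t recover(OwnedBuffers& owned, const char* where) {
    std::snprintf(g_recovery_msg, sizeof g_recovery_msg,
                  "allocation failed in %s; releasing %lu buffers",
                  where, static_cast<unsigned long>(owned.count));
    std::size_t released = 0;
    for (std::size_t i = 0; i < owned.count; ++i) {
        delete[] owned.slots[i];   // only releases, never acquires
        owned.slots[i] = nullptr;
        ++released;
    }
    owned.count = 0;
    return released;
}
```

The point of the design is that everything the recovery path touches was set up before the failure, so the path itself only gives storage back.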

> Is the dump output in the dataset where the four blank lines were, and are 
> they still there, or perhaps some "artefact" of failure?

I did not think to look explicitly for the four blank lines but the dump looked 
"perfect" at first glance. It has the traceback which is the main thing I was 
interested in. Perhaps the four blank lines are "between sections." Before, the 
sections were missing so the blank lines looked out of place. Now the sections 
are there, so the blank lines look appropriate.

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: CEEDUMP possible following 'new' failure

2016-10-23 Thread Bill Woodger
Do you need to reacquire the storage, or does the LE dump routine hang around 
for the second-time-through?

Would it be possible to load the LE dump routine instead of doing the initial 
GETMAIN? And your own routine?

Do you have issues with something else looking for storage while you are 
processing the first one?

Is the dump output in the dataset where the four blank lines were, and are they 
still there, or perhaps some "artefact" of failure?



Re: CEEDUMP possible following 'new' failure

2016-10-23 Thread Charles Mills
I know this is a slightly aged thread but I wanted to share an interim
remediation with anyone who has this problem. I was able to resolve the
problem by doing a GETMAIN (vanilla SP=0) at startup and freeing the area
just before attempting to invoke the LE dump. I started out with 60Ki bytes
and that resolved the problem. I did not experiment to see if a somewhat
smaller area would do the trick.
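
The same reserve-and-release idea can be sketched in portable C++. This is only an analogue: it uses malloc()/free() where the post describes a GETMAIN and a later free of the area, the class name EmergencyReserve is hypothetical, and the 60 KiB figure is simply the size reported above, not a verified minimum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: holds a block of storage from startup and gives it
// back just before a storage-hungry diagnostic routine is invoked.
class EmergencyReserve {
public:
    explicit EmergencyReserve(std::size_t bytes)
        : block_(std::malloc(bytes)), bytes_(block_ ? bytes : 0) {}

    // Non-copyable: exactly one owner may free the block.
    EmergencyReserve(const EmergencyReserve&) = delete;
    EmergencyReserve& operator=(const EmergencyReserve&) = delete;

    ~EmergencyReserve() { release(); }

    // Free the reserve so the next request for storage (for example, a
    // dump routine being loaded) can use it. Safe to call more than once.
    void release() {
        std::free(block_);
        block_ = nullptr;
        bytes_ = 0;
    }

    bool held() const { return block_ != nullptr; }
    std::size_t size() const { return bytes_; }

private:
    void* block_;
    std::size_t bytes_;
};

// Size taken from the post above (60 KiB); smaller values were not tested.
constexpr std::size_t kReserveBytes = 60 * 1024;
```

In use, the object would be constructed once at startup and release() called in the recovery path immediately before the dump call. Whether the freed storage is the storage the loader then obtains depends on the platform's storage management.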

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jim Mulder
Sent: Thursday, October 06, 2016 4:33 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

  From some internal discussion after this issue was raised today, our
intention is that LE will move the CEEDUMP modules to SCEELPA in the next
release of z/OS. 

Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 



Re: CEEDUMP possible following 'new' failure

2016-10-11 Thread Denis
Hi Peter,
 
yes I am aware of that, but we also have hang conditions, where the jobs are 
not out of memory and it might help in those situations.
Many of these situations occur when UNIX System Services are involved.
 
Thanks.
 
 
-Original Message-
From: Peter Relson <rel...@us.ibm.com>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Tue, Oct 11, 2016 1:44 pm
Subject: Re: CEEDUMP possible following 'new' failure

The "callrtm command" will do no better than anything else that requires 
private storage of the address space to run. It is nothing more than a 
targeted cancel.

Out of private storage is out of private storage.

Peter Relson
z/OS Core Technology Design




Re: CEEDUMP possible following 'new' failure

2016-10-11 Thread Peter Relson
The "callrtm command" will do no better than anything else that requires 
private storage of the address space to run. It is nothing more than a 
targeted cancel.

Out of private storage is out of private storage.

Peter Relson
z/OS Core Technology Design




Re: CEEDUMP possible following 'new' failure

2016-10-10 Thread Denis
Hi Barbara,
 
no, we did not have Compuware in this environment. 
But the callrtm command seems to be something we need to look at. Thanks for that.
 
Denis.
 
 
-Original Message-
From: Barbara Nitz <nitz-...@gmx.net>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Mon, Oct 10, 2016 9:22 am
Subject: Re: CEEDUMP possible following 'new' failure

>I cannot remember exactly, but what happened was that in IMS the STOP REGION 
>command was issued and the address space was not listed anymore in IMS 
>(Display active showed it was gone).
>It was visible in JES but nothing could be done about it, it did neither 
>accept cancel nor force.

Do you by any chance run Compuware Xpediter or something under IMS? A few jobs 
ago, we had the same problem with our IMS regions, and it turned out that 
Compuware code was hindering termination. (I took a dump and looked.)

In some cases I was successful by using the (unsupported) callrtm program that 
is now a (supported) operator command. I specified the asid and the bottommost 
tcb and ran that program a number of times. In about 70% of the cases it 
succeeded in terminating the hung region; in the remaining 30% they needed a 
complete IMS restart (to enable the user whose id was blocked) to work again.

Barbara



Re: CEEDUMP possible following 'new' failure

2016-10-10 Thread Barbara Nitz
>I cannot remember exactly, but what happened was that in IMS the STOP REGION 
>command was issued and the address space was not listed anymore in IMS 
>(Display active showed it was gone).
>It was visible in JES but nothing could be done about it, it did neither 
>accept cancel nor force.

Do you by any chance run Compuware Xpediter or something under IMS? A few jobs 
ago, we had the same problem with our IMS regions, and it turned out that 
Compuware code was hindering termination. (I took a dump and looked.)

In some cases I was successful by using the (unsupported) callrtm program that 
is now a (supported) operator command. I specified the asid and the bottommost 
tcb and ran that program a number of times. In about 70% of the cases it 
succeeded in terminating the hung region; in the remaining 30% they needed a 
complete IMS restart (to enable the user whose id was blocked) to work again.

Barbara



Re: CEEDUMP possible following 'new' failure

2016-10-09 Thread Peter Relson

> It was visible in JES but nothing could be done about it, it did neither 
> accept cancel nor force.
>
> Fault Analyzer showed that the last thing that happened in the address 
> space was trying to load some z/OS routines for termination (if it was not 
> memory termination then it must have been task termination) and failed to 
> load those routines because of an out of storage condition.


You might consider the possibility that JES2 or Fault Analyzer is in error 
if this is true. If force is not accepted then the address space no longer 
exists as far as the system is concerned.
What exactly do you mean by "accept...force"? It is true that if memory 
termination fails because of some system error an address space could land 
in limbo. But that would not have anything to do with insufficient memory 
in the address space. And if the master address space (where memterm runs) 
has insufficient storage then some authorized program has screwed things 
up and you'll likely need to re-IPL (and find/fix that program).


> If task termination, which is an operating system function, requires 
> storage in an address space with no storage left, it should ensure that 
> there is always enough room for task termination.


Unfortunately in the general case this is provably impossible. Just as it 
is provably impossible in the general case to detect infinite looping.

I would guess that a huge percentage of customers have chosen a prudent 
approach with their user exits to prevent the vast majority of these 
cases, by limiting regions size somewhat, not allowing someone to allocate 
as user-region storage all available storage (whether below or above 16M 
or even above 2G).

Peter Relson
z/OS Core Technology Design




Re: CEEDUMP possible following 'new' failure

2016-10-09 Thread Barry Merrill
I have this note from a few years ago that concurs:
 
- Early SAS notes recommended REGION=0M, but with early V9.1, only
  for diagnostics AFTER an out of memory condition, a specific REGION
  was recommended in http://support.sas.com/kb/18401:
"The thought is that if we have exhausted the full region and
 abnormal termination occurs as a result there is not sufficient
 ceiling within the address space to properly clean up. This can
 lead to damaged libraries, potential hanging and looping within
 the address space.

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Peter Hunkeler
Sent: Sunday, October 9, 2016 9:44 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: AW: Re: CEEDUMP possible following 'new' failure

>Fault Analyzer showed that the last thing that happened in the address
>space was trying to load some z/OS routines for termination (if it was not
>memory termination then it must have been task termination) and failed to
>load those routines because of an out of storage condition. 




I have in mind that, at least in earlier times, it was recommended *not* to
code REGION=0M? With REGION=0M the code can eat up all storage in the
address space. Of course, authorized software may easily  overcome any
REGION setting (I guess).


--
Peter Hunkeler





AW: Re: CEEDUMP possible following 'new' failure

2016-10-09 Thread Peter Hunkeler
>Fault Analyzer showed that the last thing that happened in the address space 
>was trying to load some z/OS routines for termination (if it was not memory 
>termination then it must have been task termination) and failed to load those 
>routines because of an out of storage condition.




I have in mind that, at least in earlier times, it was recommended *not* to 
code REGION=0M? With REGION=0M the code can eat up all storage in the 
address space. Of course, authorized software may easily  overcome any REGION 
setting (I guess).


--
Peter Hunkeler





Re: CEEDUMP possible following 'new' failure

2016-10-08 Thread Denis
Hi Jim,
 
I cannot remember exactly, but what happened was that in IMS the STOP REGION 
command was issued and the address space was not listed anymore in IMS (Display 
active showed it was gone).
It was visible in JES but nothing could be done about it, it did neither accept 
cancel nor force.
 
Fault Analyzer showed that the last thing that happened in the address space 
was trying to load some z/OS routines for termination (if it was not memory 
termination then it must have been task termination) and failed to load those 
routines because of an out of storage condition.
 
So the expectation of everyone for this situation is, task termination should 
be possible regardless if there was an IEFUSI reserving the 512k below or not.
If task termination, which is an operating system function, requires storage in 
an address space with no storage left, it should ensure that there is always 
enough room for task termination.

Thanks.
 
 
-Original Message-
From: Jim Mulder <d10j...@us.ibm.com>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Fri, Oct 7, 2016 8:39 pm
Subject: Re: CEEDUMP possible following 'new' failure

> this reminds me of some hanging IMS jobs that could neither be 
> cancelled nor forced because the routines for memterm could not be 
> loaded because of memory exhausted. Only BMC Tooling allowed to get 
> rid of them.
> The suggestion in the PMR was to code an IEFUSI to reserve 512k 
> below to allow memterm to happen in any case.
> 
> Could you please raise another internal discussion why IEFUSI has to
> be coded at all in order to allow memterm to happen?
> Why can't z/OS just ensure that there is always enough storage 
> available in the address space for memterm?

  Since memterm does not access the storage of the address 
being terminated, there is no connection between IEFUSI and memterm.
There is no requirement for any available storage in the address
space being memtermed.  Task termination, yes. 
Memory termination, no. 


Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 
Poughkeepsie NY






Re: CEEDUMP possible following 'new' failure

2016-10-07 Thread Greg Dyck

On 10/7/2016 1:38 PM, Jim Mulder wrote:

>   Since memterm does not access the storage of the address
> being terminated, there is no connection between IEFUSI and memterm.
> There is no requirement for any available storage in the address
> space being memtermed.  Task termination, yes.
> Memory termination, no.


I think there have been cases where an address space determined it was in 
trouble and wanted to MEMTERM itself, but was unable to issue a CALLRTM 
MEMTERM for *itself* due to storage being exhausted.  But, as Jim said, 
a FORCE command runs in *MASTER* and does not need any storage in the 
address space.


Greg



Re: CEEDUMP possible following 'new' failure

2016-10-07 Thread Jim Mulder
> this reminds me of some hanging IMS jobs that could neither be 
> cancelled nor forced because the routines for memterm could not be 
> loaded because of memory exhausted. Only BMC Tooling allowed to get 
> rid of them.
> The suggestion in the PMR was to code an IEFUSI to reserve 512k 
> below to allow memterm to happen in any case.
> 
> Could you please raise another internal discussion why IEFUSI has to
> be coded at all in order to allow memterm to happen?
> Why can't z/OS just ensure that there is always enough storage 
> available in the address space for memterm?

  Since memterm does not access the storage of the address 
being terminated, there is no connection between IEFUSI and memterm.
There is no requirement for any available storage in the address
space being memtermed.  Task termination, yes. 
Memory termination, no. 


Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 
Poughkeepsie NY






Re: CEEDUMP possible following 'new' failure

2016-10-07 Thread Denis
Hi,
 
this reminds me of some hanging IMS jobs that could neither be cancelled nor 
forced because the routines for memterm could not be loaded because of memory 
exhausted. Only BMC Tooling allowed to get rid of them.
The suggestion in the PMR was to code an IEFUSI to reserve 512k below to allow 
memterm to happen in any case.
 
Could you please raise another internal discussion why IEFUSI has to be coded 
at all in order to allow memterm to happen?
Why can't z/OS just ensure that there is always enough storage available in the 
address space for memterm?
 
Thanks.
 
 
-Original Message-
From: Jim Mulder <d10j...@us.ibm.com>
To: IBM-MAIN <IBM-MAIN@LISTSERV.UA.EDU>
Sent: Thu, Oct 6, 2016 10:33 pm
Subject: Re: CEEDUMP possible following 'new' failure

From some internal discussion after this issue was raised today,
our intention is that LE will move the CEEDUMP modules to SCEELPA 
in the next release of z/OS. 

Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 
Poughkeepsie NY

> 
> So, when  will CEE.SCEELPA be z/OS standard? :) 


AW: Re: CEEDUMP possible following 'new' failure

2016-10-07 Thread Peter Hunkeler

> From some internal discussion after this issue was raised today,
> our intention is that LE will move the CEEDUMP modules to SCEELPA
> in the next release of z/OS.




I hadn't looked it up or cared so far, but now that you mention it, I'm 
astonished those modules are not currently part of LE's LPA library. Could you 
please have another internal discussion :-) and provide a list of the LE 
recovery and dump modules which are not currently in LPA but should be?


--
Peter Hunkeler





Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Charles Mills
Awesome! Thanks,

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jim Mulder
Sent: Thursday, October 06, 2016 1:33 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

  From some internal discussion after this issue was raised today, our
intention is that LE will move the CEEDUMP modules to SCEELPA in the next
release of z/OS. 



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Jim Mulder
  From some internal discussion after this issue was raised today,
our intention is that LE will move the CEEDUMP modules to SCEELPA 
in the next release of z/OS. 

Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 
Poughkeepsie NY

> 
> So, when  will CEE.SCEELPA be z/OS standard? :) 
> 


Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Gibney, Dave
So, when  will CEE.SCEELPA be z/OS standard? :) 

> -Original Message-
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU]
> On Behalf Of Jim Mulder
> Sent: Thursday, October 06, 2016 10:48 AM
> To: IBM-MAIN@LISTSERV.UA.EDU
> Subject: Re: CEEDUMP possible following 'new' failure
> 
> > The remaining problem is that I am not getting any diagnostic
> > information, in other words, exactly *which* new failed -- which will
> > of course make any bug of this sort in the field hard to find. I call
> > CEEDUMP to get a call trace and it produces an *empty* four-line
> > dataset. On the console I get
> >
> > IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> > INSUFFICIENT STORAGE WAS AVAILABLE.
> > CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24,
> > REASON CODE 26080021, DDNAME *LNKLST*
> 
>   I would suggest putting the CEEDUMP-related modules in LPA.  Our
> intention in z/OS is that modules involved in the production of
> SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP
> should be in LPA, so that they don't need to get loaded into exhausted
> REGION-constrained storage while trying to take a dump of
> REGION-constrained storage exhaustion.  (And I say "our intention"
> because we do sometimes find cases where we did not do what we
> intended).
> 
> Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp.
> Poughkeepsie NY


Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Charles Mills
Ah! Most excellent. Thank you.

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Jim Mulder
Sent: Thursday, October 06, 2016 10:48 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

> The remaining problem is that I am not getting any diagnostic
> information, in other words, exactly *which* new failed -- which will of
> course make any bug of this sort in the field hard to find. I call CEEDUMP
> to get a call trace and it produces an *empty* four-line dataset. On the
> console I get
> 
> IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> INSUFFICIENT STORAGE WAS AVAILABLE.
> CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24,
> REASON CODE 26080021, DDNAME *LNKLST*

  I would suggest putting the CEEDUMP-related modules in LPA.  Our intention
in z/OS is that modules involved in the production of
SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP
should be in LPA, so that they don't need to get loaded into exhausted
REGION-constrained storage while trying to take a dump of REGION-constrained
storage exhaustion.  (And I say "our intention"
because we do sometimes find cases where we did not do what we intended). 



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Bill Woodger
The reserve seems to be used as the new stack segment, and anything else can 
still gobble it up. Gets a U4008 with 1004 not a 1024 apparently. A larger 
reserve may help if you still have things acquiring storage. 


But then you didn't get a U4008.

Does the production of an LE Dump acquire storage? I'd suspect so.

Interesting here, you said POSIX(ON), but I have no idea what you meant by 
"dubbed"?

"When an error occurs that would cause a CEEDUMP to be taken, and this is a 
POSIX application, Language Environment writes this dump to the current 
directory. Output from CEE3DMP is written to one of the following (in top-down 
order):

The directory found in _CEE_DMPTARG, if found.
The current working directory, if this is not the root (/), and if the 
directory is writable, and if the dump pathname (made up of the cwd pathname 
plus the dump file name) does not exceed 1024 characters.
The directory found in environment variable TMPDIR (if the temporary 
directory is not /TMP).
/TMP"
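
The search order quoted above can be written out as a small decision function. This is only an illustration of the quoted order, not Language Environment's implementation. The function name, its parameters, the path separator counted in the length check, and the lowercase /tmp fallback (the quoted text writes /TMP) are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Inputs the quoted rules depend on. Empty strings mean "not set".
struct DumpDirInputs {
    std::string cee_dmptarg;     // value of _CEE_DMPTARG
    std::string cwd;             // current working directory
    bool cwd_writable;           // whether cwd can be written to
    std::string tmpdir;          // value of TMPDIR
    std::string dump_file_name;  // name of the dump file itself
};

// Return the directory chosen by the top-down order in the quoted text.
inline std::string choose_dump_dir(const DumpDirInputs& in) {
    const std::size_t kMaxPath = 1024;     // limit stated in the quote
    const std::string kFallback = "/tmp";  // assumed spelling of /TMP

    // 1. _CEE_DMPTARG, if found.
    if (!in.cee_dmptarg.empty()) return in.cee_dmptarg;

    // 2. cwd, if it is not the root, is writable, and the full dump
    //    pathname (cwd + "/" + file name, separator assumed) fits.
    if (!in.cwd.empty() && in.cwd != "/" && in.cwd_writable &&
        in.cwd.size() + 1 + in.dump_file_name.size() <= kMaxPath) {
        return in.cwd;
    }

    // 3. TMPDIR, if set and different from the fallback directory.
    if (!in.tmpdir.empty() && in.tmpdir != kFallback) return in.tmpdir;

    // 4. The fallback directory.
    return kFallback;
}
```

For example, with no _CEE_DMPTARG, a writable working directory of /u/charles and a short file name, the function returns /u/charles; change the working directory to / and it falls through to TMPDIR.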

You are probably looking in the right place, it started producing, and died.

Sliding off-topic, is "writable" an American word?

Following the CSV message, it seems a pretty internal thing, unless you have 
preceding messages, which I guess you don't.



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Jim Mulder
> The remaining problem is that I am not getting any diagnostic 
> information, in other words, exactly *which* new failed -- which will of 
> course make any bug of this sort in the field hard to find. I call CEEDUMP 
> to get a call trace and it produces an *empty* four-line dataset. On the 
> console I get
> 
> IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF* FAILED BECAUSE
> INSUFFICIENT STORAGE WAS AVAILABLE.
> CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, 
> REASON CODE 26080021, DDNAME *LNKLST*

  I would suggest putting the CEEDUMP-related modules in LPA.  Our 
intention in z/OS is that modules involved in the production of 
SYSABEND/SYSUDUMP/SYSMDUMP/IEATDUMP/SDUMP 
should be in LPA, so that they don't need to get loaded into exhausted
REGION-constrained storage while trying to take a dump of 
REGION-constrained storage exhaustion.  (And I say "our intention"
because we do sometimes find cases where we did not do what we 
intended). 

Jim Mulder z/OS Diagnosis, Design, Development, Test  IBM Corp. 
Poughkeepsie NY





Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Charles Mills
No, no, I am not trying to improve a program that is using too much memory.
I am trying to write recovery code that will help diagnose future allocation
failures, whatever their cause.

LE provides leak analysis tools and I use them.

I know where the storage is going: 

static const unsigned int size_to_allocate = 5000;
for ( int i = 0; i < INT_MAX; i++ ) char *foo = new char[size_to_allocate];
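
For illustration, a loop of this kind can be exercised without actually exhausting the region by injecting the allocator. The sketch below assumes standard-conforming behavior, where a failed new throws std::bad_alloc (which, per this thread, XL C++ provides only with LANGLVL(NEWEXCP)). CappedAllocator and allocate_until_failure are hypothetical names, not part of the code under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical test allocator: succeeds `limit` times, then throws
// std::bad_alloc the way a standard-conforming failed new would.
struct CappedAllocator {
    std::size_t limit;
    std::size_t calls;
    char* operator()(std::size_t bytes) {
        if (calls >= limit) throw std::bad_alloc();
        ++calls;
        return new char[bytes];
    }
};

// Allocate fixed-size blocks until the allocator fails, then clean up.
// Returns how many blocks were obtained before the failure.
template <typename Alloc>
std::size_t allocate_until_failure(Alloc& alloc, std::size_t block_size,
                                   std::size_t max_blocks) {
    std::vector<char*> blocks;
    blocks.reserve(max_blocks);  // so push_back cannot allocate in the loop
    std::size_t obtained = 0;
    try {
        for (std::size_t i = 0; i < max_blocks; ++i) {
            blocks.push_back(alloc(block_size));
            ++obtained;
        }
    } catch (const std::bad_alloc&) {
        // Recovery path: nothing is allocated here, only released below.
    }
    for (char* p : blocks) delete[] p;
    return obtained;
}
```

Calling allocate_until_failure with a CappedAllocator whose limit is 3 returns 3, which makes the failure point deterministic for testing the recovery path.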

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Bernd Oppolzer
Sent: Thursday, October 06, 2016 9:34 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

Some suggestions:

- try REPORT (LE option) to see where the storage is used (below, above,
User heap, LE below- or anyheap) and how much storage is used before you get
in trouble; does it depend on the amount of input data? REPORT will also
show if you can do any better by playing with the LE options.

- if it's not an easy case (storage acquired below, but should be above /
any), there will be some processing which acquires storage and does not
release it; you will have to identify that processing and repair it.

- There are some tools on the market which allow you to find program parts
which do such things (for example: CEL4MCHK, that is: LE memory checking
heap manager)

- I wrote a routine that takes some sort of snapshot of the heap at a
certain point in time, and then again later, and then it compares the two
snapshots and shows the areas that have been acquired in the meantime and
not freed. I was very successful in finding memory leaks with this tool at
my current customer's 
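
The snapshot-and-compare approach in the last item can be sketched as follows. This is not the routine described in the post, whose internals are not given. It assumes a hypothetical tracker that the program calls on every acquire and release, and it identifies blocks by address only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical tracker: records each live block's address and size.
class HeapTracker {
public:
    typedef std::map<const void*, std::size_t> Snapshot;

    void on_acquire(const void* p, std::size_t bytes) { live_[p] = bytes; }
    void on_release(const void* p) { live_.erase(p); }

    // Copy of the current set of live blocks.
    Snapshot snapshot() const { return live_; }

    // Blocks present in `later` but not in `earlier`: acquired in the
    // meantime and not yet freed, so they are leak candidates.
    static std::vector<const void*> acquired_since(const Snapshot& earlier,
                                                   const Snapshot& later) {
        std::vector<const void*> result;
        for (Snapshot::const_iterator it = later.begin();
             it != later.end(); ++it) {
            if (earlier.find(it->first) == earlier.end()) {
                result.push_back(it->first);
            }
        }
        return result;
    }

private:
    Snapshot live_;
};
```

Because blocks are compared by address, a block that is freed and then reallocated at the same address between the two snapshots would not be reported. A fuller tool would also need an allocation sequence number or call-site information.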



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Charles Mills
Mmm. I hear you. May be the right answer. I am kind of in love with the
"character" CEE3DMP because it is easy for customers to send and not screw
up. Easy to view without uploading to the mainframe and firing up IPCS.
Seems to point quickly to about 99 out of 100 problems.

I do call some terminate-type function. CEE3DMP() is not "fatal" -- it
returns to the caller.

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Don Poitras
Sent: Thursday, October 06, 2016 8:57 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

In article <01a301d21fe6$1dbcd760$59368620$@mcn.org> you wrote:
> I have been wrestling with the issue of recovery from a failure of 'new'
> (kind of like a GETMAIN for those of you who are not C people; just
> like malloc() for those of you who are C but not C++ people) in XLC/LE
> C++ code.

> (Yes, I know, the right answer is "don't do too many 'new's" but this 
> is error recovery code. Stuff happens, or IBM would not have invented
> ESTAE. "More region size" is not the answer -- this is an intentional
> test of storage exhaustion. More storage would just make it slower to
> fail.)

> First I got past the non-standard behavior of XLC in that rather than 
> blowing up per the standard, XLC just returns NULL (0) for a failed 
> new. Got the LANGLVL(NEWEXCP) in place.

> So I am catching 'new' exceptions. The next problem was that it was 
> impossible to do anything meaningful after catching the 'new' 
> exception because storage was exhausted. So I read somewhere that one 
> had to specify #pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some 
> storage for the storage exhaustion case. I specified 48K just to be on 
> the safe side. This made things better: I go through my error 
> recovery, clean things up, and end gracefully.

> The remaining problem is that I am not getting any diagnostic 
> information, in other words, exactly *which* new failed -- which will 
> of course make any bug of this sort in the field hard to find. I call 
> CEEDUMP to get a call trace and it produces an *empty* four-line 
> dataset. On the console I get

> IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF*FAILED BECAUSE
> INSUFFICIENT STORAGE WAS AVAILABLE.
> CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, 
> REASON CODE 26080021, DDNAME *LNKLST*

> Any suggestions?

> Charles

Don't call CEEDUMP. Call abort() or something like:

{int *ptr; ptr = 0; *ptr = 1;}

Add another runopts:
#pragma runopts(TERMTHDACT(UADUMP))

and make sure you have a SYSMDUMP DD with a dataset to use later with IPCS.

IP VERBX LEDATA 'ceedump nthreads(*)'

will give you tracebacks for all threads including the one that caused the
abend.

--
Don Poitras - SAS Development  -  SAS Institute Inc. - SAS Campus Drive
sas...@sas.com   (919) 531-5637   Cary, NC 27513



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Charles Mills
STC, so sort of batch. Not UNIX command line, but POSIX(ON) and dubbed.

> almost anything else asking for more memory could fail, couldn't it?

Yes, in my model of how this works, LE GETMAINs a bunch of storage for heap. 
When I do a new, it sub-allocates from that heap. If the heap is exhausted LE 
does another GETMAIN. Lather, rinse, repeat. When storage is gone, it's gone, 
pretty much no matter who you are.
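
The model described above can be written down as a toy accounting sketch in C++. It is only an illustration of that mental model, not of what LE really does: `ToyHeap`, the region size and the chunk size are all made-up names and numbers, and no real storage is obtained.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: 'region' stands in for the storage GETMAIN can draw on,
// 'chunk' for one heap increment. Nothing here is real LE behaviour.
class ToyHeap {
public:
    ToyHeap(std::size_t region_bytes, std::size_t chunk_bytes)
        : region_left_(region_bytes), chunk_bytes_(chunk_bytes), chunk_left_(0) {
        assert(chunk_bytes > 0);
    }

    // Sub-allocate from the current chunk; obtain another chunk when it
    // runs out. Returns false once the region itself is exhausted.
    bool allocate(std::size_t bytes) {
        if (bytes > chunk_bytes_) return false;    // toy model: no oversized requests
        if (bytes > chunk_left_) {
            if (region_left_ < chunk_bytes_) return false;  // storage is gone
            region_left_ -= chunk_bytes_;          // the "GETMAIN"
            chunk_left_ = chunk_bytes_;            // rest of the old chunk is abandoned
        }
        chunk_left_ -= bytes;
        return true;
    }

    std::size_t region_left() const { return region_left_; }

private:
    std::size_t region_left_;   // storage not yet handed to the heap
    std::size_t chunk_bytes_;   // size of each heap increment
    std::size_t chunk_left_;    // unused bytes in the current increment
};
```

In this model a request can fail even though a few bytes remain in the region, because the region can no longer supply a whole increment.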

I'm not sure *exactly* what the reserved storage is used for. What is the 
"trigger" or "authorization" for a particular request to be able to be 
satisfied from the reserved storage? Is that where the failed load is 
attempting to get its storage from? I am going to guess not, so reserving a 
megabyte won't make any difference. Maybe.

And yes, exactly, I meant CEE3DMP().

Charles

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of Bill Woodger
Sent: Thursday, October 06, 2016 8:59 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: CEEDUMP possible following 'new' failure

Non-batch, I assume. Whilst your "news" are sucking up memory, almost anything 
else asking for more memory could fail, couldn't it? Not just one of yours?



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Bernd Oppolzer

Some suggestions:

- Try REPORT (LE option) to see where the storage is used (below, above,
user heap, LE below heap or anyheap) and how much storage is used before
you get into trouble. Does it depend on the amount of input data? REPORT
will also show whether you can do any better by playing with the LE options.

- If it's not an easy case (storage acquired below, but should be above /
any), there will be some processing which acquires storage and does not
release it; you will have to identify that processing and repair it.

- There are some tools on the market which allow you to find program parts
which do such things (for example: CEL4MCHK, that is: LE memory checking
heap manager).

- I wrote a routine that takes some sort of snapshot of the heap at a
certain point in time, and then again later, and then it compares the two
snapshots and shows the areas that have been acquired in the meantime and
not freed. I was very successful in finding memory leaks with this tool at
my current customer's site. (The culprit was a PL/1 program, BTW, but it
could have been C++, too ... the tool finds them all.)
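
The snapshot-and-compare idea in the last item can be sketched in C++ as follows. This is not the routine described above, which inspects the heap itself; it is a simplified stand-in in which the program reports each acquire and release to a hypothetical `HeapLedger`, and every name in it is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical bookkeeping: the program reports each acquire/release here.
class HeapLedger {
public:
    typedef std::map<const void*, std::size_t> Snapshot;   // address -> size

    void on_acquire(const void* p, std::size_t size) {
        assert(p != nullptr);
        live_[p] = size;
    }
    void on_release(const void* p) { live_.erase(p); }

    // A snapshot is simply a copy of the areas that are live right now.
    Snapshot snapshot() const { return live_; }

private:
    Snapshot live_;
};

// Areas that are live in 'later' but were not live in 'earlier', i.e.
// acquired in the meantime and not freed. Limitation of this sketch: an
// address that was freed and then reused between the snapshots is missed.
std::vector<const void*> acquired_and_not_freed(
        const HeapLedger::Snapshot& earlier,
        const HeapLedger::Snapshot& later) {
    std::vector<const void*> result;
    for (HeapLedger::Snapshot::const_iterator it = later.begin();
         it != later.end(); ++it) {
        if (earlier.find(it->first) == earlier.end()) {
            result.push_back(it->first);
        }
    }
    return result;
}
```

Taking one snapshot before a unit of work and one after it, then listing the difference, points at the areas that the unit of work left behind.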


If you would like to know more about this, contact me offline.

Kind regards

Bernd




Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Farley, Peter x23353
Sounds to me like 48K is not enough to allow the CEEDUMP process to complete.  
48K seems unreasonably low to me, I would make it 128K or even more to permit 
the CEEDUMP process to run.  I would research the CEEDUMP documentation to see 
if the minimum storage requirements for it to succeed are published and if not 
open a PMR with IBM to ask for it.

Also, is that "storage exhaustion reserve" above the line or below the line?  
It might make a difference where it is reserved.

HTH

Peter



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Bill Woodger
Non-batch, I assume. Whilst your "news" are sucking up memory, almost anything 
else asking for more memory could fail, couldn't it? Not just one of yours?

Do you mean CEE3DMP? CEEDUMP is just for setting the options for an LE dump.



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Don Poitras
Don't call CEEDUMP. Call abort() or something like:

{int *ptr; ptr = 0; *ptr = 1;}

Add another runopts:
#pragma runopts(TERMTHDACT(UADUMP))

and make sure you have a SYSMDUMP DD with a dataset to use later with
IPCS.

IP VERBX LEDATA 'ceedump nthreads(*)'

will give you tracebacks for all threads including the one that caused
the abend.
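
A related standard C++ technique is to install a new-handler that calls abort(), so that the process ends while the failing call chain is still on the stack. The sketch below is an illustration and not part of the suggestion above: the function names are made up, and whether the XL C++ runtime invokes the handler in every 'new' configuration is an assumption that would need checking.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Invoked by the standard operator new when it cannot obtain storage.
// It must not allocate; it ends the process so that the dump is taken
// at the point of failure.
void abort_on_exhaustion() {
    std::abort();
}

// Installs the handler and returns the one that was in effect before,
// so that the caller can restore it later.
std::new_handler install_abort_on_exhaustion() {
    std::new_handler previous = std::set_new_handler(abort_on_exhaustion);
    assert(std::get_new_handler() == abort_on_exhaustion);  // C++11 sanity check
    return previous;
}
```

Calling `install_abort_on_exhaustion()` once at start-up is enough; the returned handler only matters if the original behaviour has to be restored.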

-- 
Don Poitras - SAS Development  -  SAS Institute Inc. - SAS Campus Drive
sas...@sas.com   (919) 531-5637   Cary, NC 27513



Re: CEEDUMP possible following 'new' failure

2016-10-06 Thread Pan, Zhicheng
This looks very much like the equivalent of an ABEND 80A in assembler. The
below-the-line region space has been exhausted, most likely because too much
below-the-line storage is used during the recovery process, so no more
modules can be loaded.

James



CEEDUMP possible following 'new' failure

2016-10-06 Thread Charles Mills
I have been wrestling with the issue of recovery from a failure of 'new'
(kind of like a GETMAIN for those of you who are not C people; just like
malloc() for those of you who are C but not C++ people) in XLC/LE C++ code.

(Yes, I know, the right answer is "don't do too many 'new's" but this is
error recovery code. Stuff happens, or IBM would not have invented ESTAE.
"More region size" is not the answer -- this is an intentional test of
storage exhaustion. More storage would just make it slower to fail.)

First I got past the non-standard behavior of XLC in that rather than
blowing up per the standard, XLC just returns NULL (0) for a failed new. Got
the LANGLVL(NEWEXCP) in place.
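
For readers less familiar with the two behaviours, the difference can be sketched in portable C++11. This is not XL C/C++-specific code: `Record`, its `fail_next` switch and the class-level `operator new` functions are made up here purely to force a failure on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type whose allocations can be made to fail on demand.
struct Record {
    static bool fail_next;   // test switch, not part of any real API
    int payload;

    // Throwing form: what the C++ standard requires of a failed 'new'.
    static void* operator new(std::size_t size) {
        if (fail_next) throw std::bad_alloc();
        return ::operator new(size);
    }
    // Non-throwing form: reports failure by returning a null pointer.
    static void* operator new(std::size_t size, const std::nothrow_t&) noexcept {
        if (fail_next) return nullptr;
        return ::operator new(size, std::nothrow);
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
    static void operator delete(void* p, const std::nothrow_t&) noexcept {
        ::operator delete(p);
    }
};
bool Record::fail_next = false;

// Returns true if the allocation succeeded; failure arrives as an exception.
bool try_throwing_new() {
    try {
        Record* r = new Record;
        assert(r != nullptr);   // a throwing 'new' never yields null
        delete r;
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}

// Returns true if the allocation succeeded; failure arrives as a null pointer.
bool try_nothrow_new() {
    Record* r = new (std::nothrow) Record;
    if (r == nullptr) return false;
    delete r;
    return true;
}
```

Code that has to survive under either behaviour needs both the null check and the catch block.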

So I am catching 'new' exceptions. The next problem was that it was
impossible to do anything meaningful after catching the 'new' exception
because storage was exhausted. So I read somewhere that one had to specify
#pragma RUNOPTS( STORAGE(,,,32K) ) to reserve some storage for the storage
exhaustion case. I specified 48K just to be on the safe side. This made
things better: I go through my error recovery, clean things up, and end
gracefully.
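
A portable analogue of keeping something in reserve is to set a block aside at start-up and give it back when 'new' fails, so that cleanup has some room to work in. The sketch below shows only that general idea and says nothing about how the LE reserve actually works; `EmergencyReserve` and `run_with_reserve` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical helper: holds a block obtained up front and returns it to
// the heap on request, so that recovery code has storage to work with.
class EmergencyReserve {
public:
    explicit EmergencyReserve(std::size_t bytes) : block_(std::malloc(bytes)) {
        assert(bytes > 0);
    }
    ~EmergencyReserve() { release(); }

    EmergencyReserve(const EmergencyReserve&) = delete;
    EmergencyReserve& operator=(const EmergencyReserve&) = delete;

    // True while the reserve is still being held.
    bool held() const { return block_ != nullptr; }

    // Return the reserve to the heap. Safe to call more than once.
    void release() {
        std::free(block_);
        block_ = nullptr;
    }

private:
    void* block_;
};

// Runs 'work'. If it throws std::bad_alloc, the reserve is released first,
// then 'cleanup' runs, and false is returned.
template <typename Work, typename Cleanup>
bool run_with_reserve(EmergencyReserve& reserve, Work work, Cleanup cleanup) {
    try {
        work();
        return true;
    } catch (const std::bad_alloc&) {
        reserve.release();
        cleanup();
        return false;
    }
}
```

The cleanup callable should still avoid allocating where it can, since the released block may be too small or too fragmented to satisfy a given request.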

The remaining problem is that I am not getting any diagnostic information,
in other words, exactly *which* new failed -- which will of course make any
bug of this sort in the field hard to find. I call CEEDUMP to get a call
trace and it produces an *empty* four-line dataset. On the console I get

IEW4000I FETCH FOR MODULE CEEMENU3 FROM DDNAME *VLF*FAILED BECAUSE
INSUFFICIENT STORAGE WAS AVAILABLE.
CSV031I LIBRARY ACCESS FAILED FOR MODULE CEEMENU3, RETURN CODE 24, REASON
CODE 26080021, DDNAME *LNKLST*  
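
One way to learn which 'new' failed without loading anything at failure time is to note the call site in static storage just before each allocation. This is a sketch of that idea and not something tried in this case: `TRACKED_NEW`, `g_last_site` and `AlwaysExhausted` are hypothetical names, and the single global is not thread-safe as written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <new>

// Last allocation site, kept in static storage so that reading it during
// recovery needs no heap at all. One global: not thread-safe as written.
struct AllocSite {
    const char* file;
    int line;
};
static AllocSite g_last_site = { "(none)", 0 };

// Records the call site, then performs the allocation.
template <typename T>
T* tracked_new(const char* file, int line) {
    g_last_site.file = file;
    g_last_site.line = line;
    return new T();
}
#define TRACKED_NEW(T) tracked_new<T>(__FILE__, __LINE__)

// Formats the last recorded site into a caller-supplied buffer (no heap use).
int describe_last_site(char* buf, std::size_t len) {
    assert(buf != nullptr && len > 0);
    return std::snprintf(buf, len, "%s:%d", g_last_site.file, g_last_site.line);
}

// Stand-in type whose allocation always fails, for trying out the handler.
struct AlwaysExhausted {
    static void* operator new(std::size_t) { throw std::bad_alloc(); }
    static void operator delete(void*) noexcept {}
};
```

In the catch block for `std::bad_alloc`, `describe_last_site` can fill a stack buffer that is then written to a log that is already open.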

Any suggestions?

Charles 
