Thanks. No COBOL in this picture.

Charles


-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On
Behalf Of Peter Hunkeler
Sent: Saturday, August 26, 2017 1:31 PM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: AW: Why would LE not trap?

>What should I be looking for? What would effectively override TRAP(ON)? 
Would SDWACLUP ever be set on a vanilla S0C4? 
 
>Why my own ESTAE? So I can deal with unrecoverable ABENDs, which LE will
not. Why NOSPIE? So everything comes through the ESTAE. 


Dangerous path, you chose. Why do I say this? Because we have learnt the
hard way.
We're using a vendor product that chose to install its own ESTAEX upon the
init call from the application. Not only did it install its own ESTAE, it
also cancelled LE's ESPIE (which has the same effect as running with
TRAP(ON,NOSPIE)). It deregisters its ESTAE when the application calls the
termination function. In between, the application runs, and eventually calls
some vendor product interface, but also does all the normal programming
stuff under LE.
Installing your own ESTAE is not supported by LE! That's what the manual says.
It seemed to work, but actually there are situations where you get into
trouble with this setup. We learnt this step by step and from discussions I
had with IBM LE people. Finally, I succeeded in convincing the vendor to
change their product, and now they use the LE-supported way to get notified
about problems, namely the CEEEXTAN user exit.


Note that we're a COBOL shop, and COBOL allows operations that lose
significant digits in numbers. This causes trouble when the decimal
overflow program mask is set, which it is if C code is also part of the
application (implicitly or explicitly).


- If you run with TRAP(ON,NOSPIE), then your own ESTAE must recognize that
an S0CA ABEND from a COBOL statement is *not* a problem, and your code must
resume the COBOL code. Not easy, believe me.
In addition, this may become a total performance killer. Assume (and we
have seen such jobs) that a COBOL program has some code that causes decimal
overflow (losing significant digits). This is intentional, and proper COBOL
coding. Assume such an overflow happens thousands of times during the batch
run. Further, C code is also involved, so the decimal overflow program mask
bit is set.
The result: COBOL code causes thousands of decimal overflows (00A program
checks). There is no ESPIE, which could handle this with a short path length.
The program check handler takes a snapshot of the system trace table in
anticipation that a dump might be taken, then it percolates to RTM, which
invokes ESTAE routines. Even if the ESTAE knows how to handle this COBOL
decimal overflow, it takes endless time to take the snapshot of the system
trace, depending on the trace table size. We have 15MB per processor, and
this leads to an elapsed time of 0.2 to 0.5 seconds per snapshot!!
A nightmare, the application just never completes.
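
To put a rough number on that, here is a back-of-the-envelope sketch. The
0.2 to 0.5 seconds per snapshot is the figure given above; the overflow count
of 10,000 is purely an assumed stand-in for "thousands of times", and the
function name is made up for illustration.

```python
# Back-of-the-envelope estimate only; the overflow count is an assumed figure.
def snapshot_overhead_seconds(overflows: int, seconds_per_snapshot: float) -> float:
    """Elapsed time spent purely on system trace snapshots."""
    return overflows * seconds_per_snapshot

ASSUMED_OVERFLOWS = 10_000  # hypothetical stand-in for "thousands of times"

for per_snapshot in (0.2, 0.5):  # range reported in the text above
    total = snapshot_overhead_seconds(ASSUMED_OVERFLOWS, per_snapshot)
    print(f"{per_snapshot} s per snapshot -> about {total / 60:.0f} minutes")
```

Under that assumption the job spends somewhere between half an hour and well
over an hour doing nothing but trace snapshots, before any real work or ESTAE
processing is counted.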


- If you run with TRAP(ON,SPIE), which you can't inhibit, because
PARM='/TRAP(ON,SPIE)' would override your #pragma runopts(TRAP(ON,NOSPIE)),
then LE's ESPIE will get control for a program check, and LE may decide this
is an unrecoverable error. According to the LE lab people, LE's error handler
may choose not to percolate to (what it thinks is) its own ESTAE, but to
cancel the ESTAE and terminate. However, LE, not knowing there is another
ESTAE in front of its own, cancels your product's ESTAE. So, your product will
not get notified about the error. LE's ESTAE, which is still in place, will
gain control and terminate the application.
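
The ordering problem in that second case can be pictured with a toy model.
This is only a sketch: a plain list stands in for the task's recovery exits,
the function names are invented for illustration and are not z/OS services,
and it assumes that a cancel request removes the most recently established
exit.

```python
# Toy model only: a list stands in for the task's stack of recovery exits.
# All names here are illustrative; none of them is a real z/OS service.
recovery_stack = []

def establish(owner):
    """Add a recovery exit on top of the stack."""
    recovery_stack.append(owner)

def cancel_most_recent():
    """Remove the newest exit (assumed cancel behaviour) and return its owner."""
    return recovery_stack.pop()

establish("LE ESTAE")       # set up by LE during initialization
establish("product ESTAE")  # added later by the product's init call

# LE believes the newest exit is its own and cancels it before terminating.
cancelled = cancel_most_recent()
gets_control = recovery_stack[-1]

print("cancelled:", cancelled)
print("gets control:", gets_control)
```

In this model the exit that disappears is the product's, and the one left to
get control is LE's, which matches what the LE lab people described.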


I understand this does not help much in finding the problem with that one
customer.
LE's documentation is not harsh enough in saying that your own ESTAEs and
ESPIEs can cause trouble. Of course, this is only if the product installs the
ESTAE/ESPIE at the beginning of the application, and cancels it at the end.
If you call some code that installs its own recovery, does its job,
deinstalls its own recovery, and returns, then that should be fine, and not
interfere with LE.


The vendor of our product has changed its code, and all those problems are
gone.


--
Peter Hunkeler  

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions, send email
to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
