Re: dump reading made simple??

David de Jongh Tue, 30 Jul 2013 07:50:06 -0700

This was "déjà vu all over again" for me.  We have an in-house abend
analysis routine driven by an LE handle condition routine, and also by the
CICS XPCTA exit.  I've been maintaining it for about 20 years now, through
multiple releases of CICS, COBOL and MVS/z/OS.  I learned assembler through
a 3-week part-time class in 1972, then learned how real programs were
written over the next couple of years in an assembler application
maintenance group.  After being shown how to read a SYSUDUMP, I was fixing
production problems at 2 a.m.  It's not rocket science, just a question of
getting some help at the start, and getting dumps to look at fairly
frequently.
David de Jongh

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:[email protected]]
On Behalf Of Bernd Oppolzer
Sent: Tuesday, July 30, 2013 3:16 AM
To: [email protected]
Subject: Re: dump reading made simple??

I'd like to enter this thread once again.

90 percent of the dumps - if not more - that we see at our site are dumps
from normal application code, that is, 0Cx etc. exceptions or other easily
resolvable errors in application code. Or the cause is the same, but the
error is detected somewhere further down, for example in an LE routine which
is called after a call to a PL/1 or C runtime routine. In that case the
caller of the runtime routine is normally the one to blame.

We automated dump reading as much as possible - making it almost unnecessary
most of the time - by providing an LE exit which runs in all our
environments and which, in case of an error, catches the error and provides
enough information from the save area back trace that the application
developer normally only has to look at that information and doesn't need to
refer to the SYSUDUMP that follows.
For example, we print every DSA for every procedure call, together with the
name of the function, the parameter address lists of every call, the
complete call hierarchy etc., the registers at every call level, the offset
of the call etc.
If the error is indeed in an LE routine below the application code, we
recognize this and go up to the application code and identify the error
position in the application code - same goes for DB2 errors, that is, when
the error position is in the routine that is handling the DB2 "SQLCODE not
handled" condition. And: if we find the name of the module which is the
cause of the error, we send an alarm mail to the department which is
responsible for the module - we get this information from a repository.
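
As a rough illustration of the two steps just described (go up past the
runtime routines to the application code, then look up who owns the module),
here is a minimal Python sketch. It is not the code of our exit. The frame
layout, the "CEE" prefix test for runtime modules, the module names and the
mail address are all assumptions made up for this example.

```python
# Hypothetical sketch: pick the application frame to blame and build an alert.
# Assumption: runtime modules can be recognized by a name prefix such as "CEE".
RUNTIME_PREFIXES = ("CEE",)


def find_application_frame(frames):
    """frames is a list of (module, offset) pairs, outermost caller first.

    Returns the innermost frame that is not a runtime routine, or None.
    """
    for module, offset in reversed(frames):
        if not module.startswith(RUNTIME_PREFIXES):
            return module, offset
    return None


def build_alert(frames, owners):
    """owners maps module name -> mail address (stand-in for the repository).

    Returns (recipient, text), or None if no owner can be determined.
    """
    hit = find_application_frame(frames)
    if hit is None:
        return None
    module, offset = hit
    recipient = owners.get(module)
    if recipient is None:
        return None
    return recipient, f"Abend in {module} at offset {offset:#x}"
```

With a call hierarchy MAINPGM -> PAYCALC -> CEEHDSP, the sketch skips the
runtime frame and addresses the alert to whoever the repository lists for
PAYCALC.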

The information provided this way is much easier to read for our people than
SYSUDUMP and even easier than CEEDUMP (it has more information, has a
somewhat better structure in our opinion, and - important for some of our
co-workers - it's in German).

Furthermore, we teach the developers how to cope with this.

This was necessary (we did it in 2005), because we had noticed some problems:

- the dumps looked different in the different environments (batch, test, DB
dialog aka IMS), but we wanted the same look and feel in every environment

- dump reading skills degraded

- we didn't want to buy an expensive tool and do the customizing in the
different environments; instead we wanted one of our own, where we could add
additional function (see above) in an easy way

From today's viewpoint, it looks like a success story.

Even in cases when the save area is destroyed (overwritten), the LE exit
does a very good job by providing at least what remains of the save area trace.
It tries to find the save areas first from the bottom (register 13), then
from above (TCBFSA), and in the normal case, the two chains fit together. If
not, there is a gap, and this gap is documented.
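
The idea of walking the chain from both ends can be sketched in a few lines
of Python. This is a toy model only, not the exit itself: "memory" is a
dictionary of save areas keyed by address, each with a back pointer (offset
+4 in the standard save area) and a forward pointer (offset +8), and the
sketch assumes both pointers are maintained by every routine. The function
names and the use of None as gap marker are made up for the example.

```python
# Toy model of a save area chain; addresses are plain integers.
def walk(memory, start, link, limit=1000):
    """Follow the chain from start via the given pointer field until the
    next address is not a known save area, repeats, or the limit is hit."""
    chain, seen, addr = [], set(), start
    while addr in memory and addr not in seen and len(chain) < limit:
        chain.append(addr)
        seen.add(addr)
        addr = memory[addr][link]
    return chain


def save_area_trace(memory, first_sa, r13):
    """Return (trace, gap). trace lists save areas from outermost caller to
    the failing routine; None inside trace marks a documented gap."""
    up = walk(memory, r13, "back")        # from the bottom (register 13)
    down = walk(memory, first_sa, "fwd")  # from above (first save area)
    bottom = list(reversed(up))           # put it in caller-to-callee order
    if bottom and bottom[0] in down:
        # The two partial chains fit together: splice them.
        i = down.index(bottom[0])
        return down[:i] + bottom, False
    # They do not meet: report both parts with a gap in between.
    return down + [None] + bottom, True
```

With four intact save areas the result is one complete chain; if two
adjacent save areas in the middle are overwritten, each walk stops at the
damage and the result contains the gap marker between the two parts.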

The save area trace and the back chain are very important for us, because at
our site we typically have many small modules calling each other and it is
not uncommon to see some 50 levels of calling hierarchy.

BTW: the method works regardless of the programming language; we have C,
PL/1 and ASSEMBLER (and, at a neighbor site, the exit also works with C++
functions - in fact the method to get the function name from the entry point
is the same for all LE languages, so I believe it will work for COBOL, too,
although there is no COBOL around).

Kind regards

Bernd

On 29.07.2013 23:33, T'Dell Sparks wrote:
> (First set of registers are usually the calling program) be careful
>
> That can be misleading, as RTM2 will put things in the dump as well.
> That makes IPCS indispensable when dealing with dumps. The PSW might be
> pointing to something else initially in the dump, so you have to scan down
> through the listing to find your program. That's why, as some have said
> here, mentorship is a good way to skip some other painful lessons. I still
> have to call on a 50-year veteran even after I've done this for over 30. I
> don't write wait/post code.
> Good point though.
>
>
