This was "déjà vu all over again" for me. We have an in-house abend analysis routine driven by an LE handle condition routine, and also by the CICS XPCTA exit. I've been maintaining it for about 20 years now, through multiple releases of CICS, COBOL and MVS/z/OS.

I learned assembler through a 3-week part-time class in 1972, then learned how real programs were written over the next couple of years in an assembler application maintenance group. After being shown how to read a SYSUDUMP, I was fixing production problems at 2 a.m. It's not rocket science, just a question of getting some help at the start, and getting dumps to look at fairly frequently.

David de Jongh

-----Original Message-----
From: IBM Mainframe Assembler List [mailto:[email protected]] On Behalf Of Bernd Oppolzer
Sent: Tuesday, July 30, 2013 3:16 AM
To: [email protected]
Subject: Re: dump reading made simple??

I'd like to enter this thread once again.

90 percent of the dumps - if not more - that we see at our site are dumps from normal application code, that is, 0Cx exceptions or other easily resolvable causes in application code. Or the cause is the same, but the error is detected further down, for example in an LE routine that runs after a call to a PL/1 or C runtime routine. In that case the caller of the runtime routine is normally to blame.

We automated dump reading as much as possible - making it unnecessary most of the time - by providing an LE exit which runs in all our environments. In case of an error, it catches the error and provides enough information from the save area back trace that the application developer normally only has to look at this information and doesn't need to refer to the SYSUDUMP that follows.

For example, we print the DSA for every procedure call, together with the name of the function, the parameter address list of every call, the complete call hierarchy, the registers at every call level, the offset of the call etc. If the error is indeed in an LE routine below the application code, we recognize this, go up to the application code and identify the error position there. The same goes for DB2 errors, that is, when the error position is in the routine that handles the DB2 "SQLCODE not handled" condition.

And: once we have found the name of the module which is the cause of the error, we send an alarm mail to the department which is responsible for the module - we get this information from a repository.
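
To make the "go up to the application code" step concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the exit described above: the back trace is modelled as a simple list with the failing frame first, runtime code is recognized by module-name prefix (CEE, IBM and DSN are used here as assumed prefixes for LE, the PL/1 runtime and DB2), and the repository is a plain dictionary. All module and department names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: runtime (non-application) modules are recognized by name prefix.
RUNTIME_PREFIXES = ("CEE", "IBM", "DSN")


@dataclass
class Frame:
    module: str  # name of the routine that owns this save area / DSA
    offset: int  # offset of the call (or failing instruction) in that routine


def is_runtime(frame: Frame) -> bool:
    """True if the frame belongs to runtime code rather than application code."""
    return frame.module.startswith(RUNTIME_PREFIXES)


def blame_frame(backtrace: List[Frame]) -> Optional[Frame]:
    """Return the innermost application frame.

    backtrace[0] is the frame where the error was detected; later entries
    are its callers, one level up per entry.
    """
    for frame in backtrace:
        if not is_runtime(frame):
            return frame
    return None  # nothing but runtime code in the trace


def responsible_department(frame: Optional[Frame],
                           repository: Dict[str, str]) -> Optional[str]:
    """Look up who should receive the alarm mail (hypothetical repository)."""
    if frame is None:
        return None
    return repository.get(frame.module)


if __name__ == "__main__":
    trace = [
        Frame("CEEHDSP", 0x1A2),  # error detected inside runtime code
        Frame("IBMRTN1", 0x040),  # hypothetical runtime routine
        Frame("PAYCALC", 0x3C6),  # hypothetical application module
        Frame("PAYMAIN", 0x120),
    ]
    culprit = blame_frame(trace)
    print(culprit, responsible_department(culprit, {"PAYCALC": "payroll-dev"}))
```

A prefix check is the simplest possible classifier; a real exit would more likely use load module or entry point information to decide what counts as application code.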

The information provided this way is much easier for our people to read than a SYSUDUMP, and even easier than a CEEDUMP (it has more information, has a somewhat better structure in our opinion, and - important for some of our co-workers - it's in German). Furthermore, we teach the developers how to work with it.

This was necessary (we did it in 2005) because we had noticed some problems:

- the dumps looked different in the different environments (batch, test, DB dialog aka IMS), but we wanted the same look and feel in every environment
- dump reading skills degraded
- we didn't want to buy an expensive tool and do the customizing in the different environments; instead we wanted one of our own, where we could add additional functions (see above) in an easy way

From today's viewpoint, it looks like a success story.

Even in cases where the save area chain is destroyed (overwritten), the LE exit does a very good job by providing at least the remains of the save area trace. It tries to find the save areas first from the bottom (register 13), then from above (TCBFSA), and in the normal case the two chains fit together. If not, there is a gap, and this gap is documented.

The save area trace and the back chain are very important for us, because at our site we typically have many small modules calling each other, and it is not uncommon to see some 50 levels of calling hierarchy.

BTW: the method works regardless of the programming language; we have C, PL/1 and ASSEMBLER (and, at a neighbor site, the exit also works with C++ functions - in fact the method to get the function name from the entry point is the same for all LE languages, so I believe it will work for COBOL, too, although we have no COBOL here).

Kind regards

Bernd

On 29.07.2013 23:33, T'Dell Sparks wrote:
> (First set of registers are usually the calling program) be careful
>
> That can be misleading, as RTM2 will put things in the dump as well. That makes IPCS indispensable when dealing with dumps.
> The PSW might initially be pointing to something else in the dump, so you have to scan down through the listing to find your program. That's why, as some have said here, mentorship is a good way to skip some painful lessons. I still have to call on a 50-year veteran even after I've done this for over 30. I don't write wait/post code.
> Good point though.
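
For readers who want to see the two-ended save area trace from Bernd's message in concrete form, the following Python sketch models the idea. It is a simplification under stated assumptions, not the actual exit: storage is a dictionary from save area address to a (back pointer, forward pointer) pair, a pointer that does not lead to a known save area counts as destroyed, and the names `walk` and `trace` are made up for this example.

```python
from typing import Dict, List, Optional, Tuple

# address -> (back pointer to caller's save area, forward pointer to callee's)
SaveAreas = Dict[int, Tuple[Optional[int], Optional[int]]]
BACK, FORWARD = 0, 1


def walk(storage: SaveAreas, start: Optional[int], direction: int,
         limit: int = 1000) -> List[int]:
    """Follow one pointer chain until it ends, leaves known storage, or loops."""
    chain: List[int] = []
    seen = set()
    addr = start
    while (addr is not None and addr in storage
           and addr not in seen and len(chain) < limit):
        chain.append(addr)
        seen.add(addr)
        addr = storage[addr][direction]
    return chain


def trace(storage: SaveAreas, r13: int, first_sa: int
          ) -> Tuple[List[int], Optional[Tuple[int, int]]]:
    """Return (save areas, outermost first; gap or None).

    The chain is walked from the bottom (r13) via back pointers first, then
    from the top (first_sa, standing in for TCBFSA) via forward pointers.
    gap is (last area reached from the top, last area reached from the bottom).
    """
    up = walk(storage, r13, BACK)               # innermost -> outward
    if first_sa in up:                          # bottom walk reached the top
        return list(reversed(up[:up.index(first_sa) + 1])), None
    down = walk(storage, first_sa, FORWARD)     # outermost -> inward
    for i, addr in enumerate(down):
        if addr in up:                          # the two chains fit together
            j = up.index(addr)
            return down[:i] + list(reversed(up[:j + 1])), None
    if not down or not up:                      # nothing usable at one end
        return down + list(reversed(up)), None
    return down + list(reversed(up)), (down[-1], up[-1])


if __name__ == "__main__":
    damaged = {
        100: (None, 200),
        200: (100, 0xBEEF),   # forward pointer overwritten
        400: (0xDEAD, 500),   # back pointer overwritten
        500: (400, None),
    }
    areas, gap = trace(damaged, r13=500, first_sa=100)
    print(areas, gap)
```

With the damaged storage above, neither walk can cross the overwritten pointers, so the result lists the save areas found from each end and reports the gap between 200 and 400, which is the "documented gap" case.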
