Hi Rob,

Indeed, I have REXX logic after the PIPE to restart when required (IF rc=313 then CP IPL CMS PARM AUTOCR). The PIPELINE has two parts. The first part is STARMSG, to send commands to the machine. The second part is the STARMON processing. It selects domains/records based on a parameter file (such as 04 0003 User activity records) and then writes them to disk. The output stream for STARMON is never severed except when the disk is full.
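The restart logic described above might look roughly like the following sketch. MONRUN EXEC and the MONPIPE stage are hypothetical names, not taken from the actual setup; only the return code check and the CP IPL command come from the description above.

```rexx
/* MONRUN EXEC (hypothetical name): run the monitor pipeline and    */
/* re-IPL the machine when the pipeline ends with return code 313   */
'PIPE monpipe'          /* MONPIPE REXX: hypothetical stage holding */
                        /* the STARMSG and STARMON parts            */
pipe_rc = rc            /* save the return code of the PIPE command */
if pipe_rc = 313 then
   'CP IPL CMS PARM AUTOCR'   /* restart, as in the logic above     */
exit pipe_rc            /* reached only when no re-IPL was done     */
```

This only helps when the PIPE command actually ends; if STARMON stops while the rest of the pipeline keeps running, control never returns to the check on the return code.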
Looking at the performance data, I can see the LPAR was at 100% CPU at this time, so the machine probably didn't get enough CPU to process the data in time. Indeed I did get the HCPMOV6274I message, but I didn't copy that line. In fact, that's the reason I coded the restart for when the PIPE ends with 313. As mentioned, I have seen a couple of times in the past that this stopped the entire PIPE, but now it looks like only STARMON was stopped. In this case I'm looking for a way to stop the PIPELINE when the STARMON stage stops collecting data. Maybe I can indeed rewrite the logic using the GATE stage.

Met vriendelijke groet/With kind regards/Mit freundlichen Grüßen,
Berry van Sleeuwen
Flight Forum 3000
5657 EW Eindhoven

-----Original Message-----
From: CMSTSO Pipelines Discussion List <[email protected]> On Behalf Of Rob van der Heij
Sent: Friday, November 08, 2019 9:20 AM
To: [email protected]
Subject: Re: [CMS-PIPELINES] Trap error in stage STARMON

On Fri, 8 Nov 2019 at 01:54, van Sleeuwen, Berry <[email protected]> wrote:

> Hi All,
>
> In the past when STARMON didn't get all data in time it abended, and
> the entire pipeline with it, with an RC=313. I have coded the REXX
> exec to restart the MONWRITE machine when that happens.
>
> I now have the 313 error but the PIPELINE didn't abend. In fact, the
> PIPE is still alive but STARMON doesn't process any records after this event.
> Has this behavior changed in the newer PIPE version? I had been using
> the upstream runtime version in z/VM 5.4 and 6.3. Obviously in 6.4 and
> 7.1 I now use the IBM supplied version.
>
> FPLIUS313E IPRCODE Message was purged received on IUCV instruction.
> FPLMSG003I ... Issued from stage 1 of pipeline 5.
> FPLMSG001I ... Running "STARMON MONDCSS SHARED".
> FPLSMG313E IPRCODE 00000939 received on IUCV instruction.
> FPLMSG003I ... Issued from stage 1 of pipeline 5.
> FPLMSG001I ... Running "STARMON MONDCSS SHARED".
>
> What would be the best way to handle the abend?
> Is there a way to end the PIPE so that the REXX code can restart the PIPELINE?
>

Berry,

I normally see this when the pipeline processing the monitor records has been held up long enough (like waiting on MORE ...). You may be able to avoid that with an ELASTIC after the STARMON stage (and the GATE to terminate it).

When STARMON terminates because the output was severed (or through the immediate command), there shouldn't be any such messages. This isn't an ABEND; it's just this particular pipeline stage terminating because things didn't go as planned. As long as the rest of that pipeline is properly done, it would also wind down and you end up in the REXX code after the PIPE or CALLPIPE that ran the pipe. You can check the return code and decide what to do. You can even write your own REXX wrapper around STARMON that simply restarts that pipeline on a return code 313, as long as the business logic doesn't care when some data was late or lost.

You probably also had CP messages indicating that you didn't finish consumption of the data in time. There are also CONFIG options to tell CP how long to keep the data for you, depending on the size of the MONDCSS and the configured size of the partitions in it. You could also look at the number of event records generated and see whether you want to disable a few domains that you don't need.

There are some things "in the pipeline" to teach STARMON to write the equivalent of MONWRITE data, which would let you also compress the data before writing to disk. Along with that, I envision controls that let you terminate STARMON after writing the complete block of sample records, to avoid incomplete data because STARMON was terminated by a severed output stream.

Sir Rob the Plumber

This e-mail and the documents attached are confidential and intended solely for the addressee; it may also be privileged. If you receive this e-mail in error, please notify the sender immediately and destroy it.
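Rob's idea of a REXX wrapper around STARMON that restarts it on return code 313 could be sketched as below. The file name STARMONR REXX is hypothetical and the sketch is untested. It assumes that losing the records between the failure and the restart is acceptable, and that STARMON can simply be started again within the same pipeline.

```rexx
/* STARMONR REXX (hypothetical name): rerun STARMON after RC 313    */
do forever
   /* Run STARMON and pass its records to this stage's output       */
   'CALLPIPE starmon MONDCSS SHARED | *:'
   if rc <> 313 then leave   /* any other return code ends the stage */
end
exit rc
```

The stage would then take the place of STARMON MONDCSS SHARED in the pipeline, optionally followed by the ELASTIC that Rob mentions. As written, the loop retries without limit; a retry counter may be wanted if a persistent 313 is possible.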
