It has indeed been a long time since we ran with no SFM policy. However, I believe that it's not SFM that accounts for the behavior I described. (IBM XCF please weigh in here.) If we did not also have a basic sysplex with no CF, I couldn't be so persistent in my view. VARY XCF OFF is not a system failure; it's an operator action as indicated by the system wait state that eventually gets loaded. Parallel and basic sysplex behave differently in this situation.
1. In a parallel sysplex, voluntary withdrawal is communicated immediately via CF links and structures to any surviving member(s). One of these members immediately undertakes sysplex cleanup to recover consoles and the JESplex. There is no SFM-interval delay because everyone else understands that a member has been brought down, not just failed to respond.

2. In a basic sysplex, systems communicate only via couple data sets and CTC links, both dependent on conventional I/O subsystem functions. I/O is notoriously flaky. Delays happen for all sorts of reasons that do not necessarily indicate system failure. With no I/O-independent mechanism to communicate system status, surviving member(s) must rely on explicit limits in the SFM policy, where the customer has specified the length of time to wait before declaring a missing member dead. Even after V XCF OFF, other members are more or less hung until the operator replies 'down' or the SFM wait interval expires. We don't IPL our basic sysplex very often, so we often forget to reply 'down' until other members begin complaining about hanging processes.

.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler
SHARE MVS Program Co-Manager
626-302-7535 Office
323-715-0595 Mobile
[email protected]

From: Mark Zelden <[email protected]>
To: [email protected],
Date: 06/16/2014 06:24 AM
Subject: Re: V xcf clarification
Sent by: IBM Mainframe Discussion List <[email protected]>

Skip, maybe it's been a long time since you have NOT had an SFM policy (I know I haven't operated in an environment without one in perhaps 15 years). It's SFM that prevents having to reply "DOWN" (message IXC102A). The SFM system isolation process also prevents the need to perform a SYSTEM RESET.
http://www-03.ibm.com/systems/z/advantages/pso/removing.html

Mark
--
Mark Zelden - Zelden Consulting Services - z/OS, OS/390 and MVS
ITIL v3 Foundation Certified
mailto:[email protected]
Mark's MVS Utilities: http://www.mzelden.com/mvsutil.html
Systems Programming expert at http://search390.techtarget.com/ateExperts/

On Fri, 13 Jun 2014 15:00:30 -0700, Skip Robinson <[email protected]> wrote:

>The virtue of issuing V XCF,OFF is that in a parallel sysplex, a system
>can be detected by others as down via CF coupling links. That's why you
>don't have to 'Reply down' on another system. XCF knows that the system
>has been shut down. No 'down' reply is necessary.
>
>
>From: Mark Zelden <[email protected]>
>To: [email protected],
>Date: 06/13/2014 01:37 PM
>Subject: Re: V xcf clarification
>Sent by: IBM Mainframe Discussion List <[email protected]>
>
>
>
>On Fri, 13 Jun 2014 20:23:15 +0000, Staller, Allan
><[email protected]> wrote:
>
>>If you look a little further into the process, there is a later reply
>>"reply down when system has been reset" (or something similar).
>>If the LPAR that issued the VARY XCF is in fact the one that has been
>>reset, you would have to find another console to actually reply "DOWN" from.
>>
>>Other than that, I do not believe it will make any difference in the end.
>
>You can VARY XCF,sysname,OFFLINE from the system you are taking out
>of the sysplex and not get those annoying messages if you put an
>SFM policy in place.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
