I did some testing to see what happens in the case of a CF failure, or a failure of a CF and one or more systems at the same time.

Our test parallel sysplex configuration:
4 systems (S1 - S4), z/OS 1.6
2 coupling facilities (CF1 & CF2), CFLEVEL 14
SFM policy with ISOLATETIME(0) CONNFAIL(NO)
GRS STAR configuration (ISGLOCK in CF2)

I was told that losing one or more systems and the ISGLOCK structure at the same time could bring the whole sysplex down if we use the SFM weight default of 1. Our ISGLOCK structure has the default REBUILDPERCENT (according to the documentation it should be 1, but message IXC360I displays it as "REBUILD PERCENT: N/A").

Test scenario 1:
- set the SFM weight for S3 to 40 (that was supposed to be the minimal weight for the important systems); the other systems have weight = 1
- deactivate LPARs S3, S4, and CF2 at the same time

Result:
- systems S1 and S2 partitioned S3 and S4 out of the sysplex and rebuilt ISGLOCK. Everything works as expected (except operlog, etc.)

After that I tried to build a scenario in which the ISGLOCK rebuild would not happen. I took the explanation from "Setting Up the Sysplex" (SA22-7625-09, because of APAR OA05860).

Test scenario 2:
- set the SFM weights for S2, S3, S4 to 9999, and for S1 to 1
- deactivate LPARs S2, S3, S4, and CF2 at the same time

Based on the explanation and the formula in the manual, I expected that system S1 would not be able to start a rebuild.

Results:
- system S1 partitioned S2 - S4 out of the sysplex and _then_ it rebuilt ISGLOCK. Everything looks normal (as in the results of scenario 1)

Now I have some questions:

1. Does a system / connector rebuild a structure that has the default rebuild percent after a loss of connectivity, no matter what SFM weights are defined?

2. To rephrase question 1: is the default rebuild percent the same as REBUILDPERCENT(1)?

3. What happens if I lose one or more systems and the primary couple data sets at the same time? Is partitioning of the lost system(s) still possible, or does the sysplex hang because the CDS switch to the alternate cannot be acknowledged by the failed systems?
That is, in what order is it done: first the partitioning of the failed system(s) and then the CDS switch, or the other way around? (Such a test is not easy to organize.)

I hope I have not made this mail too complicated.

Zaromil
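
P.S. To make my expectation for scenario 2 concrete, here is the arithmetic as I read the formula in the manual: the summed SFM weight of the systems that lost connectivity, divided by the summed weight of the systems that had an active connection to the structure, times 100, compared against REBUILDPERCENT. This is only a sketch in Python. The function names and the "counted" parameter are mine and have nothing to do with XCF itself, and which systems really count in the denominator is exactly the point I am unsure about.

```python
def lost_weight_percent(weights, lost, counted):
    """Share (in percent) of the counted SFM weight that lost connectivity."""
    total = sum(weights[name] for name in counted)
    lost_weight = sum(weights[name] for name in lost if name in counted)
    return 100.0 * lost_weight / total


def rebuild_expected(weights, lost, counted, rebuild_percent=1):
    """True if the lost share reaches the threshold.

    rebuild_percent=1 is an assumption: it is the documented default,
    but IXC360I shows "N/A" for our structure.
    """
    return lost_weight_percent(weights, lost, counted) >= rebuild_percent


# Scenario 1: S3 = 40, others = 1; S3, S4 and CF2 deactivated.
scenario1 = {"S1": 1, "S2": 1, "S3": 40, "S4": 1}
# Scenario 2: S2 - S4 = 9999, S1 = 1; S2, S3, S4 and CF2 deactivated.
scenario2 = {"S1": 1, "S2": 9999, "S3": 9999, "S4": 9999}

all_systems = {"S1", "S2", "S3", "S4"}

# Reading A (what I expected): the failed systems still count in the total.
print(rebuild_expected(scenario1, {"S1", "S2"}, all_systems))
print(rebuild_expected(scenario2, {"S1"}, all_systems))

# Reading B (hypothetical): only the systems left after partitioning count.
print(rebuild_expected(scenario1, {"S1", "S2"}, {"S1", "S2"}))
print(rebuild_expected(scenario2, {"S1"}, {"S1"}))
```

Under reading A, S1 alone stays far below 1% in scenario 2, so no rebuild would start. Under reading B, S1 holds 100% of the remaining weight, which would match what I saw, but I have not confirmed that this is what really happens. Scenario 1 leads to a rebuild under both readings, so it cannot tell them apart.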

