On Wed, Sep 14, 2011 at 5:25 PM, Junko IKEDA <tsukishima...@gmail.com> wrote:
> Hi,
>
> Pacemaker 1.1 shows the same behavior.

Which version did you check?
The latest from git seems to work fine:

Current cluster status:
Online: [ bl460g1n13 bl460g1n14 ]

 Resource Group: grpDRBD
     dummy01    (ocf::pacemaker:Dummy): Started bl460g1n13 FAILED
     dummy02    (ocf::pacemaker:Dummy): Started bl460g1n13
     dummy03    (ocf::pacemaker:Dummy): Started bl460g1n13
 Master/Slave Set: msDRBD [prmDRBD]
     Masters: [ bl460g1n13 ]
     Slaves: [ bl460g1n14 ]

Transition Summary:
crm_simulate[13781]: 2011/09/26_15:00:05 notice: LogActions: Recover dummy01 (Started bl460g1n13)
crm_simulate[13781]: 2011/09/26_15:00:05 notice: LogActions: Restart dummy02 (Started bl460g1n13)
crm_simulate[13781]: 2011/09/26_15:00:05 notice: LogActions: Restart dummy03 (Started bl460g1n13)
crm_simulate[13781]: 2011/09/26_15:00:05 notice: LogActions: Leave prmDRBD:0 (Master bl460g1n13)
crm_simulate[13781]: 2011/09/26_15:00:05 notice: LogActions: Leave prmDRBD:1 (Slave bl460g1n14)

Executing cluster transition:
 * Executing action 14: dummy03_stop_0 on bl460g1n13
 * Executing action 12: dummy02_stop_0 on bl460g1n13
 * Executing action 2: dummy01_stop_0 on bl460g1n13
 * Executing action 11: dummy01_start_0 on bl460g1n13
 * Executing action 1: dummy01_monitor_10000 on bl460g1n13
 * Executing action 13: dummy02_start_0 on bl460g1n13
 * Executing action 3: dummy02_monitor_10000 on bl460g1n13
 * Executing action 15: dummy03_start_0 on bl460g1n13
 * Executing action 4: dummy03_monitor_10000 on bl460g1n13

Revised cluster status:
Online: [ bl460g1n13 bl460g1n14 ]

 Resource Group: grpDRBD
     dummy01    (ocf::pacemaker:Dummy): Started bl460g1n13
     dummy02    (ocf::pacemaker:Dummy): Started bl460g1n13
     dummy03    (ocf::pacemaker:Dummy): Started bl460g1n13
 Master/Slave Set: msDRBD [prmDRBD]
     Masters: [ bl460g1n13 ]
     Slaves: [ bl460g1n14 ]

> It seems that the following changeset causes the problem.
>
> http://hg.clusterlabs.org/pacemaker/stable-1.0/diff/281c8c03a8c2/pengine/native.c
>
> I could get the expected behavior with the latest Pacemaker 1.0 after
> reverting the above change.
>
> Thanks,
> Junko
>
> 2011/9/13 Junko IKEDA <tsukishima...@gmail.com>:
>> Hi,
>>
>> I have the following resource setting;
>>
>> - msDRBD : Master/Slave(drbd)
>> - grpDRBD : group(including 3 Dummy)
>>
>> and location setting is here;
>>
>> location rsc_location-1 msDRBD \
>>     rule role=master 200: #uname eq bl460g1n13 \
>>     rule role=master 100: #uname eq bl460g1n14
>> colocation rsc_colocation-1 inf: grpDRBD msDRBD:Master
>> order rsc_order-1 0: msDRBD:promote grpDRBD:start
>>
>>
>> * Initial starting;
>> ============
>> Last updated: Tue Sep 13 22:09:17 2011
>> Stack: Heartbeat
>> Current DC: bl460g1n14 (22222222-2222-2222-2222-222222222222) - partition with quorum
>> Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
>> 2 Nodes configured, unknown expected votes
>> 2 Resources configured.
>> ============
>>
>> Online: [ bl460g1n13 bl460g1n14 ]
>>
>>  Resource Group: grpDRBD
>>      dummy01    (ocf::pacemaker:Dummy): Started bl460g1n13
>>      dummy02    (ocf::pacemaker:Dummy): Started bl460g1n13
>>      dummy03    (ocf::pacemaker:Dummy): Started bl460g1n13
>>  Master/Slave Set: msDRBD
>>      Masters: [ bl460g1n13 ]
>>      Slaves: [ bl460g1n14 ]
>>
>>
>> * break dummy01;
>> ============
>> Last updated: Tue Sep 13 22:09:44 2011
>> Stack: Heartbeat
>> Current DC: bl460g1n14 (22222222-2222-2222-2222-222222222222) - partition with quorum
>> Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
>> 2 Nodes configured, unknown expected votes
>> 2 Resources configured.
>> ============
>>
>> Online: [ bl460g1n13 bl460g1n14 ]
>>
>>  Resource Group: grpDRBD
>>      dummy01    (ocf::pacemaker:Dummy): Started bl460g1n13 FAILED
>>      dummy02    (ocf::pacemaker:Dummy): Started bl460g1n13
>>      dummy03    (ocf::pacemaker:Dummy): Stopped
>>  Master/Slave Set: msDRBD
>>      Masters: [ bl460g1n13 ]
>>      Slaves: [ bl460g1n14 ]
>>
>> Failed actions:
>>     dummy01_monitor_10000 (node=bl460g1n13, call=13, rc=7, status=complete): not running
>>
>>
>> * grpDRBD can't failover...
>> ============
>> Last updated: Tue Sep 13 22:09:48 2011
>> Stack: Heartbeat
>> Current DC: bl460g1n14 (22222222-2222-2222-2222-222222222222) - partition with quorum
>> Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
>> 2 Nodes configured, unknown expected votes
>> 2 Resources configured.
>> ============
>>
>> Online: [ bl460g1n13 bl460g1n14 ]
>>
>>  Master/Slave Set: msDRBD
>>      Masters: [ bl460g1n13 ]
>>      Slaves: [ bl460g1n14 ]
>>
>> Failed actions:
>>     dummy01_monitor_10000 (node=bl460g1n13, call=13, rc=7, status=complete): not running
>>
>>
>> Please see the attached hb_report.
>>
>> I tried reducing the number of primitive resources in the group from 3 to 2,
>> and grpDRBD can fail over in that case.
>>
>> If dummy02 or dummy03 breaks down instead of dummy01,
>> grpDRBD can fail over, too.
>>
>> A Master/Slave set combined with a group that has 3 or more resources won't fail over.
>>
>> Regards,
>> Junko IKEDA
>>
>>
>> NTT DATA INTELLILINK CORPORATION
>>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
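
For anyone comparing versions, the transition that crm_simulate printed in the reply above can be checked mechanically. What follows is only a rough Python sketch and not part of Pacemaker: `parse_actions` and `group_order_ok` are hypothetical helper names, and the regular expression assumes the "Executing action N: <rsc>_<op>_<interval> on <node>" line format seen in that output. It checks that the group members stop in reverse order and start in listed order, which is how a group restarts when its first member is recovered.

```python
import re

# Assumed line format, taken from the crm_simulate output quoted in this thread:
#   " * Executing action 14: dummy03_stop_0 on bl460g1n13"
ACTION_RE = re.compile(
    r"Executing action \d+: ([\w:]+?)_(start|stop|monitor)_\d+ on \S+"
)


def parse_actions(text):
    """Return (resource, operation) pairs in the order they appear in text."""
    return [(m.group(1), m.group(2)) for m in ACTION_RE.finditer(text)]


def group_order_ok(actions, members):
    """True if every member stops in reverse order and starts in listed order.

    This only covers the full-restart case, where all members are
    stopped and started again within one transition.
    """
    stops = [rsc for rsc, op in actions if op == "stop" and rsc in members]
    starts = [rsc for rsc, op in actions if op == "start" and rsc in members]
    return stops == list(reversed(members)) and starts == list(members)


# The "Executing cluster transition" section from the reply, copied as-is.
SAMPLE = """\
 * Executing action 14: dummy03_stop_0 on bl460g1n13
 * Executing action 12: dummy02_stop_0 on bl460g1n13
 * Executing action 2: dummy01_stop_0 on bl460g1n13
 * Executing action 11: dummy01_start_0 on bl460g1n13
 * Executing action 1: dummy01_monitor_10000 on bl460g1n13
 * Executing action 13: dummy02_start_0 on bl460g1n13
 * Executing action 3: dummy02_monitor_10000 on bl460g1n13
 * Executing action 15: dummy03_start_0 on bl460g1n13
 * Executing action 4: dummy03_monitor_10000 on bl460g1n13
"""

if __name__ == "__main__":
    group = ["dummy01", "dummy02", "dummy03"]
    print(group_order_ok(parse_actions(SAMPLE), group))  # prints True
```

A transition like the one in the original report, where the group is stopped and never started again, would make this check return False because the list of start actions would be empty.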