Re: [Linux-HA] Ignoring HA message (op=noop) from storage-1: not in our membership list (size=1)

Dejan Muhamedagic Mon, 29 Jun 2009 07:41:55 -0700

Hi,

On Fri, Jun 26, 2009 at 11:44:05AM +0200, artur.k wrote:
> Hi
> 
> I use heartbeat 2.1.4-7~bpo50+1 on Debian lenny (Xen domU) and
> I have a problem. If the network connection goes down and comes
> back up a few seconds later, heartbeat does not fail back :( From
> the log:
> 
> crmd[10201]: 2009/06/26_11:06:55 WARN: crmd_ha_msg_callback: Ignoring HA message (op=noop) from storage-1: not in our membership list (size=1)
> ccm[10196]: 2009/06/26_11:06:55 info: Break tie for 2 nodes cluster
> crmd[10201]: 2009/06/26_11:06:55 info: mem_handle_event: Got an event 
> OC_EV_MS_INVALID from ccm
> crmd[10201]: 2009/06/26_11:06:55 info: mem_handle_event: no mbr_track info
> crmd[10201]: 2009/06/26_11:06:55 info: mem_handle_event: Got an event 
> OC_EV_MS_NEW_MEMBERSHIP from ccm
> crmd[10201]: 2009/06/26_11:06:55 info: mem_handle_event: instance=1295, 
> nodes=1, new=0, lost=0, n_idx=0, new_idx=1, old_idx=3
> crmd[10201]: 2009/06/26_11:06:55 info: crmd_ccm_msg_callback: Quorum 
> (re)attained after event=NEW MEMBERSHIP (id=1295)
> crmd[10201]: 2009/06/26_11:06:55 info: ccm_event_detail: NEW MEMBERSHIP: 
> trans=1295, nodes=1, new=0, lost=0 n_idx=0, new_idx=1, old_idx=3
> crmd[10201]: 2009/06/26_11:06:55 info: ccm_event_detail:        CURRENT: 
> trac-storage-2 [nodeid=1, born=1295]
> cib[10197]: 2009/06/26_11:06:55 info: mem_handle_event: Got an event 
> OC_EV_MS_INVALID from ccm
> cib[10197]: 2009/06/26_11:06:55 info: mem_handle_event: no mbr_track info
> cib[10197]: 2009/06/26_11:06:55 info: mem_handle_event: Got an event 
> OC_EV_MS_NEW_MEMBERSHIP from ccm
> cib[10197]: 2009/06/26_11:06:55 info: mem_handle_event: instance=1295, 
> nodes=1, new=0, lost=0, n_idx=0, new_idx=1, old_idx=3
> cib[10197]: 2009/06/26_11:06:55 info: cib_ccm_msg_callback: PEER: 
> trac-storage-2
> ccm[10196]: 2009/06/26_11:06:56 info: Break tie for 2 nodes cluster
> cib[10197]: 2009/06/26_11:06:56 info: mem_handle_event: Got an event 
> OC_EV_MS_INVALID from ccm
> cib[10197]: 2009/06/26_11:06:56 info: mem_handle_event: no mbr_track info


The ha.cf is the more relevant file to check for this. If your
interface has intermittent problems that you can't do anything
about, you can increase the dead timeouts in ha.cf. The best
solution is to have reliable connections. If the problem is with
Xen, you can file a bug in their bugzilla. The network interface
really shouldn't be going up and down like a yo-yo.
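
As a rough sketch of what raising the timeouts could look like, here
are the timing directives in ha.cf. The values below are illustrative
assumptions only, not taken from the poster's configuration, and
should be tuned to how long the outages actually last:

```
# ha.cf timing directives (illustrative values, all in seconds)
keepalive 2     # interval between heartbeat packets
warntime 10     # log a "late heartbeat" warning after this long
deadtime 30     # declare the peer dead only after this much silence
initdead 60     # extra allowance at startup; usually at least 2x deadtime
```

With a larger deadtime, a link that drops for a few seconds should no
longer cause the peer to be declared dead, at the cost of slower
failover when a node really does fail.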

Thanks,

Dejan

> 
> my cib.xml :
> 
>  <cib generated="true" admin_epoch="0" have_quorum="true" ignore_dtd="false" 
> num_peers="2" cib_feature_revision="2.0" crm_feature_set="2.0" epoch="126" 
> num_updates="3" cib-last-written="Fri Jun 26 11:42:56 2009" 
> ccm_transition="2" dc_uuid="0c57668f-5a90-49bd-af4c-06987e8773a4">
>    <configuration>
>      <crm_config>
>        <cluster_property_set id="cib-bootstrap-options">
>          <attributes>
>            <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" 
> value="2.1.4-node: aa909246edb386137b986c5773344b98c6969999"/>
>          </attributes>
>        </cluster_property_set>
>      </crm_config>
>      <nodes>
>        <node id="fcd92b39-cc52-4392-9c8e-f316c34070e6" uname="storage-1" 
> type="normal"/>
>        <node id="0c57668f-5a90-49bd-af4c-06987e8773a4" uname="storage-2" 
> type="normal"/>
>      </nodes>
>      <resources>
>        <primitive class="ocf" provider="heartbeat" type="IPaddr" id="ip0">
>          <instance_attributes id="ia-ip0">
>            <attributes>
>              <nvpair name="ip" id="ia-ip0-1" value="10.1.1.2"/>
>            </attributes>
>          </instance_attributes>
>        </primitive>
>        <master_slave id="ms-drbd0">
>          <meta_attributes id="ma-ms-drbd0">
>            <attributes>
>              <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
>              <nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
>              <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
>              <nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
>              <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
>              <nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
>              <nvpair id="ma-ms-drbd0-7" name="target_role" value="started"/>
>            </attributes>
>          </meta_attributes>
>          <primitive class="ocf" provider="heartbeat" type="drbd" id="drbd0">
>            <instance_attributes id="ia-drbd0">
>              <attributes>
>                <nvpair id="ia-drbd0-1" name="drbd_resource" value="r0"/>
>              </attributes>
>            </instance_attributes>
>            <operations>
>              <op name="monitor" timeout="10s" role="Master" id="op-drbd0-1" 
> interval="20s"/>
>              <op name="monitor" timeout="10s" role="Slave" id="op-drbd0-2" 
> interval="21s"/>
>            </operations>
>          </primitive>
>        </master_slave>
>        <primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
>          <instance_attributes id="ia-fs0">
>            <attributes>
>             <nvpair id="ia-fs0-1" name="fstype" value="reiserfs"/>
>              <nvpair id="ia-fs0-2" name="directory" value="/mnt/drbd"/>
>              <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
>              <nvpair id="ia-fs0-4" name="options" 
> value="rw,nosuid,nodev,noatime"/>
>            </attributes>
>          </instance_attributes>
>        </primitive>
>        <primitive id="nfsserver" class="lsb" type="nfs-kernel-server"/>
>      </resources>
>      <constraints>
>        <rsc_colocation id="ip_run" to="ms-drbd0" to_role="master" from="ip0" 
> score="infinity"/>
>        <rsc_colocation id="fs0_on_drbd0" to="ms-drbd0" to_role="master" 
> from="fs0" score="infinity"/>
>        <rsc_colocation id="nfs_run" to="ms-drbd0" to_role="master" 
> from="nfsserver" score="infinity"/>
>        <rsc_order id="start_fs0" from="fs0" action="start" to="ms-drbd0" 
> to_action="promote"/>
>        <rsc_order id="start_nfs" from="nfsserver" action="start" to="fs0" 
> type="after"/>
>        <rsc_order id="start_ip0" from="ip0" action="start" to="fs0" 
> type="after"/>
>        <rsc_order id="stop_fs0" from="fs0" action="stop" to="nfsserver" 
> type="after"/>
>      </constraints>
>    </configuration>
>  </cib>
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems