On 2008-11-21T19:24:02, cyclope cyclope <[EMAIL PROTECTED]> wrote:
> Hi all. Please help.
> I have node1 & node2.
>
>
> Node1 has network interface named bond0.
> Node2 has network interface named eth1.
>
> When Node1 is the only node started, all is OK. But when I start Node2 after
> that, a resource takeover happens.
> The strange thing is that the Heartbeat logs of Node2 contain these lines:
>
> IPaddr2[4478]: 2008/11/21_18:03:23 ERROR: Failed: /usr/lib/heartbeat/findif
> 192.168.0.14/30/bond0/ . Parameter error.
> crmd[4430]: 2008/11/21_18:03:23 WARN: process_lrm_event:lrm.c LRM operation
> (4) stop_0 on IPaddr2_1 Error: (1) unknown error
> IPaddr2[4486]: 2008/11/21_18:03:23 INFO: /sbin/ip -f inet addr add
> 192.168.0.14/30 brd 192.168.0.15 dev eth1 label eth1:0
> IPaddr2[4486]: 2008/11/21_18:03:23 INFO: /sbin/ip link set eth1 up
> IPaddr2[4486]: 2008/11/21_18:03:23 INFO: /usr/lib/heartbeat/send_arp -i 200
> -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.0.14 eth1
> 192.168.0.14 auto 192.168.0.14 ffffffffffff
>
> So, Node2 tries to take over bond0 (which does not exist on Node2; it has
> eth1 instead). And then it brings up eth1, which is OK.
> So the strange things are:
> 1. Why does Node2 perform a takeover after Heartbeat starts on it, while Node1 is
> absolutely OK?
> 2. Why does Node2 monitor bond0, when that is the interface for Node1?
> <group id="group_1">
> <primitive class="ocf" provider="heartbeat" type="IPaddr2"
> id="IPaddr2_1">
> <operations>
> <op timeout="5s" id="IPaddr2_1_mon" name="monitor"
> interval="5s"/>
> </operations>
> <instance_attributes id="IPaddr2_1_inst_attr">
> <attributes>
> <nvpair value="192.168.0.14" id="IPaddr2_1_attr_0" name="ip"/>
> <nvpair value="30" id="IPaddr2_1_attr_1" name="netmask"/>
> <nvpair value="bond0" id="IPaddr2_1_attr_2" name="nic"/>
> <nvpair value="0" id="IPaddr2_1_attr_3" name="iflabel"/>
> </attributes>
> </instance_attributes>
> </primitive>
> </group>
> <group id="group_2">
> <primitive class="ocf" provider="heartbeat" type="IPaddr2"
> id="IPaddr2_2">
> <operations>
> <op timeout="5s" id="IPaddr2_2_mon" name="monitor"
> interval="5s"/>
> </operations>
> <instance_attributes id="IPaddr2_2_inst_attr">
> <attributes>
> <nvpair value="192.168.0.14" id="IPaddr2_2_attr_0" name="ip"/>
> <nvpair value="30" id="IPaddr2_2_attr_1" name="netmask"/>
> <nvpair value="eth1" id="IPaddr2_2_attr_2" name="nic"/>
> <nvpair value="0" id="IPaddr2_2_attr_3" name="iflabel"/>
> </attributes>
> </instance_attributes>
> </primitive>
> </group>
> </resources>
The configuration is very wrong. You have two IP address resources, both
with the same IP but different interfaces. That does not give you
fail-over to the other node: as they are distinct resources, the CRM will
try to start them both at the same time.
You only want to have _one_ IP address resource. (A group with just one
resource in it is quite pointless too; you can simply define the
resource on its own.)
You basically have two options:
1. Rename bond0 & eth1 to have a common name. NIC names are completely
arbitrary and free-form, so this is the best option for you to pursue:
it will make your life much easier, as more and more of the
configuration can then be identical on both nodes.
2. Use rules in your instance_attributes sets to override the nic name on
one node. This works, but is somewhat more complex.
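
A rough sketch of what option 2 could look like, as a single primitive
with no group around it. This assumes the same CIB syntax as in your
configuration (instance_attributes sets carrying a score, an optional
rule, and an attributes block), and it assumes the second node's uname is
literally "node2". All ids are made up for illustration. Please check it
against the DTD shipped with your version (for example with crm_verify)
before loading it:

```xml
<primitive class="ocf" provider="heartbeat" type="IPaddr2" id="IPaddr2_1">
  <operations>
    <op timeout="5s" id="IPaddr2_1_mon" name="monitor" interval="5s"/>
  </operations>
  <!-- Assumption: the set with the higher score is evaluated first and
       applies only where its rule matches, i.e. on the node named node2. -->
  <instance_attributes id="IPaddr2_1_inst_attr_node2" score="2">
    <rule id="IPaddr2_1_rule_node2">
      <expression id="IPaddr2_1_expr_node2" attribute="#uname"
                  operation="eq" value="node2"/>
    </rule>
    <attributes>
      <nvpair value="eth1" id="IPaddr2_1_node2_attr_0" name="nic"/>
    </attributes>
  </instance_attributes>
  <!-- Defaults used everywhere else; nic here is Node1's bond0. -->
  <instance_attributes id="IPaddr2_1_inst_attr" score="1">
    <attributes>
      <nvpair value="192.168.0.14" id="IPaddr2_1_attr_0" name="ip"/>
      <nvpair value="30" id="IPaddr2_1_attr_1" name="netmask"/>
      <nvpair value="bond0" id="IPaddr2_1_attr_2" name="nic"/>
      <nvpair value="0" id="IPaddr2_1_attr_3" name="iflabel"/>
    </attributes>
  </instance_attributes>
</primitive>
```

The idea is that the first set to define a given name wins, so node2
picks up nic=eth1 from the rule-guarded set and takes ip, netmask and
iflabel from the default set, while every other node uses bond0.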
Regards,
Lars
--
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde