Hi all,

I ran into a strange problem when I restarted one node.

To make the problem clear, here is the status of my cluster:

============
Last updated: Fri Jan 11 08:22:52 2008
Current DC: A (d956d657-94c9-4100-9d53-f510628f37cf)
3 Nodes configured.
2 Resources configured.
============

Node: A (f6b3f735-e4e2-4cc0-9fcb-dbb3caac23de): online
Node: B (da203400-019e-48a8-ba98-30a1e9cd45c0): online
Node: C (d956d657-94c9-4100-9d53-f510628f37cf): online

Resource Group: group0
    10.170.2.76 (heartbeat::ocf:IPaddr2):       Started A
    10.170.2.86 (heartbeat::ocf:IPaddr2):       Started A
Resource Group: group2
    10.170.2.75 (heartbeat::ocf:IPaddr2):       Started B
    10.170.2.85 (heartbeat::ocf:IPaddr2):       Started B

Usually, when node B or C is restarted, there is no impact on node A's
resources. But sometimes, when node B or C is restarted, the second IP on
node A is stopped and started again. I am confused by this. Why does it
happen, and why only sometimes?

Here is the relevant log:
Jan  8 10:43:15 ipmuxa1 cib: [26916]: info: write_cib_contents: Wrote
version 0.562.1 of the CIB to disk (digest:
c19dd340ca5c9aa8911174bbd1074700)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: do_election_count_vote: Election
check: vote from ipmuxa3
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: update_dc: Set DC to <null>
(<null>)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: do_state_transition: State
transition S_NOT_DC -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
origin=do_election_count_vote ]
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: update_dc: Set DC to <null>
(<null>)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: update_dc: Set DC to ipmuxa3 (
1.0.9)
Jan  8 10:43:15 ipmuxa1 cib: [26921]: info: write_cib_contents: Wrote
version 0.563.2 of the CIB to disk (digest:
a3828851ff29ff77460a6f9661e28d88)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: update_dc: Set DC to ipmuxa3 (
1.0.9)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: build_operation_update: Digest
for 0:0;10:25:2a5a8dd1-b120-405a-94bc-b360ea7818e5 (10.3.0.100_start_0) was
473fad6250c8c4233966de2a2d021381
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: log_data_element:
build_operation_update: digest:source <parameters target_role="started" ip="
10.3.0.100" cidr_netmask="24" nic="enet0.20"/>
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: build_operation_update: Digest
for 0:0;13:25:2a5a8dd1-b120-405a-94bc-b360ea7818e5 (10.1.10.100_start_0) was
749901939c7895ec4a92dc503070ed35
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: log_data_element:
build_operation_update: digest:source <parameters target_role="started" ip="
10.1.10.100" cidr_netmask="24" nic="enet0.15"/>
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: do_state_transition: State
transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE
origin=do_cl_join_finalize_respond ]
Jan  8 10:43:15 ipmuxa1 cib: [4419]: info: cib_replace_notify: Local-only
Replace: 0.564.1 from <null>
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: populate_cib_nodes: Requesting
the list of configured nodes
Jan  8 10:43:15 ipmuxa1 cib: [26922]: info: write_cib_contents: Wrote
version 0.564.5 of the CIB to disk (digest:
02f9b4756748799b383bc8a5280f003d)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: notice: populate_cib_nodes: Node:
ipmuxa3 (uuid: 58d8c03c-b53d-4a05-9582-ef362238b402)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: notice: populate_cib_nodes: Node:
ipmuxa2 (uuid: 5579a5ae-ab27-4611-88bc-cf081c2dc843)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: notice: populate_cib_nodes: Node:
ipmuxa1 (uuid: 6e18ea82-d2fa-479f-81b2-77656ff4c0a1)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: do_lrm_rsc_op: Performing op=
10.1.10.100_stop_0 key=14:3:920eb4ce-1319-4947-be0d-6f7db14ab49e)
Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: process_lrm_event: LRM operation
10.1.10.100_monitor_1000 (call=9, rc=-2) Cancelled
Jan  8 10:43:15 ipmuxa1 cib: [26930]: info: write_cib_contents: Wrote
version 0.565.2 of the CIB to disk (digest:
56bccc5a0744e868a3592b0a5354b26f)
Jan  8 10:43:16 ipmuxa1 IPaddr2[26927]: [26991]: INFO: ip -f inet addr
delete 10.1.10.100/24 dev enet0.15
Jan  8 10:43:16 ipmuxa1 kernel: enet0.15: del 01:00:5e:00:00:01 mcast
address from master interface
Jan  8 10:43:16 ipmuxa1 IPaddr2[26927]: [26993]: INFO: ip -o -f inet addr
show enet0.15
Jan  8 10:43:16 ipmuxa1 IPaddr2[26927]: [26995]: INFO: ip link set enet0.15 down
Jan  8 10:43:16 ipmuxa1 kernel: enet0.15: del 33:33:ff:25:a8:b4 mcast
address from vlan interface
Jan  8 10:43:16 ipmuxa1 kernel: enet0.15: del 33:33:ff:25:a8:b4 mcast
address from master interface
Jan  8 10:43:16 ipmuxa1 kernel: enet0.15: del 33:33:00:00:00:01 mcast
address from vlan interface
Jan  8 10:43:16 ipmuxa1 kernel: enet0.15: del 33:33:00:00:00:01 mcast
address from master interface

I found "Jan  8 10:43:16 ipmuxa1 IPaddr2[26927]: [26991]: INFO: ip -f inet
addr delete 10.1.10.100/24 dev enet0.15" in the log, which means the IP
address 10.1.10.100 was removed from that interface. But why?
Just before it there is another line that looks strange to me:
"process_lrm_event: LRM operation 10.1.10.100_monitor_1000 (call=9, rc=-2)
Cancelled". Is this the cause? What does this message mean?
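
To make the order of events easier to see, the LRM requests and results can
be pulled out of the log with a small script. This is only a rough sketch:
the regular expressions are written against the lines quoted above, the
function names (join_wrapped, lrm_timeline) are my own and not part of
Heartbeat, and it assumes wrapped lines are continuations of the previous
timestamped entry.

```python
import re

# A new syslog entry starts with a timestamp such as "Jan  8 10:43:15".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} ")

# crmd asking the LRM to run an operation ("Performing op=...").
PERFORM = re.compile(r"do_lrm_rsc_op: Performing op=\s*(\S+)")

# crmd reporting the outcome of an LRM operation.
EVENT = re.compile(
    r"process_lrm_event: LRM operation\s+(\S+) "
    r"\(call=(\d+), rc=(-?\d+)\)\s*(\w+)"
)

# A few wrapped lines copied from the log above (assumption: same wrapping).
SAMPLE = [
    "Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: do_lrm_rsc_op: Performing op=",
    "10.1.10.100_stop_0 key=14:3:920eb4ce-1319-4947-be0d-6f7db14ab49e)",
    "Jan  8 10:43:15 ipmuxa1 crmd: [4428]: info: process_lrm_event: LRM operation",
    "10.1.10.100_monitor_1000 (call=9, rc=-2) Cancelled",
    "Jan  8 10:43:16 ipmuxa1 IPaddr2[26927]: [26991]: INFO: ip -f inet addr",
    "delete 10.1.10.100/24 dev enet0.15",
]


def join_wrapped(lines):
    """Re-join entries that were wrapped over several lines."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if ENTRY_START.match(line):
            entries.append(line)
        elif entries:
            # Continuation line: a space stands in for the line break.
            entries[-1] += " " + line
        # Anything before the first timestamped entry is ignored.
    return entries


def lrm_timeline(lines):
    """Return (time, kind, operation, detail) tuples in log order."""
    timeline = []
    for entry in join_wrapped(lines):
        time = entry.split()[2]  # third field is HH:MM:SS
        match = PERFORM.search(entry)
        if match:
            timeline.append((time, "request", match.group(1), ""))
            continue
        match = EVENT.search(entry)
        if match:
            op, call, rc, status = match.groups()
            detail = f"call={call} rc={rc} {status}"
            timeline.append((time, "result", op, detail))
    return timeline


if __name__ == "__main__":
    for row in lrm_timeline(SAMPLE):
        print(*row)
```

Run on the sample lines, it lists the 10.1.10.100_stop_0 request first and
the cancelled 10.1.10.100_monitor_1000 result second, the same order as in
the log. It does not explain why the stop was requested.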

Thanks a lot
Mingdao Lu
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
