Hi! When a SLES11 SP3 node rejoined a 3-node cluster after a reboot (and a preceding software update), a node with up-to-date software logged these messages (which I feel should not appear):
Jan 20 17:12:38 h10 corosync[13220]: [MAIN ] Completed service synchronization, ready to provide service.
Jan 20 17:12:38 h10 cib[13257]: warning: crm_find_peer: Node 'h01' and 'h01' share the same cluster nodeid: 739512321
Jan 20 17:12:38 h10 cib[13257]: warning: crm_find_peer: Node 'h01' and 'h01' share the same cluster nodeid: 739512321
Jan 20 17:12:38 h10 cib[13257]: warning: crm_find_peer: Node 'h01' and 'h01' share the same cluster nodeid: 739512321
### So why shouldn't the same node be allowed to have the same nodeid?
Jan 20 17:12:38 h10 attrd[13260]: warning: crm_dump_peer_hash: crm_find_peer: Node 84939948/h05 = 0x61ae90 - b6cabbb3-8332-4903-85be-0c06272755ac
Jan 20 17:12:38 h10 attrd[13260]: warning: crm_dump_peer_hash: crm_find_peer: Node 17831084/h01 = 0x61e300 - 11693f38-8125-45f2-b397-86136d5894a4
Jan 20 17:12:38 h10 attrd[13260]: warning: crm_dump_peer_hash: crm_find_peer: Node 739512330/h10 = 0x614400 - 302e33d8-7cee-4f3b-97da-b38f0d51b0f6
### The above are the three nodes of the cluster.
Jan 20 17:12:38 h10 attrd[13260]: crit: crm_find_peer: Node 739512321 and 17831084 share the same name 'h01'
### Now there are two different nodeids, it seems...
Jan 20 17:12:38 h10 attrd[13260]: warning: crm_find_peer: Node 'h01' and 'h01' share the same cluster nodeid: 739512321
Jan 20 17:12:38 h10 cib[13257]: warning: crm_find_peer: Node 'h01' and 'h01' share the same cluster nodeid: 739512321
### The same again...

(pacemaker-1.1.11-0.7.53, corosync-1.4.7-0.19.6)

As a result, node h01 is offline now. Before the software update, the node was a member of the cluster.
On node h01 I see messages like these:

cib[7439]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 84939948
cib[7439]: notice: crm_update_peer_state: plugin_handle_membership: Node (null)[84939948] - state is now member (was (null))
cib[7439]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
cib[7439]: notice: crm_update_peer_state: plugin_handle_membership: Node (null)[739512330] - state is now member (was (null))
crmd[7444]: warning: crmd_cs_dispatch: Receiving messages from a node we think is dead: rksaph05[84939948]
crmd[7444]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
corosync[7402]: [MAIN ] Completed service synchronization, ready to provide service.
crmd[7444]: notice: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]

An attempt to restart openais hung with these messages:

attrd[7442]: notice: attrd_perform_update: Sent update 7: shutdown=1421771193
corosync[7402]: [pcmk ] notice: pcmk_shutdown: Still waiting for crmd (pid=7444, seq=6) to terminate...
[message repeats]

So I killed crmd (pid 7444), and openais shut down. Unfortunately, the problem still persists...

Regards,
Ulrich

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems