On Fri, Apr 16, 2010 at 3:28 PM, Haussecker, Armin <[email protected]> wrote:
> Hi,
>
> we have a 2-node-cluster based on SLES11 , openais (0.80.3-26.8.1) and
> pacemaker (1.0.5-0.5.6).

You're best off contacting Novell support for older versions.
There's really not enough in the log fragments below to make any meaningful comment, but if you _attach_ the complete logs we might be able to help.

> Sometimes the failover from one node (named
> cuzzonib) to the second node (named cuzzonia) fails with the following
> messages:
>
> Apr 16 13:16:14 cuzzonib lrmd: [6706]: info: Try to stop STONITH resource
> <rsc_id=iRMC_cuzzoniaInstance:0> : Device=external/ipmi
> Apr 16 13:16:14 cuzzonib crmd: [18479]: info: process_lrm_event: LRM
> operation iRMC_cuzzoniaInstance:0_stop_0 (call=51, rc=0, cib-update=108,
> confirmed=true) ok
> Apr 16 13:16:14 cuzzonib crmd: [18479]: info: match_graph_event: Action
> iRMC_cuzzoniaInstance:0_stop_0 (25) confirmed on cuzzonib (rc=0)
>
> Apr 16 13:16:14 cuzzonib crmd: [18479]: info: te_pseudo_action: Pseudo
> action 29 fired and confirmed
> Apr 16 13:16:14 cuzzonib crmd: [18479]: info: te_crm_command: Executing
> crm-event (79): do_shutdown on cuzzonib
> Apr 16 13:16:14 cuzzonib crmd: [18479]: info: te_crm_command: crm-event (79)
> is a local shutdown
>
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/xen-hotplug-cleanup:
> XENBUS_PATH=backend/vkbd/4/0
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/xen-hotplug-cleanup:
> XENBUS_PATH=backend/console/4/0
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/xen-hotplug-cleanup:
> XENBUS_PATH=backend/vfb/4/0
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/xen-hotplug-cleanup:
> XENBUS_PATH=backend/vif/4/0
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/block: remove
> XENBUS_PATH=backend/vbd/4/51712
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/block: remove
> XENBUS_PATH=backend/vbd/4/51744
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/xen-hotplug-cleanup:
> XENBUS_PATH=backend/vbd/4/51712
> Apr 16 13:16:17 cuzzonib logger: /etc/xen/scripts/xen-hotplug-cleanup:
> XENBUS_PATH=backend/vbd/4/51744
>
> Apr 16 13:16:32 cuzzonib openais[18468]: [crm ] notice: pcmk_shutdown:
> Still waiting for crmd (pid=18479, seq=6) to terminate...
> Apr 16 13:16:38 cuzzonib openais[18468]: [TOTEM] The token was lost in the
> OPERATIONAL state.
> Apr 16 13:16:38 cuzzonib openais[18468]: [TOTEM] Receive multicast socket
> recv buffer size (262142 bytes).
> Apr 16 13:16:38 cuzzonib openais[18468]: [TOTEM] Transmit multicast socket
> send buffer size (262142 bytes).
> Apr 16 13:16:38 cuzzonib openais[18468]: [TOTEM] entering GATHER state from
> 2.
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] entering GATHER state from
> 0.
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] Creating commit token
> because I am the rep.
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] Saving state aru 14b high
> seq received 14b
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] Storing new sequence id for
> ring bb4
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] entering COMMIT state.
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] entering RECOVERY state.
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] position [0] member
> 192.168.10.5:
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] previous ring seq 2992 rep
> 192.168.10.3
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] aru 14b high delivered 14b
> received flag 1
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] Did not need to originate
> any messages in recovery.
> Apr 16 13:16:58 cuzzonib openais[18468]: [TOTEM] Sending initial ORF token
> Apr 16 13:16:58 cuzzonib openais[18468]: [CLM ] CLM CONFIGURATION CHANGE
> Apr 16 13:16:58 cuzzonib openais[18468]: [CLM ] New Configuration:
> Apr 16 13:16:58 cuzzonib openais[18468]: [CLM ] r(0)
> ip(192.168.10.5)
>
> Apr 16 13:16:58 cuzzonib openais[18468]: [CLM ] Members Left:
> Apr 16 13:16:58 cuzzonib crmd: [18479]: notice: ais_dispatch: Membership
> 2996: quorum lost
> Apr 16 13:16:58 cuzzonib cib: [18475]: notice: ais_dispatch: Membership
> 2996: quorum lost
> Apr 16 13:16:58 cuzzonib crmd: [18479]: info: ais_status_callback: status:
> cuzzonia is now lost (was member)
>
> Apr 16 13:16:58 cuzzonib cib: [18475]: info: crm_update_peer: Node cuzzonia:
> id=51030208 state=lost (new) addr=r(0) ip(192.168.10.3) votes=1 born=2992
> seen=2992 proc=00000000000000000000000000053312
>
> Afterwards the second cluster node (cuzzonia) is rebooted.
> What could be the reason for the problem ?
>
> Regards,
> Armin Haussecker
>
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
