On 01/26/2011 10:10 AM, Dan Frincu wrote:
> Hi Steve,
> 
> On Wed, Jan 26, 2011 at 6:53 PM, Steven Dake <[email protected]> wrote:
> 
>     Gather from state 3 back to back is an indicator that iptables are not
>     properly configured on the node.  I know you said iptables are turned
>     off, but if iptables are off, the node would at least form a
>     singleton ring.
> 
>     Could you send your config file?
> 
> 
> I haven't read about the MCP deployment model that you've mentioned in
> your previous email. I'd like to know more, so if you can point me to
> the right documentation I'd appreciate it.
> 

http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for

The udpu mode has not been tested with rrp at all, by me or by anyone I
know of.  In the config you're using, the same port is set on both
rings.  The port numbers should be different at a minimum.  Give that a
go and see if it solves your problems.
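
Something along these lines in corosync.conf is what I mean, i.e. a
different mcastport per ring.  This is only a rough sketch: the second
ring's subnet and the second node's addresses below are placeholders, so
adjust them to your actual networks.  If I remember right corosync can
also use mcastport - 1, so a gap of at least two between the two rings'
ports doesn't hurt.

totem {
        version: 2
        transport: udpu
        rrp_mode: active
        interface {
                ringnumber: 0
                bindnetaddr: 10.0.2.0
                mcastport: 5405
                member {
                        memberaddr: 10.0.2.11
                }
                member {
                        memberaddr: 10.0.2.12
                }
        }
        interface {
                # ring 1: placeholder network, use your real second subnet
                ringnumber: 1
                bindnetaddr: 10.0.3.0
                mcastport: 5407
                member {
                        memberaddr: 10.0.3.11
                }
                member {
                        memberaddr: 10.0.3.12
                }
        }
}

The rest of your totem options can stay as they are; the relevant change
is only that each ring gets its own port.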

Regards
-steve

> Here is the config file, iptables output, corosync-cfgtool -s,
> corosync-fplay: http://pastebin.com/bChQZgaE
> 
> Let me know if there's anything else that I can provide.
> 
> Regards,
> Dan
>  
> 
>     Regards
>     -steve
> 
> 
>     On 01/26/2011 09:01 AM, Dan Frincu wrote:
>     > Update: increased verbosity to debug and I get the following
>     >
>     > Jan 26 16:36:45 corosync [TOTEM ] The consensus timeout expired.
>     > Jan 26 16:36:45 corosync [TOTEM ] entering GATHER state from 3.
>     > Jan 26 16:36:53 corosync [TOTEM ] The consensus timeout expired.
>     > Jan 26 16:36:53 corosync [TOTEM ] entering GATHER state from 3.
>     > Jan 26 16:37:00 corosync [TOTEM ] The consensus timeout expired.
>     > Jan 26 16:37:00 corosync [TOTEM ] entering GATHER state from 3.
>     > Jan 26 16:37:08 corosync [TOTEM ] The consensus timeout expired.
>     > Jan 26 16:37:08 corosync [TOTEM ] entering GATHER state from 3.
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: ERROR: crm_timer_popped:
>     > Integration Timer (I_INTEGRATED) just popped!
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: info: crm_timer_popped:
>     > Welcomed: 1, Integrated: 0
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: s_crmd_fsa: Processing
>     > I_INTEGRATED: [ state=S_INTEGRATION cause=C_TIMER_POPPED
>     > origin=crm_timer_popped ]
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: info: do_state_transition:
>     State
>     > transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED
>     > cause=C_TIMER_POPPED origin=crm_timer_popped ]
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: WARN: do_state_transition:
>     > Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: WARN: do_state_transition: 1
>     > cluster nodes failed to respond to the join offer.
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: info: ghash_print_node:
>     > Welcome reply not received from: cluster1 42
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_DC_TIMER_STOP
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_INTEGRATE_TIMER_STOP
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_FINALIZE_TIMER_START
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: crm_timer_start:
>     Started
>     > Finalization Timer (I_ELECTION:1800000ms), src=102
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_DC_JOIN_FINALIZE
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_dc_join_finalize:
>     > Finializing join-42 for 0 clients
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: s_crmd_fsa: Processing
>     > I_ELECTION_DC: [ state=S_FINALIZE_JOIN cause=C_FSA_INTERNAL
>     > origin=do_dc_join_finalize ]
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_WARN
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: WARN: do_log: FSA: Input
>     > I_ELECTION_DC from do_dc_join_finalize() received in state
>     S_FINALIZE_JOIN
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: info: do_state_transition:
>     State
>     > transition S_FINALIZE_JOIN -> S_INTEGRATION [ input=I_ELECTION_DC
>     > cause=C_FSA_INTERNAL origin=do_dc_join_finalize ]
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_DC_TIMER_STOP
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_INTEGRATE_TIMER_START
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: crm_timer_start:
>     Started
>     > Integration Timer (I_INTEGRATED:180000ms), src=103
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_FINALIZE_TIMER_STOP
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_ELECTION_VOTE
>     > Jan 26 16:37:12 corosync [TOTEM ] mcasted message added to pending
>     queue
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_election_vote:
>     Started
>     > election 44
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
>     > actions:trace:    // A_DC_JOIN_OFFER_ALL
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: initialize_join:
>     join-43:
>     > Initializing join data (flag=true)
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: join_make_offer:
>     join-43:
>     > Sending offer to cluster1
>     > Jan 26 16:37:12 cluster1 crmd: [16266]: info: do_dc_join_offer_all:
>     > join-43: Waiting on 1 outstanding join acks
>     > Jan 26 16:37:15 corosync [TOTEM ] The consensus timeout expired.
>     > Jan 26 16:37:16 corosync [TOTEM ] entering GATHER state from 3.
>     > Jan 26 16:37:23 corosync [TOTEM ] The consensus timeout expired.
>     > Jan 26 16:37:23 corosync [TOTEM ] entering GATHER state from 3.
>     > Jan 26 16:37:31 corosync [TOTEM ] The consensus timeout expired.
>     >
>     > Running corosync-blackbox gives me:
>     >
>     > # corosync-blackbox
>     > Starting replay: head [67420] tail [0]
>     > rec=[1] Log Message=Corosync Cluster Engine ('1.3.0'): started and
>     ready
>     > to provide service.
>     > rec=[2] Log Message=Corosync built-in features: nss rdma
>     > rec=[3] Log Message=Successfully read main configuration file
>     > '/etc/corosync/corosync.conf'.
>     > rec=[4] Log Message=Token Timeout (5000 ms) retransmit timeout
>     (490 ms)
>     > rec=[5] Log Message=token hold (382 ms) retransmits before loss
>     (10 retrans)
>     > rec=[6] Log Message=join (1000 ms) send_join (45 ms) consensus
>     (2500 ms)
>     > merge (200 ms)
>     > rec=[7] Log Message=downcheck (1000 ms) fail to recv const (50 msgs)
>     > rec=[8] Log Message=seqno unchanged const (30 rotations) Maximum
>     network
>     > MTU 1402
>     > rec=[9] Log Message=window size per rotation (50 messages) maximum
>     > messages per rotation (25 messages)
>     > rec=[10] Log Message=send threads (0 threads)
>     > rec=[11] Log Message=RRP token expired timeout (490 ms)
>     > rec=[12] Log Message=RRP token problem counter (2000 ms)
>     > rec=[13] Log Message=RRP threshold (10 problem count)
>     > rec=[14] Log Message=RRP mode set to active.
>     > rec=[15] Log Message=heartbeat_failures_allowed (0)
>     > rec=[16] Log Message=max_network_delay (50 ms)
>     > rec=[17] Log Message=HeartBeat is Disabled. To enable set
>     > heartbeat_failures_allowed > 0
>     > rec=[18] Log Message=Initializing transport (UDP/IP Unicast).
>     > rec=[19] Log Message=Initializing transmit/receive security:
>     libtomcrypt
>     > SOBER128/SHA1HMAC (mode 0).
>     > rec=[20] Log Message=Initializing transport (UDP/IP Unicast).
>     > rec=[21] Log Message=Initializing transmit/receive security:
>     libtomcrypt
>     > SOBER128/SHA1HMAC (mode 0).
>     > rec=[22] Log Message=you are using ipc api v2
>     > rec=[23] Log Message=The network interface [10.0.2.11] is now up.
>     > rec=[24] Log Message=Created or loaded sequence id 0.10.0.2.11 for
>     this
>     > ring.
>     > rec=[25] Log Message=debug: pcmk_user_lookup: Cluster user root has
>     > uid=0 gid=0
>     > rec=[26] Log Message=info: process_ais_conf: Reading configure
>     > rec=[27] Log Message=info: config_find_init: Local handle:
>     > 4552499517957603332 for logging
>     > rec=[28] Log Message=info: config_find_next: Processing additional
>     > logging options...
>     > rec=[29] Log Message=info: get_config_opt: Found 'on' for option:
>     debug
>     > rec=[30] Log Message=info: get_config_opt: Found 'yes' for option:
>     > to_logfile
>     > rec=[31] Log Message=info: get_config_opt: Found
>     > '/var/log/cluster/corosync.log' for option: logfile
>     > rec=[32] Log Message=info: get_config_opt: Found 'no' for option:
>     to_syslog
>     > rec=[33] Log Message=info: process_ais_conf: User configured file
>     based
>     > logging and explicitly disabled syslog.
>     > rec=[34] Log Message=info: config_find_init: Local handle:
>     > 8972265949260414981 for service
>     > rec=[35] Log Message=info: config_find_next: Processing additional
>     > service options...
>     > rec=[36] Log Message=info: get_config_opt: Defaulting to 'pcmk' for
>     > option: clustername
>     > rec=[37] Log Message=info: get_config_opt: Found 'no' for option:
>     use_logd
>     > rec=[38] Log Message=info: get_config_opt: Found 'yes' for option:
>     use_mgmtd
>     > rec=[39] Log Message=info: pcmk_startup: CRM: Initialized
>     > rec=[40] Log Message=Logging: Initialized pcmk_startup
>     > rec=[41] Log Message=info: pcmk_startup: Maximum core file size is:
>     > 18446744073709551615
>     > rec=[42] Log Message=debug: pcmk_user_lookup: Cluster user
>     hacluster has
>     > uid=101 gid=102
>     > rec=[43] Log Message=info: pcmk_startup: Service: 9
>     > rec=[44] Log Message=info: pcmk_startup: Local hostname: cluster1
>     > rec=[45] Log Message=info: pcmk_update_nodeid: Local node id:
>     184680458
>     > rec=[46] Log Message=info: update_member: Creating entry for node
>     > 184680458 born on 0
>     > rec=[47] Log Message=info: update_member: 0x5fe73e0 Node 184680458 now
>     > known as cluster1 (was: (null))
>     > rec=[48] Log Message=info: update_member: Node cluster1 now has 1
>     quorum
>     > votes (was 0)
>     > rec=[49] Log Message=info: update_member: Node 184680458/cluster1 is
>     > now: member
>     > rec=[50] Log Message=info: spawn_child: Forked child 16261 for process
>     > stonithd
>     > rec=[51] Log Message=debug: pcmk_user_lookup: Cluster user
>     hacluster has
>     > uid=101 gid=102
>     > rec=[52] Log Message=info: spawn_child: Forked child 16262 for
>     process cib
>     > rec=[53] Log Message=info: spawn_child: Forked child 16263 for
>     process lrmd
>     > rec=[54] Log Message=debug: pcmk_user_lookup: Cluster user
>     hacluster has
>     > uid=101 gid=102
>     > rec=[55] Log Message=info: spawn_child: Forked child 16264 for
>     process attrd
>     > rec=[56] Log Message=debug: pcmk_user_lookup: Cluster user
>     hacluster has
>     > uid=101 gid=102
>     > rec=[57] Log Message=info: spawn_child: Forked child 16265 for process
>     > pengine
>     > rec=[58] Log Message=debug: pcmk_user_lookup: Cluster user
>     hacluster has
>     > uid=101 gid=102
>     > rec=[59] Log Message=info: spawn_child: Forked child 16266 for
>     process crmd
>     > *rec=[60] Log Message=spawn_child: FATAL: Cannot exec
>     > /usr/lib64/heartbeat/mgmtd: (2) No such file or directory*
>     > *** buffer overflow detected ***: corosync-fplay terminated
>     > ======= Backtrace: =========
>     > /lib64/libc.so.6(__chk_fail+0x2f)[0x37a72e6c2f]
>     > corosync-fplay[0x400c0b]
>     > /lib64/libc.so.6(__libc_start_main+0xf4)[0x37a721d974]
>     > corosync-fplay[0x4008d9]
>     > ======= Memory map: ========
>     > 00400000-00402000 r-xp 00000000 08:05 1041788
>     >  /usr/sbin/corosync-fplay
>     > 00602000-00603000 rw-p 00002000 08:05 1041788
>     >  /usr/sbin/corosync-fplay
>     > 00603000-0060d000 rw-p 00603000 00:00 0
>     > 0af45000-0af66000 rw-p 0af45000 00:00 0
>     >  [heap]
>     > 3190400000-319040d000 r-xp 00000000 08:02 911558
>     > /lib64/libgcc_s-4.1.2-20080825.so.1
>     > 319040d000-319060d000 ---p 0000d000 08:02 911558
>     > /lib64/libgcc_s-4.1.2-20080825.so.1
>     > 319060d000-319060e000 rw-p 0000d000 08:02 911558
>     > /lib64/libgcc_s-4.1.2-20080825.so.1
>     > 37a6e00000-37a6e1c000 r-xp 00000000 08:02 911525
>     > /lib64/ld-2.5.so
>     > 37a701b000-37a701c000 r--p 0001b000 08:02 911525
>     > /lib64/ld-2.5.so
>     > 37a701c000-37a701d000 rw-p 0001c000 08:02 911525
>     > /lib64/ld-2.5.so
>     > 37a7200000-37a734c000 r-xp 00000000 08:02 911526
>     > /lib64/libc-2.5.so
>     > 37a734c000-37a754c000 ---p 0014c000 08:02 911526
>     > /lib64/libc-2.5.so
>     > 37a754c000-37a7550000 r--p 0014c000 08:02 911526
>     > /lib64/libc-2.5.so
>     > 37a7550000-37a7551000 rw-p 00150000 08:02 911526
>     > /lib64/libc-2.5.so
>     > 37a7551000-37a7556000 rw-p 37a7551000 00:00 0
>     > 37a7600000-37a7602000 r-xp 00000000 08:02 911527
>     > /lib64/libdl-2.5.so
>     > 37a7602000-37a7802000 ---p 00002000 08:02 911527
>     > /lib64/libdl-2.5.so
>     > 37a7802000-37a7803000 r--p 00002000 08:02 911527
>     > /lib64/libdl-2.5.so
>     > 37a7803000-37a7804000 rw-p 00003000 08:02 911527
>     > /lib64/libdl-2.5.so
>     > 37a7e00000-37a7e16000 r-xp 00000000 08:02 911531
>     > /lib64/libpthread-2.5.so
>     > 37a7e16000-37a8015000 ---p 00016000 08:02 911531
>     > /lib64/libpthread-2.5.so
>     > 37a8015000-37a8016000 r--p 00015000 08:02 911531
>     > /lib64/libpthread-2.5.so
>     > 37a8016000-37a8017000 rw-p 00016000 08:02 911531
>     > /lib64/libpthread-2.5.so
>     > 37a8017000-37a801b000 rw-p 37a8017000 00:00 0
>     > 37a8e00000-37a8e07000 r-xp 00000000 08:02 911532
>     > /lib64/librt-2.5.so
>     > 37a8e07000-37a9007000 ---p 00007000 08:02 911532
>     > /lib64/librt-2.5.so
>     > 37a9007000-37a9008000 r--p 00007000 08:02 911532
>     > /lib64/librt-2.5.so
>     > 37a9008000-37a9009000 rw-p 00008000 08:02 911532
>     > /lib64/librt-2.5.so
>     > 2b3bae55d000-2b3bae55e000 rw-p 2b3bae55d000 00:00 0
>     > 2b3bae56a000-2b3bae943000 rw-p 2b3bae56a000 00:00 0
>     > 7ffffc537000-7ffffc54c000 rw-p 7ffffffea000 00:00 0
>     >  [stack]
>     > ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0
>     >  [vdso]
>     > /usr/bin/corosync-blackbox: line 34: 16676 Aborted
>     > corosync-fplay
>     >
>     > I see the error message with mgmtd however I've performed the same
>     test
>     > on a pair of XEN VM's with the exact same packages (clean install, no
>     > upgrade from openais-0.80 like the real hardware) and mgmtd doesn't
>     > exist either, but it says
>     >
>     > rec=[55] Log Message=info: spawn_child: Forked child 4459 for
>     process mgmtd
>     >
>     > # ll /usr/lib64/heartbeat/mgmtd
>     > ls: /usr/lib64/heartbeat/mgmtd: No such file or directory
>     >
>     > # rpm -ql pacemaker-1.0.10-1.4 | grep mgm
>     > /usr/lib64/python2.4/site-packages/crm/idmgmt.py
>     > /usr/lib64/python2.4/site-packages/crm/idmgmt.pyc
>     > /usr/lib64/python2.4/site-packages/crm/idmgmt.pyo
>     >
>     > Anyone?
>     >
>     > Regards,
>     > Dan
>     >
>     > On Wed, Jan 26, 2011 at 1:35 PM, Dan Frincu <[email protected]> wrote:
>     >
>     >     Hi,
>     >
>     >     I've got a pair of servers running on RHEL5 x86_64 with
>     openais-0.80
>     >     (older install) which I want to upgrade to corosync-1.3.0 +
>     >     pacemaker-1.0.10. Downtime is not an issue and corosync 1.3.0 is
>     >     needed for UDPU, so I built it from the corosync.org website.
>     >
>     >     With pacemaker, we won't be using the heartbeat stack, so I built
>     >     the pacemaker package from the clusterlabs.org src.rpm without
>     >     heartbeat support. To be
>     >     more precise I used
>     >
>     >     rpmbuild --without heartbeat --with ais --with snmp --with
>     esmtp -ba
>     >     pacemaker-epel.spec
>     >
>     >     Now I've tested the rpm list below on a pair of XEN VM's, it works
>     >     just fine.
>     >
>     >     cluster-glue-1.0.6-1.6.el5.x86_64.rpm
>     >     cluster-glue-libs-1.0.6-1.6.el5.x86_64.rpm
>     >     corosync-1.3.0-1.x86_64.rpm
>     >     corosynclib-1.3.0-1.x86_64.rpm
>     >     libesmtp-1.0.4-5.el5.x86_64.rpm
>     >     libibverbs-1.1.2-1.el5.x86_64.rpm
>     >     librdmacm-1.0.8-1.el5.x86_64.rpm
>     >     libtool-ltdl-1.5.22-6.1.x86_64.rpm
>     >     openais-1.1.4-2.x86_64.rpm
>     >     openaislib-1.1.4-2.x86_64.rpm
>     >     openhpi-2.10.2-1.el5.x86_64.rpm
>     >     openib-1.3.2-0.20080728.0355.3.el5.noarch.rpm
>     >     pacemaker-1.0.10-1.4.x86_64.rpm
>     >     pacemaker-libs-1.0.10-1.4.x86_64.rpm
>     >     perl-TimeDate-1.16-5.el5.noarch.rpm
>     >     resource-agents-1.0.3-2.6.el5.x86_64.rpm
>     >
>     >     However when performing the upgrade on the servers running
>     >     openais-0.80, first I removed the heartbeat, heartbeat-libs and
>     >     PyXML rpms (conflicting dependencies issue) then rpm -Uvh on
>     the rpm
>     >     list above. Installation went fine, removed existing cib.xml and
>     >     signatures, fresh start. Then I configured corosync, then
>     started it
>     >     on both servers, and nothing. At first I got an error related to
>     >     pacemaker mgmt, which was an old package installed with the old
>     >     rpms. Removed it, tried again. Nothing. Removed all cluster
>     related
>     >     rpms old and new + deps, except for DRBD, then installed the list
>     >     above, then again, nothing. What nothing means:
>     >     - corosync starts, never elects DC, never sees the other node or
>     >     itself for that matter.
>     >     - can stop corosync via the init script, it goes into an endless
>     >     phase where it just prints dots to the screen, have to kill the
>     >     process to make it stop.
>     >
>     >     Troubleshooting done so far:
>     >     - tested network sockets (nc from side to side), firewall rules
>     >     (iptables down), communication is ok
>     >     - searched for the original RPM's list, removed all remaining
>     RPMs,
>     >     ran ldconfig, removed new RPM's, installed new RPM's
>     >
>     >     My guess is that there are some leftovers from the old
>     openais-0.80
>     >     installation, which mess with the current installation, seeing as
>     >     how the same set of RPMs on a pair of XEN VM's with the same
>     OS work
>     >     fine, however I cannot put my finger on the culprit for the real
>     >     servers' issue.
>     >
>     >     Logs: http://pastebin.com/i0maZM4p
>     >
>     >     Removed everything after removing the RPM's, just to be extra
>     >     paranoid about leftovers (rpm -qpl *.rpm >> file && for i in `cat
>     >     file `; do [[ -e "$i" ]] && echo "$i" >> newfile ; done && for
>     i in
>     >     `cat newfile` ; do rm -rf $i ; done)
>     >
>     >     Installed RPMs (without openais)
>     >
>     >     Same output
>     >
>     >     http://pastebin.com/3iPHSXua
>     >
>     >     It seems to go into some sort of loop.
>     >
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: ERROR: crm_timer_popped:
>     >     Integration Timer (I_INTEGRATED) just popped!
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: info: crm_timer_popped:
>     >     Welcomed: 1, Integrated: 0
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: info: do_state_transition:
>     >     State transition S_INTEGRATION -> S_FINALIZE_JOIN [
>     >     input=I_INTEGRATED cause=C_TIMER_POPPED origin=crm_timer_popped ]
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: WARN: do_state_transition:
>     >     Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: WARN:
>     do_state_transition: 1
>     >     cluster nodes failed to respond to the join offer.
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: info: ghash_print_node:
>     >     Welcome reply not received from: cluster1 7
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: WARN: do_log: FSA: Input
>     >     I_ELECTION_DC from do_dc_join_finalize() received in state
>     >     S_FINALIZE_JOIN
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: info: do_state_transition:
>     >     State transition S_FINALIZE_JOIN -> S_INTEGRATION [
>     >     input=I_ELECTION_DC cause=C_FSA_INTERNAL
>     origin=do_dc_join_finalize ]
>     >     Jan 26 12:13:41 cluster1 crmd: [15612]: info:
>     do_dc_join_offer_all:
>     >     join-8: Waiting on 1 outstanding join acks
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: ERROR: crm_timer_popped:
>     >     Integration Timer (I_INTEGRATED) just popped!
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: info: crm_timer_popped:
>     >     Welcomed: 1, Integrated: 0
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: info: do_state_transition:
>     >     State transition S_INTEGRATION -> S_FINALIZE_JOIN [
>     >     input=I_INTEGRATED cause=C_TIMER_POPPED origin=crm_timer_popped ]
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: WARN: do_state_transition:
>     >     Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: WARN:
>     do_state_transition: 1
>     >     cluster nodes failed to respond to the join offer.
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: info: ghash_print_node:
>     >     Welcome reply not received from: cluster1 8
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: WARN: do_log: FSA: Input
>     >     I_ELECTION_DC from do_dc_join_finalize() received in state
>     >     S_FINALIZE_JOIN
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: info: do_state_transition:
>     >     State transition S_FINALIZE_JOIN -> S_INTEGRATION [
>     >     input=I_ELECTION_DC cause=C_FSA_INTERNAL
>     origin=do_dc_join_finalize ]
>     >     Jan 26 12:16:41 cluster1 crmd: [15612]: info:
>     do_dc_join_offer_all:
>     >     join-9: Waiting on 1 outstanding join acks
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: ERROR: crm_timer_popped:
>     >     Integration Timer (I_INTEGRATED) just popped!
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: info: crm_timer_popped:
>     >     Welcomed: 1, Integrated: 0
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: info: do_state_transition:
>     >     State transition S_INTEGRATION -> S_FINALIZE_JOIN [
>     >     input=I_INTEGRATED cause=C_TIMER_POPPED origin=crm_timer_popped ]
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: WARN: do_state_transition:
>     >     Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: WARN:
>     do_state_transition: 1
>     >     cluster nodes failed to respond to the join offer.
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: info: ghash_print_node:
>     >     Welcome reply not received from: cluster1 9
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: WARN: do_log: FSA: Input
>     >     I_ELECTION_DC from do_dc_join_finalize() received in state
>     >     S_FINALIZE_JOIN
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: info: do_state_transition:
>     >     State transition S_FINALIZE_JOIN -> S_INTEGRATION [
>     >     input=I_ELECTION_DC cause=C_FSA_INTERNAL
>     origin=do_dc_join_finalize ]
>     >     Jan 26 12:19:41 cluster1 crmd: [15612]: info:
>     do_dc_join_offer_all:
>     >     join-10: Waiting on 1 outstanding join acks
>     >     Jan 26 12:20:11 cluster1 cib: [15608]: info: cib_stats:
>     Processed 1
>     >     operations (0.00us average, 0% utilization) in the last 10min
>     >
>     >     Any suggestions?
>     >
>     >     TIA.
>     >
>     >     Regards,
>     >     Dan
>     >
>     >     --
>     >     Dan Frîncu
>     >     CCNA, RHCE
>     >
>     >
>     >
>     >
>     > --
>     > Dan Frîncu
>     > CCNA, RHCE
>     >
>     >
>     >
>     > _______________________________________________
>     > Openais mailing list
>     > [email protected]
>     > https://lists.linux-foundation.org/mailman/listinfo/openais
> 
> 
> 
> 
> -- 
> Dan Frîncu
> CCNA, RHCE
> 

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
