On 01/26/2011 10:10 AM, Dan Frincu wrote:
> Hi Steve,
>
> On Wed, Jan 26, 2011 at 6:53 PM, Steven Dake <[email protected]> wrote:
>
> Gather from state 3 back to back is an indicator that iptables are not
> properly configured on the node. I know you said iptables are turned
> off, but if iptables are off, the node would at least form a
> singleton ring.
>
> Could you send your config file?
>
> I haven't read about the MCP deployment model that you've mentioned in
> your previous email; I'd like to know more, so if you can point me to
> the right documentation I'd appreciate it.
>
http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for

The udpu mode has not been tested with rrp at all, by me or anyone that
I know. In the configuration you're using, the same port is set on both
rings; at a minimum, the port numbers should be different (see the
sketches at the bottom of this mail). Give that a go and see if it
solves your problems.

Regards
-steve

> Here is the config file, iptables output, corosync-cfgtool -s,
> corosync-fplay: http://pastebin.com/bChQZgaE
>
> Let me know if there's anything else that I can provide.
>
> Regards,
> Dan
>
> Regards
> -steve
>
> On 01/26/2011 09:01 AM, Dan Frincu wrote:
> > Update: increased verbosity to debug and I get the following
> >
> > Jan 26 16:36:45 corosync [TOTEM ] The consensus timeout expired.
> > Jan 26 16:36:45 corosync [TOTEM ] entering GATHER state from 3.
> > Jan 26 16:36:53 corosync [TOTEM ] The consensus timeout expired.
> > Jan 26 16:36:53 corosync [TOTEM ] entering GATHER state from 3.
> > Jan 26 16:37:00 corosync [TOTEM ] The consensus timeout expired.
> > Jan 26 16:37:00 corosync [TOTEM ] entering GATHER state from 3.
> > Jan 26 16:37:08 corosync [TOTEM ] The consensus timeout expired.
> > Jan 26 16:37:08 corosync [TOTEM ] entering GATHER state from 3.
> > Jan 26 16:37:12 cluster1 crmd: [16266]: ERROR: crm_timer_popped:
> > Integration Timer (I_INTEGRATED) just popped!
> > Jan 26 16:37:12 cluster1 crmd: [16266]: info: crm_timer_popped:
> > Welcomed: 1, Integrated: 0
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: s_crmd_fsa: Processing
> > I_INTEGRATED: [ state=S_INTEGRATION cause=C_TIMER_POPPED
> > origin=crm_timer_popped ]
> > Jan 26 16:37:12 cluster1 crmd: [16266]: info: do_state_transition: State
> > transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED
> > cause=C_TIMER_POPPED origin=crm_timer_popped ]
> > Jan 26 16:37:12 cluster1 crmd: [16266]: WARN: do_state_transition:
> > Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
> > Jan 26 16:37:12 cluster1 crmd: [16266]: WARN: do_state_transition: 1
> > cluster nodes failed to respond to the join offer.
> > Jan 26 16:37:12 cluster1 crmd: [16266]: info: ghash_print_node:
> > Welcome reply not received from: cluster1 42
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_DC_TIMER_STOP
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_INTEGRATE_TIMER_STOP
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_FINALIZE_TIMER_START
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: crm_timer_start: Started
> > Finalization Timer (I_ELECTION:1800000ms), src=102
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_DC_JOIN_FINALIZE
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_dc_join_finalize:
> > Finializing join-42 for 0 clients
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: s_crmd_fsa: Processing
> > I_ELECTION_DC: [ state=S_FINALIZE_JOIN cause=C_FSA_INTERNAL
> > origin=do_dc_join_finalize ]
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_WARN
> > Jan 26 16:37:12 cluster1 crmd: [16266]: WARN: do_log: FSA: Input
> > I_ELECTION_DC from do_dc_join_finalize() received in state S_FINALIZE_JOIN
> > Jan 26 16:37:12 cluster1 crmd: [16266]: info: do_state_transition: State
> > transition S_FINALIZE_JOIN -> S_INTEGRATION [ input=I_ELECTION_DC
> > cause=C_FSA_INTERNAL origin=do_dc_join_finalize ]
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_DC_TIMER_STOP
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_INTEGRATE_TIMER_START
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: crm_timer_start: Started
> > Integration Timer (I_INTEGRATED:180000ms), src=103
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_FINALIZE_TIMER_STOP
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_ELECTION_VOTE
> > Jan 26 16:37:12 corosync [TOTEM ] mcasted message added to pending queue
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_election_vote: Started
> > election 44
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: do_fsa_action:
> > actions:trace: // A_DC_JOIN_OFFER_ALL
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: initialize_join: join-43:
> > Initializing join data (flag=true)
> > Jan 26 16:37:12 cluster1 crmd: [16266]: debug: join_make_offer: join-43:
> > Sending offer to cluster1
> > Jan 26 16:37:12 cluster1 crmd: [16266]: info: do_dc_join_offer_all:
> > join-43: Waiting on 1 outstanding join acks
> > Jan 26 16:37:15 corosync [TOTEM ] The consensus timeout expired.
> > Jan 26 16:37:16 corosync [TOTEM ] entering GATHER state from 3.
> > Jan 26 16:37:23 corosync [TOTEM ] The consensus timeout expired.
> > Jan 26 16:37:23 corosync [TOTEM ] entering GATHER state from 3.
> > Jan 26 16:37:31 corosync [TOTEM ] The consensus timeout expired.
> >
> > Running corosync-blackbox gives me:
> >
> > # corosync-blackbox
> > Starting replay: head [67420] tail [0]
> > rec=[1] Log Message=Corosync Cluster Engine ('1.3.0'): started and ready
> > to provide service.
> > rec=[2] Log Message=Corosync built-in features: nss rdma
> > rec=[3] Log Message=Successfully read main configuration file
> > '/etc/corosync/corosync.conf'.
> > rec=[4] Log Message=Token Timeout (5000 ms) retransmit timeout (490 ms)
> > rec=[5] Log Message=token hold (382 ms) retransmits before loss (10 retrans)
> > rec=[6] Log Message=join (1000 ms) send_join (45 ms) consensus (2500 ms)
> > merge (200 ms)
> > rec=[7] Log Message=downcheck (1000 ms) fail to recv const (50 msgs)
> > rec=[8] Log Message=seqno unchanged const (30 rotations) Maximum network
> > MTU 1402
> > rec=[9] Log Message=window size per rotation (50 messages) maximum
> > messages per rotation (25 messages)
> > rec=[10] Log Message=send threads (0 threads)
> > rec=[11] Log Message=RRP token expired timeout (490 ms)
> > rec=[12] Log Message=RRP token problem counter (2000 ms)
> > rec=[13] Log Message=RRP threshold (10 problem count)
> > rec=[14] Log Message=RRP mode set to active.
> > rec=[15] Log Message=heartbeat_failures_allowed (0)
> > rec=[16] Log Message=max_network_delay (50 ms)
> > rec=[17] Log Message=HeartBeat is Disabled. To enable set
> > heartbeat_failures_allowed > 0
> > rec=[18] Log Message=Initializing transport (UDP/IP Unicast).
> > rec=[19] Log Message=Initializing transmit/receive security: libtomcrypt
> > SOBER128/SHA1HMAC (mode 0).
> > rec=[20] Log Message=Initializing transport (UDP/IP Unicast).
> > rec=[21] Log Message=Initializing transmit/receive security: libtomcrypt
> > SOBER128/SHA1HMAC (mode 0).
> > rec=[22] Log Message=you are using ipc api v2
> > rec=[23] Log Message=The network interface [10.0.2.11] is now up.
> > rec=[24] Log Message=Created or loaded sequence id 0.10.0.2.11 for this
> > ring.
> > rec=[25] Log Message=debug: pcmk_user_lookup: Cluster user root has
> > uid=0 gid=0
> > rec=[26] Log Message=info: process_ais_conf: Reading configure
> > rec=[27] Log Message=info: config_find_init: Local handle:
> > 4552499517957603332 for logging
> > rec=[28] Log Message=info: config_find_next: Processing additional
> > logging options...
> > rec=[29] Log Message=info: get_config_opt: Found 'on' for option: debug
> > rec=[30] Log Message=info: get_config_opt: Found 'yes' for option:
> > to_logfile
> > rec=[31] Log Message=info: get_config_opt: Found
> > '/var/log/cluster/corosync.log' for option: logfile
> > rec=[32] Log Message=info: get_config_opt: Found 'no' for option: to_syslog
> > rec=[33] Log Message=info: process_ais_conf: User configured file based
> > logging and explicitly disabled syslog.
> > rec=[34] Log Message=info: config_find_init: Local handle:
> > 8972265949260414981 for service
> > rec=[35] Log Message=info: config_find_next: Processing additional
> > service options...
> > rec=[36] Log Message=info: get_config_opt: Defaulting to 'pcmk' for
> > option: clustername
> > rec=[37] Log Message=info: get_config_opt: Found 'no' for option: use_logd
> > rec=[38] Log Message=info: get_config_opt: Found 'yes' for option: use_mgmtd
> > rec=[39] Log Message=info: pcmk_startup: CRM: Initialized
> > rec=[40] Log Message=Logging: Initialized pcmk_startup
> > rec=[41] Log Message=info: pcmk_startup: Maximum core file size is:
> > 18446744073709551615
> > rec=[42] Log Message=debug: pcmk_user_lookup: Cluster user hacluster has
> > uid=101 gid=102
> > rec=[43] Log Message=info: pcmk_startup: Service: 9
> > rec=[44] Log Message=info: pcmk_startup: Local hostname: cluster1
> > rec=[45] Log Message=info: pcmk_update_nodeid: Local node id: 184680458
> > rec=[46] Log Message=info: update_member: Creating entry for node
> > 184680458 born on 0
> > rec=[47] Log Message=info: update_member: 0x5fe73e0 Node 184680458 now
> > known as cluster1 (was: (null))
> > rec=[48] Log Message=info: update_member: Node cluster1 now has 1 quorum
> > votes (was 0)
> > rec=[49] Log Message=info: update_member: Node 184680458/cluster1 is
> > now: member
> > rec=[50] Log Message=info: spawn_child: Forked child 16261 for process
> > stonithd
> > rec=[51] Log Message=debug: pcmk_user_lookup: Cluster user hacluster has
> > uid=101 gid=102
> > rec=[52] Log Message=info: spawn_child: Forked child 16262 for process cib
> > rec=[53] Log Message=info: spawn_child: Forked child 16263 for process lrmd
> > rec=[54] Log Message=debug: pcmk_user_lookup: Cluster user hacluster has
> > uid=101 gid=102
> > rec=[55] Log Message=info: spawn_child: Forked child 16264 for process attrd
> > rec=[56] Log Message=debug: pcmk_user_lookup: Cluster user hacluster has
> > uid=101 gid=102
> > rec=[57] Log Message=info: spawn_child: Forked child 16265 for process
> > pengine
> > rec=[58] Log Message=debug: pcmk_user_lookup: Cluster user hacluster has
> > uid=101 gid=102
> > rec=[59] Log Message=info: spawn_child: Forked child 16266 for process crmd
> > rec=[60] Log Message=spawn_child: FATAL: Cannot exec
> > /usr/lib64/heartbeat/mgmtd: (2) No such file or directory
> > *** buffer overflow detected ***: corosync-fplay terminated
> > ======= Backtrace: =========
> > /lib64/libc.so.6(__chk_fail+0x2f)[0x37a72e6c2f]
> > corosync-fplay[0x400c0b]
> > /lib64/libc.so.6(__libc_start_main+0xf4)[0x37a721d974]
> > corosync-fplay[0x4008d9]
> > ======= Memory map: ========
> > 00400000-00402000 r-xp 00000000 08:05 1041788  /usr/sbin/corosync-fplay
> > 00602000-00603000 rw-p 00002000 08:05 1041788  /usr/sbin/corosync-fplay
> > 00603000-0060d000 rw-p 00603000 00:00 0
> > 0af45000-0af66000 rw-p 0af45000 00:00 0        [heap]
> > 3190400000-319040d000 r-xp 00000000 08:02 911558  /lib64/libgcc_s-4.1.2-20080825.so.1
> > 319040d000-319060d000 ---p 0000d000 08:02 911558  /lib64/libgcc_s-4.1.2-20080825.so.1
> > 319060d000-319060e000 rw-p 0000d000 08:02 911558  /lib64/libgcc_s-4.1.2-20080825.so.1
> > 37a6e00000-37a6e1c000 r-xp 00000000 08:02 911525  /lib64/ld-2.5.so
> > 37a701b000-37a701c000 r--p 0001b000 08:02 911525  /lib64/ld-2.5.so
> > 37a701c000-37a701d000 rw-p 0001c000 08:02 911525  /lib64/ld-2.5.so
> > 37a7200000-37a734c000 r-xp 00000000 08:02 911526  /lib64/libc-2.5.so
> > 37a734c000-37a754c000 ---p 0014c000 08:02 911526  /lib64/libc-2.5.so
> > 37a754c000-37a7550000 r--p 0014c000 08:02 911526  /lib64/libc-2.5.so
> > 37a7550000-37a7551000 rw-p 00150000 08:02 911526  /lib64/libc-2.5.so
> > 37a7551000-37a7556000 rw-p 37a7551000 00:00 0
> > 37a7600000-37a7602000 r-xp 00000000 08:02 911527  /lib64/libdl-2.5.so
> > 37a7602000-37a7802000 ---p 00002000 08:02 911527  /lib64/libdl-2.5.so
> > 37a7802000-37a7803000 r--p 00002000 08:02 911527  /lib64/libdl-2.5.so
> > 37a7803000-37a7804000 rw-p 00003000 08:02 911527  /lib64/libdl-2.5.so
> > 37a7e00000-37a7e16000 r-xp 00000000 08:02 911531  /lib64/libpthread-2.5.so
> > 37a7e16000-37a8015000 ---p 00016000 08:02 911531  /lib64/libpthread-2.5.so
> > 37a8015000-37a8016000 r--p 00015000 08:02 911531  /lib64/libpthread-2.5.so
> > 37a8016000-37a8017000 rw-p 00016000 08:02 911531  /lib64/libpthread-2.5.so
> > 37a8017000-37a801b000 rw-p 37a8017000 00:00 0
> > 37a8e00000-37a8e07000 r-xp 00000000 08:02 911532  /lib64/librt-2.5.so
> > 37a8e07000-37a9007000 ---p 00007000 08:02 911532  /lib64/librt-2.5.so
> > 37a9007000-37a9008000 r--p 00007000 08:02 911532  /lib64/librt-2.5.so
> > 37a9008000-37a9009000 rw-p 00008000 08:02 911532  /lib64/librt-2.5.so
> > 2b3bae55d000-2b3bae55e000 rw-p 2b3bae55d000 00:00 0
> > 2b3bae56a000-2b3bae943000 rw-p 2b3bae56a000 00:00 0
> > 7ffffc537000-7ffffc54c000 rw-p 7ffffffea000 00:00 0  [stack]
> > ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0  [vdso]
> > /usr/bin/corosync-blackbox: line 34: 16676 Aborted  corosync-fplay
> >
> > I see the error message with mgmtd; however, I've performed the same
> > test on a pair of XEN VM's with the exact same packages (clean install,
> > no upgrade from openais-0.80 like the real hardware) and mgmtd doesn't
> > exist there either, but it says
> >
> > rec=[55] Log Message=info: spawn_child: Forked child 4459 for process mgmtd
> >
> > # ll /usr/lib64/heartbeat/mgmtd
> > ls: /usr/lib64/heartbeat/mgmtd: No such file or directory
> >
> > # rpm -ql pacemaker-1.0.10-1.4 | grep mgm
> > /usr/lib64/python2.4/site-packages/crm/idmgmt.py
> > /usr/lib64/python2.4/site-packages/crm/idmgmt.pyc
> > /usr/lib64/python2.4/site-packages/crm/idmgmt.pyo
> >
> > Anyone?
> >
> > Regards,
> > Dan
> >
> > On Wed, Jan 26, 2011 at 1:35 PM, Dan Frincu <[email protected]> wrote:
> >
> > Hi,
> >
> > I've got a pair of servers running on RHEL5 x86_64 with openais-0.80
> > (older install) which I want to upgrade to corosync-1.3.0 +
> > pacemaker-1.0.10. Downtime is not an issue and corosync 1.3.0 is
> > needed for UDPU, so I built it from the corosync.org website.
> > With pacemaker, we won't be using the heartbeat stack, so I built
> > the pacemaker package from the clusterlabs.org src.rpm without
> > heartbeat support. To be more precise, I used
> >
> > rpmbuild --without heartbeat --with ais --with snmp --with esmtp -ba
> > pacemaker-epel.spec
> >
> > Now I've tested the rpm list below on a pair of XEN VM's, and it works
> > just fine.
> >
> > cluster-glue-1.0.6-1.6.el5.x86_64.rpm
> > cluster-glue-libs-1.0.6-1.6.el5.x86_64.rpm
> > corosync-1.3.0-1.x86_64.rpm
> > corosynclib-1.3.0-1.x86_64.rpm
> > libesmtp-1.0.4-5.el5.x86_64.rpm
> > libibverbs-1.1.2-1.el5.x86_64.rpm
> > librdmacm-1.0.8-1.el5.x86_64.rpm
> > libtool-ltdl-1.5.22-6.1.x86_64.rpm
> > openais-1.1.4-2.x86_64.rpm
> > openaislib-1.1.4-2.x86_64.rpm
> > openhpi-2.10.2-1.el5.x86_64.rpm
> > openib-1.3.2-0.20080728.0355.3.el5.noarch.rpm
> > pacemaker-1.0.10-1.4.x86_64.rpm
> > pacemaker-libs-1.0.10-1.4.x86_64.rpm
> > perl-TimeDate-1.16-5.el5.noarch.rpm
> > resource-agents-1.0.3-2.6.el5.x86_64.rpm
> >
> > However, when performing the upgrade on the servers running
> > openais-0.80, first I removed the heartbeat, heartbeat-libs and
> > PyXML rpms (conflicting dependencies issue), then ran rpm -Uvh on
> > the rpm list above. Installation went fine; I removed the existing
> > cib.xml and signatures for a fresh start. Then I configured corosync,
> > then started it on both servers, and nothing. At first I got an error
> > related to pacemaker mgmt, which was an old package installed with the
> > old rpms. Removed it, tried again. Nothing. Removed all cluster-related
> > rpms, old and new, plus deps, except for DRBD, then installed the list
> > above, then again, nothing. What nothing means:
> > - corosync starts, never elects a DC, never sees the other node, or
> > itself for that matter.
> > - stopping corosync via the init script goes into an endless phase
> > where it just prints dots to the screen; I have to kill the process
> > to make it stop.
> >
> > Troubleshooting done so far:
> > - tested network sockets (nc from side to side) and firewall rules
> > (iptables down); communication is ok
> > - searched for the original RPM's list, removed all remaining RPMs,
> > ran ldconfig, removed new RPM's, installed new RPM's
> >
> > My guess is that there are some leftovers from the old openais-0.80
> > installation which mess with the current installation, seeing as how
> > the same set of RPMs on a pair of XEN VM's with the same OS works
> > fine; however, I cannot put my finger on the culprit for the real
> > servers' issue.
> >
> > Logs: http://pastebin.com/i0maZM4p
> >
> > Removed everything after removing the RPM's, just to be extra
> > paranoid about leftovers (rpm -qpl *.rpm >> file && for i in `cat
> > file`; do [[ -e "$i" ]] && echo "$i" >> newfile ; done && for i in
> > `cat newfile` ; do rm -rf $i ; done)
> >
> > Installed RPMs (without openais)
> >
> > Same output
> >
> > http://pastebin.com/3iPHSXua
> >
> > It seems to go into some sort of loop.
> >
> > Jan 26 12:13:41 cluster1 crmd: [15612]: ERROR: crm_timer_popped:
> > Integration Timer (I_INTEGRATED) just popped!
> > Jan 26 12:13:41 cluster1 crmd: [15612]: info: crm_timer_popped:
> > Welcomed: 1, Integrated: 0
> > Jan 26 12:13:41 cluster1 crmd: [15612]: info: do_state_transition:
> > State transition S_INTEGRATION -> S_FINALIZE_JOIN [
> > input=I_INTEGRATED cause=C_TIMER_POPPED origin=crm_timer_popped ]
> > Jan 26 12:13:41 cluster1 crmd: [15612]: WARN: do_state_transition:
> > Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
> > Jan 26 12:13:41 cluster1 crmd: [15612]: WARN: do_state_transition: 1
> > cluster nodes failed to respond to the join offer.
> > Jan 26 12:13:41 cluster1 crmd: [15612]: info: ghash_print_node:
> > Welcome reply not received from: cluster1 7
> > Jan 26 12:13:41 cluster1 crmd: [15612]: WARN: do_log: FSA: Input
> > I_ELECTION_DC from do_dc_join_finalize() received in state
> > S_FINALIZE_JOIN
> > Jan 26 12:13:41 cluster1 crmd: [15612]: info: do_state_transition:
> > State transition S_FINALIZE_JOIN -> S_INTEGRATION [
> > input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_dc_join_finalize ]
> > Jan 26 12:13:41 cluster1 crmd: [15612]: info: do_dc_join_offer_all:
> > join-8: Waiting on 1 outstanding join acks
> > Jan 26 12:16:41 cluster1 crmd: [15612]: ERROR: crm_timer_popped:
> > Integration Timer (I_INTEGRATED) just popped!
> > Jan 26 12:16:41 cluster1 crmd: [15612]: info: crm_timer_popped:
> > Welcomed: 1, Integrated: 0
> > Jan 26 12:16:41 cluster1 crmd: [15612]: info: do_state_transition:
> > State transition S_INTEGRATION -> S_FINALIZE_JOIN [
> > input=I_INTEGRATED cause=C_TIMER_POPPED origin=crm_timer_popped ]
> > Jan 26 12:16:41 cluster1 crmd: [15612]: WARN: do_state_transition:
> > Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
> > Jan 26 12:16:41 cluster1 crmd: [15612]: WARN: do_state_transition: 1
> > cluster nodes failed to respond to the join offer.
> > Jan 26 12:16:41 cluster1 crmd: [15612]: info: ghash_print_node:
> > Welcome reply not received from: cluster1 8
> > Jan 26 12:16:41 cluster1 crmd: [15612]: WARN: do_log: FSA: Input
> > I_ELECTION_DC from do_dc_join_finalize() received in state
> > S_FINALIZE_JOIN
> > Jan 26 12:16:41 cluster1 crmd: [15612]: info: do_state_transition:
> > State transition S_FINALIZE_JOIN -> S_INTEGRATION [
> > input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_dc_join_finalize ]
> > Jan 26 12:16:41 cluster1 crmd: [15612]: info: do_dc_join_offer_all:
> > join-9: Waiting on 1 outstanding join acks
> > Jan 26 12:19:41 cluster1 crmd: [15612]: ERROR: crm_timer_popped:
> > Integration Timer (I_INTEGRATED) just popped!
> > Jan 26 12:19:41 cluster1 crmd: [15612]: info: crm_timer_popped:
> > Welcomed: 1, Integrated: 0
> > Jan 26 12:19:41 cluster1 crmd: [15612]: info: do_state_transition:
> > State transition S_INTEGRATION -> S_FINALIZE_JOIN [
> > input=I_INTEGRATED cause=C_TIMER_POPPED origin=crm_timer_popped ]
> > Jan 26 12:19:41 cluster1 crmd: [15612]: WARN: do_state_transition:
> > Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
> > Jan 26 12:19:41 cluster1 crmd: [15612]: WARN: do_state_transition: 1
> > cluster nodes failed to respond to the join offer.
> > Jan 26 12:19:41 cluster1 crmd: [15612]: info: ghash_print_node:
> > Welcome reply not received from: cluster1 9
> > Jan 26 12:19:41 cluster1 crmd: [15612]: WARN: do_log: FSA: Input
> > I_ELECTION_DC from do_dc_join_finalize() received in state
> > S_FINALIZE_JOIN
> > Jan 26 12:19:41 cluster1 crmd: [15612]: info: do_state_transition:
> > State transition S_FINALIZE_JOIN -> S_INTEGRATION [
> > input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_dc_join_finalize ]
> > Jan 26 12:19:41 cluster1 crmd: [15612]: info: do_dc_join_offer_all:
> > join-10: Waiting on 1 outstanding join acks
> > Jan 26 12:20:11 cluster1 cib: [15608]: info: cib_stats: Processed 1
> > operations (0.00us average, 0% utilization) in the last 10min
> >
> > Any suggestions?
> >
> > TIA.
> >
> > Regards,
> > Dan
> >
> > --
> > Dan Frîncu
> > CCNA, RHCE
> >
> > --
> > Dan Frîncu
> > CCNA, RHCE
>
> --
> Dan Frîncu
> CCNA, RHCE
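
To illustrate the port point: a rough, untested sketch of a totem
section for udpu with rrp. The addresses below are made up for the
example -- substitute your own subnets and member lists; what matters
is that ring 0 and ring 1 each get their own mcastport:

    totem {
            version: 2
            transport: udpu
            rrp_mode: active
            interface {
                    ringnumber: 0
                    # example ring 0 network -- replace with yours
                    bindnetaddr: 10.0.2.0
                    mcastport: 5405
                    member {
                            memberaddr: 10.0.2.11
                    }
                    member {
                            memberaddr: 10.0.2.12
                    }
            }
            interface {
                    ringnumber: 1
                    # example ring 1 network -- replace with yours
                    bindnetaddr: 10.0.3.0
                    mcastport: 5407
                    member {
                            memberaddr: 10.0.3.11
                    }
                    member {
                            memberaddr: 10.0.3.12
                    }
            }
    }

If I remember right, totem actually uses two UDP ports per ring
(mcastport for receives and mcastport - 1 for sends), so I would keep
the two values at least two apart rather than adjacent.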
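
Also, regarding the nc tests from your first mail: udpu is unicast UDP,
so make sure the test was over UDP rather than TCP. Something along
these lines, with hypothetical hosts and ports matching the sketch
above (some nc builds want "nc -u -l -p 5405" instead):

    # On cluster2, listen on the prospective ring 0 port over UDP:
    nc -u -l 5405

    # On cluster1, send a test datagram across:
    echo ping | nc -u 10.0.2.12 5405

    # Or simply watch for totem traffic on each node (interface name
    # is an example):
    tcpdump -n -i eth0 udp port 5405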
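
One more thing, unrelated to the ring problem: rec=[38] in your fplay
output shows use_mgmtd set to 'yes', but you built pacemaker without
heartbeat support, so /usr/lib64/heartbeat/mgmtd does not exist on that
box. Assuming your service stanza looks roughly like the sketch below,
turning use_mgmtd off should silence that FATAL spawn_child message:

    service {
            name: pacemaker
            ver: 0
            # mgmtd is not shipped in a --without heartbeat build
            use_mgmtd: no
    }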
