On 08/04/2013, at 9:44 PM, Jimmy Magee <[email protected]> wrote:
> Hi Andrew,
>
> thanks for your reply, we are running at debug level with the following
> config from corosync.conf
>
> logging {
> fileline: off
> to_syslog: yes
> to_stderr: no
> syslog_facility: daemon
> debug: on
> timestamp: on
> }
>
> Looking at the issue further, there seem to be two instances of some Pacemaker
> daemons running on this particular node…
>
>
> ps aux | grep pace
>
> 495 3050 0.2 0.0 89956 7184 ? S 07:10 0:01
> /usr/libexec/pacemaker/cib
> root 3051 0.0 0.0 87128 3152 ? S 07:10 0:00
> /usr/libexec/pacemaker/stonithd
> 495 3053 0.0 0.0 91188 2840 ? S 07:10 0:00
> /usr/libexec/pacemaker/attrd
> 495 3054 0.0 0.0 87336 2484 ? S 07:10 0:00
> /usr/libexec/pacemaker/pengine
> 495 3055 0.0 0.0 91332 3156 ? S 07:10 0:00
> /usr/libexec/pacemaker/crmd
> 495 3057 0.0 0.0 88876 5224 ? S 07:10 0:00
> /usr/libexec/pacemaker/cib
> root 3058 0.0 0.0 87128 3132 ? S 07:10 0:00
> /usr/libexec/pacemaker/stonithd
> 495 3060 0.0 0.0 91188 2788 ? S 07:10 0:00
> /usr/libexec/pacemaker/attrd
> 495 3062 0.0 0.0 91436 3932 ? S 07:10 0:00
> /usr/libexec/pacemaker/crmd
>
>
> ps aux | grep corosync
> root 3044 0.1 0.0 977852 9264 ? Ssl 07:10 0:01 corosync
> root 9363 0.0 0.0 103248 856 pts/0 S+ 07:33 0:00 grep corosync
>
>
> ps aux | grep lrmd
> root 3052 0.0 0.0 76464 2528 ? S 07:10 0:00
> /usr/lib64/heartbeat/lrmd
>
>
> Not sure why this is the case. We'd appreciate any help.
>
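One quick way to confirm the duplication (a hedged sketch; the daemon names below are the classic plugin-era set and may differ on your build):

```shell
# List any Pacemaker daemons with more than one running copy.
# Daemon names assume the classic plugin-era layout (cib, stonithd,
# attrd, pengine, crmd); adjust to match your installation.
ps -C cib,stonithd,attrd,pengine,crmd -o comm= | sort | uniq -c \
  | awk '$1 > 1 {print $1, $2}'
```

Any line printed here means that daemon has been started more than once, which matches the double set of PIDs in the listing above.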
Have you perhaps specified "ver: 0" for the pacemaker plugin and also run
"service pacemaker start"?
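For reference, the plugin stanza usually looks something like this (a sketch of the commonly documented format; the exact file and path vary by distribution):

```
service {
    # Load the Pacemaker plugin into corosync
    name: pacemaker
    # ver: 0 - corosync spawns the Pacemaker daemons itself
    # ver: 1 - corosync only loads the plugin; Pacemaker is started
    #          separately via "service pacemaker start"
    ver: 1
}
```

With "ver: 0", running "service pacemaker start" as well would launch a second set of daemons, which would explain the duplicate processes in your ps output.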
> Cheers,
> Jimmy.
>
>
>
>
>
> On 8 Apr 2013, at 03:00, Andrew Beekhof <[email protected]> wrote:
>
>> This doesn't look promising:
>>
>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>> signal 15
>> lrmd: [4946]: info: Signal sent to pid=4939, waiting for process to exit
>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>> signal 17
>> lrmd: [4939]: info: enabling coredumps
>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>> signal 10
>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>> signal 12
>> lrmd: [4939]: info: Started.
>> lrmd: [4939]: info: lrmd is shutting down
>>
>> The lrmd comes up but then immediately shuts down.
>> Perhaps try enabling debug to see if that sheds any light.
>>
>> On 06/04/2013, at 4:58 AM, Jimmy Magee <[email protected]> wrote:
>>
>>> Hi guys,
>>>
>>> Apologies for reposting this query; it inadvertently got added to an
>>> existing topic!
>>>
>>>
>>> We have a three node cluster deployed in a customer's network:
>>> - 2 nodes are on the same switch
>>> - 3rd node on the same subnet but there's a router in between.
>>> - IP multicast is enabled and has been tested using omping as follows:
>>>
>>> On each node we ran:
>>>
>>> omping node01 node02 node03
>>>
>>>
>>> On node 3
>>>
>>> Node01 : unicast, xmt/rcv/%loss = 23/23/0%, min/avg/max/std-dev =
>>> 0.128/0.181/0.255/0.025
>>> Node01 : multicast, xmt/rcv/%loss = 23/23/0%, min/avg/max/std-dev =
>>> 0.140/0.187/0.219/0.021
>>> Node02 : unicast, xmt/rcv/%loss = 8/8/0%, min/avg/max/std-dev =
>>> 0.115/0.150/0.168/0.021
>>> Node02 : multicast, xmt/rcv/%loss = 8/8/0%, min/avg/max/std-dev =
>>> 0.134/0.162/0.177/0.014
>>>
>>>
>>> On node 2
>>>
>>>
>>> Node01 : unicast, xmt/rcv/%loss = 9/9/0%, min/avg/max/std-dev =
>>> 0.168/0.191/0.205/0.014
>>> Node01 : multicast, xmt/rcv/%loss = 9/8/11% (seq>=2 0%),
>>> min/avg/max/std-dev = 0.138/0.179/0.206/0.028
>>> Node03 : unicast, xmt/rcv/%loss = 9/9/0%, min/avg/max/std-dev =
>>> 0.112/0.149/0.175/0.022
>>> Node03 : multicast, xmt/rcv/%loss = 9/8/11% (seq>=2 0%),
>>> min/avg/max/std-dev = 0.124/0.167/0.178/0.018
>>>
>>>
>>>
>>> On node 1
>>>
>>> Node02 : unicast, xmt/rcv/%loss = 8/8/0%, min/avg/max/std-dev =
>>> 0.154/0.185/0.208/0.019
>>> Node02 : multicast, xmt/rcv/%loss = 8/8/0%, min/avg/max/std-dev =
>>> 0.175/0.198/0.214/0.015
>>> Node03 : unicast, xmt/rcv/%loss = 23/23/0%, min/avg/max/std-dev =
>>> 0.114/0.160/0.185/0.019
>>> Node03 : multicast, xmt/rcv/%loss = 23/22/4% (seq>=2 0%),
>>> min/avg/max/std-dev = 0.124/0.172/0.197/0.019
>>>
>>>
>>> - The problem is intermittent but frequent; occasionally the cluster starts
>>> fine from scratch.
>>>
>>> We suspect the problem is related to node 3, as we can see lrmd failures as
>>> per the attached log. We've checked that permissions are OK as per
>>> https://bugs.launchpad.net/ubuntu/+source/cluster-glue/+bug/676391
>>>
>>>
>>>
>>> stonith-ng[1437]: error: ais_dispatch: AIS connection failed
>>> stonith-ng[1437]: error: stonith_peer_ais_destroy: AIS connection
>>> terminated
>>> corosync[1430]: [SERV ] Service engine unloaded: Pacemaker Cluster
>>> Manager 1.1.6
>>> corosync[1430]: [SERV ] Service engine unloaded: corosync extended
>>> virtual synchrony service
>>> corosync[1430]: [SERV ] Service engine unloaded: corosync configuration
>>> service
>>> corosync[1430]: [SERV ] Service engine unloaded: corosync cluster closed
>>> process group service v1.01
>>> corosync[1430]: [SERV ] Service engine unloaded: corosync cluster config
>>> database access v1.01
>>> corosync[1430]: [SERV ] Service engine unloaded: corosync profile
>>> loading service
>>> corosync[1430]: [SERV ] Service engine unloaded: corosync cluster quorum
>>> service v0.1
>>> corosync[1430]: [MAIN ] Corosync Cluster Engine exiting with status 0 at
>>> main.c:1894.
>>>
>>> corosync[4931]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
>>> corosync[4931]: [MAIN ] Successfully read main configuration file
>>> '/etc/corosync/corosync.conf'.
>>> corosync[4931]: [TOTEM ] Initializing transport (UDP/IP Multicast).
>>> corosync[4931]: [TOTEM ] Initializing transmit/receive security:
>>> libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>> corosync[4931]: [TOTEM ] The network interface [10.87.79.59] is now up.
>>> corosync[4931]: [pcmk ] Logging: Initialized pcmk_startup
>>> corosync[4931]: [SERV ] Service engine loaded: Pacemaker Cluster Manager
>>> 1.1.6
>>> corosync[4931]: [pcmk ] Logging: Initialized pcmk_startup
>>> corosync[4931]: [SERV ] Service engine loaded: Pacemaker Cluster Manager
>>> 1.1.6
>>> corosync[4931]: [SERV ] Service engine loaded: corosync extended virtual
>>> synchrony service
>>> corosync[4931]: [SERV ] Service engine loaded: corosync configuration
>>> service
>>> corosync[4931]: [SERV ] Service engine loaded: corosync cluster closed
>>> process group service v1.01
>>> corosync[4931]: [SERV ] Service engine loaded: corosync cluster config
>>> database access v1.01
>>> corosync[4931]: [SERV ] Service engine loaded: corosync profile loading
>>> service
>>> corosync[4931]: [SERV ] Service engine loaded: corosync cluster quorum
>>> service v0.1
>>> corosync[4931]: [MAIN ] Compatibility mode set to whitetank. Using V1
>>> and V2 of the synchronization engine.
>>> corosync[4931]: [TOTEM ] A processor joined or left the membership and a
>>> new membership was formed.
>>> corosync[4931]: [CPG ] chosen downlist: sender r(0) ip(10.87.79.59) ;
>>> members(old:0 left:0)
>>> corosync[4931]: [MAIN ] Completed service synchronization, ready to
>>> provide service.
>>> cib[4937]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> cib[4937]: info: retrieveCib: Reading cluster configuration from:
>>> /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
>>> cib[4937]: info: validate_with_relaxng: Creating RNG parser context
>>> stonith-ng[4945]: info: crm_log_init_worker: Changed active directory
>>> to /var/lib/heartbeat/cores/root
>>> stonith-ng[4945]: info: get_cluster_type: Cluster type is: 'openais'
>>> stonith-ng[4945]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> stonith-ng[4945]: info: init_ais_connection_classic: Creating
>>> connection to our Corosync plugin
>>> cib[4944]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> cib[4944]: info: retrieveCib: Reading cluster configuration from:
>>> /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
>>> stonith-ng[4945]: info: init_ais_connection_classic: AIS connection
>>> established
>>> stonith-ng[4945]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=w0110Danmtapp03 cname=pcmk
>>> stonith-ng[4945]: info: init_ais_connection_once: Connection to
>>> 'classic openais (with plugin)': established
>>> stonith-ng[4945]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> stonith-ng[4945]: info: crm_new_peer: Node 1003428268 is now known as
>>> node03
>>> cib[4944]: info: validate_with_relaxng: Creating RNG parser context
>>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>>> signal 15
>>> lrmd: [4946]: info: Signal sent to pid=4939, waiting for process to exit
>>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>>> signal 17
>>> lrmd: [4939]: info: enabling coredumps
>>> stonith-ng[4938]: info: crm_log_init_worker: Changed active directory
>>> to /var/lib/heartbeat/cores/root
>>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>>> signal 10
>>> lrmd: [4939]: info: G_main_add_SignalHandler: Added signal handler for
>>> signal 12
>>> lrmd: [4939]: info: Started.
>>> stonith-ng[4938]: info: get_cluster_type: Cluster type is: 'openais'
>>> lrmd: [4939]: info: lrmd is shutting down
>>> stonith-ng[4938]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> stonith-ng[4938]: info: init_ais_connection_classic: Creating
>>> connection to our Corosync plugin
>>> attrd[4940]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> pengine[4941]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> attrd[4940]: info: main: Starting up
>>> attrd[4940]: info: get_cluster_type: Cluster type is: 'openais'
>>> attrd[4940]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> attrd[4940]: info: init_ais_connection_classic: Creating connection to
>>> our Corosync plugin
>>> crmd[4942]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> pengine[4941]: info: main: Starting pengine
>>> crmd[4942]: notice: main: CRM Hg Version:
>>> 148fccfd5985c5590cc601123c6c16e966b85d14
>>> pengine[4948]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> pengine[4948]: warning: main: Terminating previous PE instance
>>> attrd[4947]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> pengine[4941]: warning: process_pe_message: Received quit message,
>>> terminating
>>> attrd[4947]: info: main: Starting up
>>> attrd[4947]: info: get_cluster_type: Cluster type is: 'openais'
>>> attrd[4947]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> attrd[4947]: info: init_ais_connection_classic: Creating connection to
>>> our Corosync plugin
>>> crmd[4949]: info: crm_log_init_worker: Changed active directory to
>>> /var/lib/heartbeat/cores/hacluster
>>> crmd[4949]: notice: main: CRM Hg Version:
>>> 148fccfd5985c5590cc601123c6c16e966b85d14
>>> stonith-ng[4938]: info: init_ais_connection_classic: AIS connection
>>> established
>>> stonith-ng[4938]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> stonith-ng[4938]: info: init_ais_connection_once: Connection to
>>> 'classic openais (with plugin)': established
>>> stonith-ng[4938]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> stonith-ng[4938]: info: crm_new_peer: Node 1003428268 is now known as
>>> node03
>>> attrd[4940]: info: init_ais_connection_classic: AIS connection
>>> established
>>> attrd[4940]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> attrd[4940]: info: init_ais_connection_once: Connection to 'classic
>>> openais (with plugin)': established
>>> attrd[4940]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> attrd[4940]: info: crm_new_peer: Node 1003428268 is now known as node03
>>> attrd[4940]: info: main: Cluster connection active
>>> attrd[4940]: info: main: Accepting attribute updates
>>> attrd[4940]: notice: main: Starting mainloop...
>>> attrd[4947]: info: init_ais_connection_classic: AIS connection
>>> established
>>> attrd[4947]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> attrd[4947]: info: init_ais_connection_once: Connection to 'classic
>>> openais (with plugin)': established
>>> attrd[4947]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> attrd[4947]: info: crm_new_peer: Node 1003428268 is now known as node03
>>> attrd[4947]: info: main: Cluster connection active
>>> attrd[4947]: info: main: Accepting attribute updates
>>> attrd[4947]: notice: main: Starting mainloop...
>>> cib[4937]: info: startCib: CIB Initialization completed successfully
>>> cib[4937]: info: get_cluster_type: Cluster type is: 'openais'
>>> cib[4937]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> cib[4937]: info: init_ais_connection_classic: Creating connection to
>>> our Corosync plugin
>>> cib[4944]: info: startCib: CIB Initialization completed successfully
>>> cib[4944]: info: get_cluster_type: Cluster type is: 'openais'
>>> cib[4944]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> cib[4944]: info: init_ais_connection_classic: Creating connection to
>>> our Corosync plugin
>>> cib[4937]: info: init_ais_connection_classic: AIS connection established
>>> cib[4937]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> cib[4937]: info: init_ais_connection_once: Connection to 'classic
>>> openais (with plugin)': established
>>> cib[4937]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> cib[4937]: info: crm_new_peer: Node 1003428268 is now known as node03
>>> cib[4937]: info: cib_init: Starting cib mainloop
>>> cib[4937]: info: ais_dispatch_message: Membership 6892: quorum still
>>> lost
>>> cib[4937]: info: crm_update_peer: Node node03: id=1003428268
>>> state=member (new) addr=r(0) ip(10.87.79.59) (new) votes=1 (new) born=0
>>> seen=6892 proc=00000000000000000000000000111312 (new)
>>> cib[4944]: info: init_ais_connection_classic: AIS connection established
>>> cib[4944]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> cib[4944]: info: init_ais_connection_once: Connection to 'classic
>>> openais (with plugin)': established
>>> cib[4944]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> cib[4944]: info: crm_new_peer: Node 1003428268 is now known as node03
>>> cib[4944]: info: cib_init: Starting cib mainloop
>>> stonith-ng[4945]: notice: setup_cib: Watching for stonith topology changes
>>> stonith-ng[4945]: info: main: Starting stonith-ng mainloop
>>> cib[4937]: info: ais_dispatch_message: Membership 6896: quorum still
>>> lost
>>> corosync[4931]: [TOTEM ] A processor joined or left the membership and a
>>> new membership was formed.
>>> cib[4937]: info: crm_new_peer: Node <null> now has id: 969873836
>>> cib[4937]: info: crm_update_peer: Node (null): id=969873836
>>> state=member (new) addr=r(0) ip(172.25.207.57) votes=0 born=0 seen=6896
>>> proc=00000000000000000000000000000000
>>> cib[4937]: info: crm_new_peer: Node <null> now has id: 986651052
>>> cib[4937]: info: crm_update_peer: Node (null): id=986651052
>>> state=member (new) addr=r(0) ip(172.25.207.58) votes=0 born=0 seen=6896
>>> proc=00000000000000000000000000000000
>>> cib[4937]: notice: ais_dispatch_message: Membership 6896: quorum acquired
>>> cib[4937]: info: crm_get_peer: Node 986651052 is now known as node02
>>> cib[4937]: info: crm_update_peer: Node node02: id=986651052
>>> state=member addr=r(0) ip(172.25.207.58) votes=1 (new) born=6812 seen=6896
>>> proc=00000000000000000000000000111312 (new)
>>> cib[4937]: info: ais_dispatch_message: Membership 6896: quorum retained
>>> cib[4937]: info: crm_get_peer: Node 969873836 is now known as node01
>>> cib[4937]: info: crm_update_peer: Node node01: id=969873836
>>> state=member addr=r(0) ip(172.25.207.57) votes=1 (new) born=6848 seen=6896
>>> proc=00000000000000000000000000111312 (new)
>>> rsyslogd-2177: imuxsock begins to drop messages from pid 4931 due to
>>> rate-limiting
>>> crmd[4942]: info: do_cib_control: CIB connection established
>>> crmd[4942]: info: get_cluster_type: Cluster type is: 'openais'
>>> crmd[4942]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> crmd[4942]: info: init_ais_connection_classic: Creating connection to
>>> our Corosync plugin
>>> cib[4937]: info: cib_process_diff: Diff 1.249.28 -> 1.249.29 not
>>> applied to 1.249.0: current "num_updates" is less than required
>>> cib[4937]: info: cib_server_process_diff: Requesting re-sync from peer
>>> crmd[4949]: info: do_cib_control: CIB connection established
>>> crmd[4949]: info: get_cluster_type: Cluster type is: 'openais'
>>> crmd[4949]: notice: crm_cluster_connect: Connecting to cluster
>>> infrastructure: classic openais (with plugin)
>>> crmd[4949]: info: init_ais_connection_classic: Creating connection to
>>> our Corosync plugin
>>> stonith-ng[4938]: notice: setup_cib: Watching for stonith topology changes
>>> stonith-ng[4938]: info: main: Starting stonith-ng mainloop
>>> cib[4937]: notice: cib_server_process_diff: Not applying diff 1.249.29 ->
>>> 1.249.30 (sync in progress)
>>> crmd[4942]: info: init_ais_connection_classic: AIS connection
>>> established
>>> crmd[4942]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> crmd[4942]: info: init_ais_connection_once: Connection to 'classic
>>> openais (with plugin)': established
>>> crmd[4942]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> crmd[4942]: info: crm_new_peer: Node 1003428268 is now known as node03
>>> crmd[4942]: info: ais_status_callback: status: node03 is now unknown
>>> crmd[4942]: info: do_ha_control: Connected to the cluster
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 1 (30
>>> max) times
>>> crmd[4949]: info: init_ais_connection_classic: AIS connection
>>> established
>>> crmd[4949]: info: get_ais_nodeid: Server details: id=1003428268
>>> uname=node03 cname=pcmk
>>> crmd[4949]: info: init_ais_connection_once: Connection to 'classic
>>> openais (with plugin)': established
>>> crmd[4942]: notice: ais_dispatch_message: Membership 6896: quorum acquired
>>> crmd[4949]: info: crm_new_peer: Node node03 now has id: 1003428268
>>> crmd[4949]: info: crm_new_peer: Node 1003428268 is now known as node03
>>> crmd[4942]: info: crm_new_peer: Node node01 now has id: 969873836
>>> crmd[4949]: info: ais_status_callback: status: node03 is now unknown
>>> crmd[4942]: info: crm_new_peer: Node 969873836 is now known as node01
>>> crmd[4949]: info: do_ha_control: Connected to the cluster
>>> crmd[4942]: info: ais_status_callback: status: node01 is now unknown
>>> crmd[4942]: info: ais_status_callback: status: node01 is now member
>>> (was unknown)
>>> crmd[4942]: info: crm_update_peer: Node node01: id=969873836
>>> state=member (new) addr=r(0) ip(172.25.207.57) votes=1 born=6848 seen=6896
>>> proc=00000000000000000000000000111312
>>> crmd[4942]: info: crm_new_peer: Node node02 now has id: 986651052
>>> crmd[4942]: info: crm_new_peer: Node 986651052 is now known as node02
>>> crmd[4942]: info: ais_status_callback: status: node02 is now unknown
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 1 (30
>>> max) times
>>> crmd[4942]: info: ais_status_callback: status: node02 is now member
>>> (was unknown)
>>> crmd[4942]: info: crm_update_peer: Node node02: id=986651052
>>> state=member (new) addr=r(0) ip(172.25.207.58) votes=1 born=6812 seen=6896
>>> proc=00000000000000000000000000111312
>>> crmd[4942]: notice: crmd_peer_update: Status update: Client node03/crmd
>>> now has status [online] (DC=<null>)
>>> crmd[4942]: info: ais_status_callback: status: node03 is now member
>>> (was unknown)
>>> crmd[4942]: info: crm_update_peer: Node node03: id=1003428268
>>> state=member (new) addr=r(0) ip(10.87.79.59) (new) votes=1 (new) born=6896
>>> seen=6896 proc=00000000000000000000000000111312 (new)
>>> crmd[4942]: info: ais_dispatch_message: Membership 6896: quorum retained
>>> cib[4937]: notice: cib_server_process_diff: Not applying diff 1.249.30 ->
>>> 1.249.31 (sync in progress)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 2 (30
>>> max) times
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 3 (30
>>> max) times
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 2 (30
>>> max) times
>>> crmd[4949]: notice: ais_dispatch_message: Membership 6896: quorum acquired
>>> rsyslogd-2177: imuxsock begins to drop messages from pid 4937 due to
>>> rate-limiting
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 4 (30
>>> max) times
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 5 (30
>>> max) times
>>> pengine[4948]: info: main: Starting pengine
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 6 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 3 (30
>>> max) times
>>> attrd[4940]: info: cib_connect: Connected to the CIB after 1 signon
>>> attempts
>>> attrd[4940]: info: cib_connect: Sending full refresh
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 7 (30
>>> max) times
>>> attrd[4947]: info: cib_connect: Connected to the CIB after 1 signon
>>> attempts
>>> attrd[4947]: info: cib_connect: Sending full refresh
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 4 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 8 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 5 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 9 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 6 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 10 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 7 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 11 (30
>>> max) times
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 8 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 12 (30
>>> max) times
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 9 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 13 (30
>>> max) times
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 10 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 14 (30
>>> max) times
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 11 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 12 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 15 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 13 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 16 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 14 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 17 (30
>>> max) times
>>> crmd[4949]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4949]: warning: do_lrm_control: Failed to sign on to the LRM 15 (30
>>> max) times
>>> crmd[4942]: info: crm_timer_popped: Wait Timer (I_NULL) just popped
>>> (2000ms)
>>> crmd[4942]: warning: do_lrm_control: Failed to sign on to the LRM 18 (30
>>> max) times
>>>
>>>
>>> We have the following components installed..
>>>
>>>
>>> corosynclib-1.4.1-15.el6.x86_64
>>> corosync-1.4.1-15.el6.x86_64
>>> cluster-glue-libs-1.0.5-6.el6.x86_64
>>> clusterlib-3.0.12.1-49.el6.x86_64
>>> pacemaker-cluster-libs-1.1.7-6.el6.x86_64
>>> cluster-glue-1.0.5-6.el6.x86_64
>>> resource-agents-3.9.2-12.el6.x86_64
>>>
>>>
>>>
>>> We'd appreciate any guidance on debugging this and on possible causes.
>>>
>>> Cheers,
>>> Jimmy
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems