Hi,

On Wed, Jun 03, 2009 at 05:15:08PM +0300, Eli Dorfman (Voltaire) wrote:
> Resending
> 
> -------- Original Message --------
> Subject: heartbeat: WARN: standby message [me] from fig3 ignored.  Other side is in flux
> Date: Mon, 01 Jun 2009 14:04:24 +0300
> From: Eli Dorfman (Voltaire) <[email protected]>
> To: [email protected]
> 
> Hi
> 
> I have a setup with two servers, fig1 and fig3, running heartbeat and drbd.
> After several failover attempts between the two servers (using
> /usr/lib64/heartbeat/hb_standby), failover seems to fail, probably due to
> the following message:
> "heartbeat: WARN: standby message [me] from fig3 ignored.  Other side is in flux"
> What does it mean?

> May 31 15:40:21 fig3 harc[25831]: [25842]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:21 fig3 hb_standby[25848]: [25855]: Going standby [all].
> May 31 15:40:21 fig3 heartbeat: [15428]: WARN: standby message [me] from fig3 ignored.  Other side is in flux.

At this time fig1 still hasn't finished the transition:

> May 31 15:40:28 fig1 heartbeat: [18119]: info: remote resource transition completed.
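
To check for this kind of timing problem in the logs, something like the following Python sketch might help. It is only an illustration, not a heartbeat tool: the function names are made up, it assumes each syslog entry is on one unwrapped line, it assumes the year 2009 (syslog stamps carry no year), and it treats "remote resource transition completed" on the peer as the sign that the previous transition is over.

```python
import re
from datetime import datetime

# Syslog prefix, e.g. "May 31 15:40:21 fig3 <rest of message>"
STAMP = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")


def parse(line, year=2009):
    """Split one unwrapped syslog line into (time, host, message)."""
    m = STAMP.match(line)
    if m is None:
        return None
    when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return when, m.group(2), m.group(3)


def early_standby_requests(lines):
    """For each ignored standby request, report how much longer the peer
    needed before it logged that its transition had completed."""
    events = [e for e in map(parse, lines) if e is not None]
    report = []
    for when, host, msg in events:
        if "standby message" in msg and "ignored" in msg:
            # Assumption: this message on the *other* host ends the transition.
            finished = [t for t, h, m in events
                        if h != host and t >= when
                        and "remote resource transition completed" in m]
            gap = min(finished) - when if finished else None
            report.append((host, when, gap))
    return report


# The two lines quoted above, unwrapped.
SAMPLE = [
    "May 31 15:40:21 fig3 heartbeat: [15428]: WARN: standby message [me] "
    "from fig3 ignored.  Other side is in flux.",
    "May 31 15:40:28 fig1 heartbeat: [18119]: info: remote resource "
    "transition completed.",
]
```

Run on the two lines above, early_standby_requests(SAMPLE) reports that the request on fig3 came 7 seconds before fig1 logged the end of its transition.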

> Is it a configuration problem?

Probably not.

> Attached the messages log files from both servers.

May 31 15:40:43 fig3 harc[26109]: [26115]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
May 31 15:40:50 fig1 harc[25346]: [25352]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover

So both sides want to go standby, which is why fig3 is refusing to
accept new requests.
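
If you want to spot such overlaps in a merged log from both nodes, a rough Python sketch could look like this. Again this is just an illustration under assumptions: the function names are invented, the 30-second window is an arbitrary choice of mine, the year is assumed to be 2009, and the log entries must be on single unwrapped lines.

```python
import re
from datetime import datetime, timedelta

# Syslog prefix, e.g. "May 31 15:40:43 fig3 <rest of message>"
PREFIX = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")


def takeover_events(lines, year=2009):
    """Return sorted (time, host) pairs for every hb_takeover run."""
    events = []
    for line in lines:
        m = PREFIX.match(line)
        if m and "hb_takeover" in m.group(3):
            when = datetime.strptime(f"{year} {m.group(1)}",
                                     "%Y %b %d %H:%M:%S")
            events.append((when, m.group(2)))
    return sorted(events)


def overlapping_takeovers(lines, window=timedelta(seconds=30)):
    """Find consecutive hb_takeover runs on different hosts that fall
    within `window` of each other (the window is an assumed threshold)."""
    events = takeover_events(lines)
    pairs = []
    for (t1, h1), (t2, h2) in zip(events, events[1:]):
        if h1 != h2 and t2 - t1 <= window:
            pairs.append((h1, t1, h2, t2))
    return pairs


# The two lines quoted above, unwrapped.
TAKEOVER_SAMPLE = [
    "May 31 15:40:43 fig3 harc[26109]: [26115]: info: Running "
    "/etc/ha.d/rc.d/hb_takeover hb_takeover",
    "May 31 15:40:50 fig1 harc[25346]: [25352]: info: Running "
    "/etc/ha.d/rc.d/hb_takeover hb_takeover",
]
```

For the two lines above it returns one pair: fig3 at 15:40:43 and fig1 at 15:40:50.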

Thanks,

Dejan

> Thanks,
> Eli 

> 
> May 31 15:38:44 fig1 heartbeat: [18119]: WARN: 1 lost packet(s) for [fig3] 
> [1624:1626]
> May 31 15:38:44 fig1 heartbeat: [18119]: info: remote resource transition 
> completed.
> May 31 15:38:44 fig1 heartbeat: [18119]: info: No pkts missing from fig3!
> May 31 15:38:44 fig1 heartbeat: [18119]: info: Other node completed standby 
> takeover of all resources.
> May 31 15:38:49 fig1 harc[22856]: [22862]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:38:49 fig1 heartbeat: [18119]: info: fig3 wants to go standby [all]
> May 31 15:38:55 fig1 kernel: drbd0: peer( Primary -> Secondary ) 
> May 31 15:38:55 fig1 heartbeat: [18119]: info: standby: acquire [all] 
> resources from fig3
> May 31 15:38:55 fig1 heartbeat: [22869]: info: acquire all HA resources 
> (standby).
> May 31 15:38:55 fig1 ResourceManager[22882]: [22893]: info: Acquiring 
> resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:38:56 fig1 IPaddr[22905]: [22948]: INFO:  Resource is stopped
> May 31 15:38:56 fig1 ResourceManager[22882]: [22965]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start
> May 31 15:38:56 fig1 IPaddr[22996]: [23030]: INFO: Using calculated netmask 
> for 172.28.6.140: 255.255.255.0
> May 31 15:38:56 fig1 IPaddr[22996]: [23051]: INFO: eval ifconfig eth0:0 
> 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255
> May 31 15:38:56 fig1 avahi-daemon[4509]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:38:56 fig1 avahi-daemon[4509]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:38:56 fig1 avahi-daemon[4509]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:38:56 fig1 IPaddr[22967]: [23070]: INFO:  Success
> May 31 15:38:56 fig1 ResourceManager[22882]: [23099]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb start
> May 31 15:38:56 fig1 kernel: drbd0: role( Secondary -> Primary ) 
> May 31 15:38:56 fig1 kernel: drbd0: Writing meta data super block now.
> May 31 15:38:57 fig1 Filesystem[23116]: [23162]: INFO:  Resource is stopped
> May 31 15:38:57 fig1 ResourceManager[22882]: [23176]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
> May 31 15:38:57 fig1 Filesystem[23189]: [23219]: INFO: Running start for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:38:57 fig1 kernel: kjournald starting.  Commit interval 5 seconds
> May 31 15:38:57 fig1 kernel: EXT3 FS on drbd0, internal journal
> May 31 15:38:57 fig1 kernel: EXT3-fs: mounted filesystem with ordered data 
> mode.
> May 31 15:38:57 fig1 Filesystem[23178]: [23234]: INFO:  Success
> May 31 15:38:57 fig1 ResourceManager[22882]: [23288]: info: Running 
> /etc/init.d/mysqld  start
> May 31 15:38:58 fig1 ResourceManager[22882]: [23411]: info: Running 
> /etc/init.d/ufmd  start
> May 31 15:39:03 fig1 OpenSM[23510]:  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:max_op_vls = 4  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached 
> Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:routing_engine = 
> minhop  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached 
> Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:lfts_file = 
> /opt/ufm/files/conf/opensm/lfts.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:root_guid_file = 
> /opt/ufm/files/conf/opensm/root_guid.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:cn_guid_file = 
> /opt/ufm/files/conf/opensm/cn_guid.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:ids_guid_file = 
> /opt/ufm/files/conf/opensm/ids_guid.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached 
> Option:guid_routing_order_file = 
> /opt/ufm/files/conf/opensm/guid_routing_order.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:sm_priority = 15  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:event_plugin_name 
> = osmufmpi  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:log_file = 
> /opt/ufm/files/log/opensm.log  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:log_max_size = 
> 4096  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:dump_files_dir = 
> /opt/ufm/files/log/  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos = TRUE  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos_policy_file = 
> /opt/ufm/files/conf/opensm/qos-policy.conf  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:03 fig1 OpenSM[23510]:  Loading Cached Option:prefix_routes_file 
> = /opt/ufm/files/conf/opensm/prefix-routes.conf  
> May 31 15:39:03 fig1 OpenSM[23515]: /opt/ufm/files/log/opensm.log log file 
> opened 
> May 31 15:39:03 fig1 OpenSM[23515]: OpenSM 3.3.2_974fc3f_2cbb47c  
> May 31 15:39:03 fig1 OpenSM[23515]: Entering DISCOVERING state  
> May 31 15:39:03 fig1 OpenSM[23515]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:03 fig1 OpenSM[23515]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:03 fig1 OpenSM[23515]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:03 fig1 OpenSM[23515]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:03 fig1 OpenSM[23515]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:04 fig1 opensm[23515]: Entering MASTER state  
> May 31 15:39:04 fig1 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:04 fig1 opensm[23515]: SUBNET UP  
> May 31 15:39:07 fig1 heartbeat: [22869]: info: all HA resource acquisition 
> completed (standby).
> May 31 15:39:07 fig1 heartbeat: [18119]: info: Standby resource acquisition 
> done [all].
> May 31 15:39:07 fig1 ufm_monitor: start monitoring...
> May 31 15:39:07 fig1 heartbeat: [18119]: info: remote resource transition 
> completed.
> May 31 15:39:10 fig1 dhcpd: Wrote 0 leases to leases file.
> May 31 15:39:10 fig1 dhcpd: 
> May 31 15:39:10 fig1 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1).
> May 31 15:39:10 fig1 dhcpd: ** Ignoring requests on ib0.8001.  If this is not 
> what
> May 31 15:39:10 fig1 dhcpd:    you want, please write a subnet declaration
> May 31 15:39:10 fig1 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:39:10 fig1 dhcpd:    to which interface ib0.8001 is attached. **
> May 31 15:39:10 fig1 dhcpd: 
> May 31 15:39:10 fig1 dhcpd: 
> May 31 15:39:10 fig1 dhcpd: No subnet declaration for eth0 (172.28.6.121).
> May 31 15:39:10 fig1 dhcpd: ** Ignoring requests on eth0.  If this is not what
> May 31 15:39:10 fig1 dhcpd:    you want, please write a subnet declaration
> May 31 15:39:10 fig1 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:39:10 fig1 dhcpd:    to which interface eth0 is attached. **
> May 31 15:39:10 fig1 dhcpd: 
> May 31 15:39:10 fig1 dhcpd: Sending on   Socket/fallback/fallback-net
> May 31 15:39:13 fig1 OpenSM[23515]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:13 fig1 OpenSM[23515]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:13 fig1 OpenSM[23515]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:13 fig1 OpenSM[23515]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:13 fig1 OpenSM[23515]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:27 fig1 heartbeat: [18119]: info: fig1 wants to go standby [all]
> May 31 15:39:27 fig1 heartbeat: [18119]: info: standby: fig3 can take our all 
> resources
> May 31 15:39:27 fig1 heartbeat: [23681]: info: give up all HA resources 
> (standby).
> May 31 15:39:27 fig1 ResourceManager[23694]: [23705]: info: Releasing 
> resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:39:28 fig1 ResourceManager[23694]: [23716]: info: Running 
> /etc/init.d/ufmd  stop
> May 31 15:39:28 fig1 opensm[23515]: Exiting SM  
> May 31 15:39:28 fig1 ResourceManager[23694]: [23766]: info: Running 
> /etc/init.d/mysqld  stop
> May 31 15:39:31 fig1 ResourceManager[23694]: [23836]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop
> May 31 15:39:31 fig1 Filesystem[23849]: [23879]: INFO: Running stop for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:39:31 fig1 Filesystem[23849]: [23889]: INFO: Trying to unmount 
> /opt/ufm/files
> May 31 15:39:31 fig1 Filesystem[23849]: [23892]: INFO: unmounted 
> /opt/ufm/files successfully
> May 31 15:39:31 fig1 Filesystem[23838]: [23898]: INFO:  Success
> May 31 15:39:31 fig1 ResourceManager[23694]: [23913]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb stop
> May 31 15:39:31 fig1 kernel: drbd0: role( Primary -> Secondary ) 
> May 31 15:39:31 fig1 kernel: drbd0: Writing meta data super block now.
> May 31 15:39:31 fig1 ResourceManager[23694]: [23934]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop
> May 31 15:39:32 fig1 IPaddr[23965]: [23980]: INFO: ifconfig eth0:0 down
> May 31 15:39:32 fig1 avahi-daemon[4509]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:32 fig1 IPaddr[23936]: [23983]: INFO:  Success
> May 31 15:39:32 fig1 heartbeat: [23681]: info: all HA resource release 
> completed (standby).
> May 31 15:39:32 fig1 heartbeat: [18119]: info: Local standby process 
> completed [all].
> May 31 15:39:32 fig1 kernel: drbd0: peer( Secondary -> Primary ) 
> May 31 15:39:33 fig1 ufm_monitor: stop monitoring...
> May 31 15:39:40 fig1 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:43 fig1 heartbeat: [18119]: WARN: 1 lost packet(s) for [fig3] 
> [1693:1695]
> May 31 15:39:43 fig1 heartbeat: [18119]: info: remote resource transition 
> completed.
> May 31 15:39:43 fig1 heartbeat: [18119]: info: No pkts missing from fig3!
> May 31 15:39:43 fig1 heartbeat: [18119]: info: Other node completed standby 
> takeover of all resources.
> May 31 15:39:47 fig1 harc[24065]: [24071]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:39:47 fig1 heartbeat: [18119]: info: fig3 wants to go standby [all]
> May 31 15:39:51 fig1 kernel: drbd0: peer( Primary -> Secondary ) 
> May 31 15:39:51 fig1 heartbeat: [18119]: info: standby: acquire [all] 
> resources from fig3
> May 31 15:39:51 fig1 heartbeat: [24077]: info: acquire all HA resources 
> (standby).
> May 31 15:39:51 fig1 ResourceManager[24090]: [24101]: info: Acquiring 
> resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:39:51 fig1 IPaddr[24113]: [24156]: INFO:  Resource is stopped
> May 31 15:39:51 fig1 ResourceManager[24090]: [24172]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start
> May 31 15:39:51 fig1 IPaddr[24203]: [24237]: INFO: Using calculated netmask 
> for 172.28.6.140: 255.255.255.0
> May 31 15:39:51 fig1 IPaddr[24203]: [24258]: INFO: eval ifconfig eth0:0 
> 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255
> May 31 15:39:51 fig1 avahi-daemon[4509]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:51 fig1 avahi-daemon[4509]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:51 fig1 avahi-daemon[4509]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:51 fig1 IPaddr[24174]: [24277]: INFO:  Success
> May 31 15:39:52 fig1 ResourceManager[24090]: [24306]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb start
> May 31 15:39:52 fig1 kernel: drbd0: role( Secondary -> Primary ) 
> May 31 15:39:52 fig1 kernel: drbd0: Writing meta data super block now.
> May 31 15:39:52 fig1 Filesystem[24323]: [24367]: INFO:  Resource is stopped
> May 31 15:39:52 fig1 ResourceManager[24090]: [24381]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
> May 31 15:39:52 fig1 Filesystem[24394]: [24424]: INFO: Running start for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:39:52 fig1 kernel: kjournald starting.  Commit interval 5 seconds
> May 31 15:39:52 fig1 kernel: EXT3 FS on drbd0, internal journal
> May 31 15:39:52 fig1 kernel: EXT3-fs: mounted filesystem with ordered data 
> mode.
> May 31 15:39:52 fig1 Filesystem[24383]: [24439]: INFO:  Success
> May 31 15:39:52 fig1 ResourceManager[24090]: [24493]: info: Running 
> /etc/init.d/mysqld  start
> May 31 15:39:53 fig1 ResourceManager[24090]: [24614]: info: Running 
> /etc/init.d/ufmd  start
> May 31 15:39:58 fig1 OpenSM[24710]:  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:max_op_vls = 4  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached 
> Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:routing_engine = 
> minhop  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached 
> Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:lfts_file = 
> /opt/ufm/files/conf/opensm/lfts.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:root_guid_file = 
> /opt/ufm/files/conf/opensm/root_guid.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:cn_guid_file = 
> /opt/ufm/files/conf/opensm/cn_guid.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:ids_guid_file = 
> /opt/ufm/files/conf/opensm/ids_guid.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached 
> Option:guid_routing_order_file = 
> /opt/ufm/files/conf/opensm/guid_routing_order.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:sm_priority = 15  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:event_plugin_name 
> = osmufmpi  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:log_file = 
> /opt/ufm/files/log/opensm.log  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:log_max_size = 
> 4096  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:dump_files_dir = 
> /opt/ufm/files/log/  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos = TRUE  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos_policy_file = 
> /opt/ufm/files/conf/opensm/qos-policy.conf  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:58 fig1 OpenSM[24710]:  Loading Cached Option:prefix_routes_file 
> = /opt/ufm/files/conf/opensm/prefix-routes.conf  
> May 31 15:39:58 fig1 OpenSM[24718]: /opt/ufm/files/log/opensm.log log file 
> opened 
> May 31 15:39:58 fig1 OpenSM[24718]: OpenSM 3.3.2_974fc3f_2cbb47c  
> May 31 15:39:58 fig1 OpenSM[24718]: Entering DISCOVERING state  
> May 31 15:39:58 fig1 OpenSM[24718]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:58 fig1 OpenSM[24718]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:58 fig1 OpenSM[24718]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:58 fig1 OpenSM[24718]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:59 fig1 OpenSM[24718]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:59 fig1 opensm[24718]: Entering MASTER state  
> May 31 15:39:59 fig1 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:59 fig1 opensm[24718]: SUBNET UP  
> May 31 15:40:02 fig1 heartbeat: [24077]: info: all HA resource acquisition 
> completed (standby).
> May 31 15:40:02 fig1 heartbeat: [18119]: info: Standby resource acquisition 
> done [all].
> May 31 15:40:02 fig1 ufm_monitor: start monitoring...
> May 31 15:40:03 fig1 heartbeat: [18119]: info: remote resource transition 
> completed.
> May 31 15:40:05 fig1 dhcpd: Wrote 0 leases to leases file.
> May 31 15:40:05 fig1 dhcpd: 
> May 31 15:40:05 fig1 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1).
> May 31 15:40:05 fig1 dhcpd: ** Ignoring requests on ib0.8001.  If this is not 
> what
> May 31 15:40:05 fig1 dhcpd:    you want, please write a subnet declaration
> May 31 15:40:05 fig1 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:40:05 fig1 dhcpd:    to which interface ib0.8001 is attached. **
> May 31 15:40:05 fig1 dhcpd: 
> May 31 15:40:05 fig1 dhcpd: 
> May 31 15:40:05 fig1 dhcpd: No subnet declaration for eth0 (172.28.6.121).
> May 31 15:40:05 fig1 dhcpd: ** Ignoring requests on eth0.  If this is not what
> May 31 15:40:05 fig1 dhcpd:    you want, please write a subnet declaration
> May 31 15:40:05 fig1 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:40:05 fig1 dhcpd:    to which interface eth0 is attached. **
> May 31 15:40:05 fig1 dhcpd: 
> May 31 15:40:05 fig1 dhcpd: Sending on   Socket/fallback/fallback-net
> May 31 15:40:08 fig1 OpenSM[24718]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:40:08 fig1 OpenSM[24718]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:40:08 fig1 OpenSM[24718]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:40:08 fig1 OpenSM[24718]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:40:08 fig1 OpenSM[24718]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:40:12 fig1 heartbeat: [18119]: info: fig1 wants to go standby [all]
> May 31 15:40:13 fig1 heartbeat: [18119]: info: standby: fig3 can take our all 
> resources
> May 31 15:40:13 fig1 heartbeat: [24878]: info: give up all HA resources 
> (standby).
> May 31 15:40:13 fig1 ResourceManager[24891]: [24902]: info: Releasing 
> resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:40:13 fig1 ResourceManager[24891]: [24912]: info: Running 
> /etc/init.d/ufmd  stop
> May 31 15:40:13 fig1 opensm[24718]: Exiting SM  
> May 31 15:40:13 fig1 ResourceManager[24891]: [24957]: info: Running 
> /etc/init.d/mysqld  stop
> May 31 15:40:15 fig1 ResourceManager[24891]: [25025]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop
> May 31 15:40:15 fig1 Filesystem[25038]: [25068]: INFO: Running stop for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:40:15 fig1 Filesystem[25038]: [25078]: INFO: Trying to unmount 
> /opt/ufm/files
> May 31 15:40:16 fig1 Filesystem[25038]: [25081]: INFO: unmounted 
> /opt/ufm/files successfully
> May 31 15:40:16 fig1 Filesystem[25027]: [25087]: INFO:  Success
> May 31 15:40:16 fig1 ResourceManager[24891]: [25102]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb stop
> May 31 15:40:16 fig1 kernel: drbd0: role( Primary -> Secondary ) 
> May 31 15:40:16 fig1 kernel: drbd0: Writing meta data super block now.
> May 31 15:40:16 fig1 ResourceManager[24891]: [25123]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop
> May 31 15:40:16 fig1 IPaddr[25154]: [25169]: INFO: ifconfig eth0:0 down
> May 31 15:40:16 fig1 avahi-daemon[4509]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:40:16 fig1 IPaddr[25125]: [25172]: INFO:  Success
> May 31 15:40:16 fig1 heartbeat: [24878]: info: all HA resource release 
> completed (standby).
> May 31 15:40:16 fig1 heartbeat: [18119]: info: Local standby process 
> completed [all].
> May 31 15:40:17 fig1 kernel: drbd0: peer( Secondary -> Primary ) 
> May 31 15:40:18 fig1 ufm_monitor: stop monitoring...
> May 31 15:40:25 fig1 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:40:25 fig1 harc[25254]: [25260]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:28 fig1 heartbeat: [18119]: WARN: 1 lost packet(s) for [fig3] 
> [1747:1749]
> May 31 15:40:28 fig1 heartbeat: [18119]: info: remote resource transition 
> completed.
> May 31 15:40:28 fig1 heartbeat: [18119]: info: No pkts missing from fig3!
> May 31 15:40:28 fig1 heartbeat: [18119]: info: Other node completed standby 
> takeover of all resources.
> May 31 15:40:28 fig1 heartbeat: [18119]: info: fig3 wants to go standby [all]
> May 31 15:40:29 fig1 heartbeat: [18119]: info: remote resource transition 
> completed.
> May 31 15:40:50 fig1 harc[25346]: [25352]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:50 fig1 heartbeat: [18119]: WARN: Standby in progress- new 
> request from fig3 ignored [3578 seconds left]
> May 31 15:41:29 fig1 heartbeat: [18119]: WARN: Standby in progress- new 
> request from fig3 ignored [3540 seconds left]
> May 31 15:43:28 fig1 heartbeat: [18119]: WARN: Standby in progress- new 
> request from fig3 ignored [3420 seconds left]
> May 31 15:46:00 fig1 heartbeat: [18119]: WARN: Standby in progress- new 
> request from fig3 ignored [3269 seconds left]
> May 31 15:48:32 fig1 heartbeat: [18119]: WARN: Standby in progress- new 
> request from fig3 ignored [3117 seconds left]
> 
> 
> 

> 
> May 31 15:38:39 fig3 dhcpd: Sending on   Socket/fallback/fallback-net
> May 31 15:38:42 fig3 harc[22947]: [22953]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:38:42 fig3 hb_standby[22959]: [22965]: Going standby [all].
> May 31 15:38:42 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:38:42 fig3 OpenSM[22667]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:38:42 fig3 OpenSM[22667]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:38:42 fig3 OpenSM[22667]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:38:42 fig3 OpenSM[22667]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:38:42 fig3 OpenSM[22667]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:38:43 fig3 heartbeat: [15428]: info: standby: fig1 can take our all 
> resources
> May 31 15:38:43 fig3 heartbeat: [22970]: info: give up all HA resources 
> (standby).
> May 31 15:38:43 fig3 ResourceManager[22983]: [22994]: info: Releasing 
> resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:38:43 fig3 ResourceManager[22983]: [23004]: info: Running 
> /etc/init.d/ufmd  stop
> May 31 15:38:43 fig3 opensm[22667]: Exiting SM  
> May 31 15:38:43 fig3 ResourceManager[22983]: [23049]: info: Running 
> /etc/init.d/mysqld  stop
> May 31 15:38:47 fig3 ufm_monitor: stop monitoring...
> May 31 15:38:47 fig3 ResourceManager[22983]: [23122]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop
> May 31 15:38:47 fig3 Filesystem[23135]: [23165]: INFO: Running stop for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:38:47 fig3 Filesystem[23135]: [23175]: INFO: Trying to unmount 
> /opt/ufm/files
> May 31 15:38:47 fig3 Filesystem[23135]: [23178]: INFO: unmounted 
> /opt/ufm/files successfully
> May 31 15:38:47 fig3 Filesystem[23124]: [23184]: INFO:  Success
> May 31 15:38:47 fig3 ResourceManager[22983]: [23199]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb stop
> May 31 15:38:48 fig3 kernel: drbd0: role( Primary -> Secondary ) 
> May 31 15:38:48 fig3 kernel: drbd0: Writing meta data super block now.
> May 31 15:38:48 fig3 ResourceManager[22983]: [23220]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop
> May 31 15:38:48 fig3 IPaddr[23251]: [23266]: INFO: ifconfig eth0:0 down
> May 31 15:38:48 fig3 avahi-daemon[3290]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:38:48 fig3 IPaddr[23222]: [23269]: INFO:  Success
> May 31 15:38:48 fig3 heartbeat: [22970]: info: all HA resource release 
> completed (standby).
> May 31 15:38:48 fig3 heartbeat: [15428]: info: Local standby process 
> completed [all].
> May 31 15:38:49 fig3 kernel: drbd0: peer( Secondary -> Primary ) 
> May 31 15:38:57 fig3 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:00 fig3 heartbeat: [15428]: WARN: 1 lost packet(s) for [fig1] 
> [1655:1657]
> May 31 15:39:00 fig3 heartbeat: [15428]: info: remote resource transition 
> completed.
> May 31 15:39:00 fig3 heartbeat: [15428]: info: No pkts missing from fig1!
> May 31 15:39:00 fig3 heartbeat: [15428]: info: Other node completed standby 
> takeover of all resources.
> May 31 15:39:20 fig3 heartbeat: [15428]: info: fig1 wants to go standby [all]
> May 31 15:39:24 fig3 kernel: drbd0: peer( Primary -> Secondary ) 
> May 31 15:39:25 fig3 heartbeat: [15428]: info: standby: acquire [all] 
> resources from fig1
> May 31 15:39:25 fig3 heartbeat: [23483]: info: acquire all HA resources 
> (standby).
> May 31 15:39:25 fig3 ResourceManager[23496]: [23507]: info: Acquiring 
> resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:39:25 fig3 IPaddr[23519]: [23562]: INFO:  Resource is stopped
> May 31 15:39:25 fig3 ResourceManager[23496]: [23578]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start
> May 31 15:39:25 fig3 IPaddr[23609]: [23643]: INFO: Using calculated netmask 
> for 172.28.6.140: 255.255.255.0
> May 31 15:39:25 fig3 IPaddr[23609]: [23664]: INFO: eval ifconfig eth0:0 
> 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255
> May 31 15:39:25 fig3 avahi-daemon[3290]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:25 fig3 avahi-daemon[3290]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:25 fig3 avahi-daemon[3290]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:25 fig3 IPaddr[23580]: [23683]: INFO:  Success
> May 31 15:39:25 fig3 ResourceManager[23496]: [23712]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb start
> May 31 15:39:25 fig3 kernel: drbd0: role( Secondary -> Primary ) 
> May 31 15:39:25 fig3 kernel: drbd0: Writing meta data super block now.
> May 31 15:39:25 fig3 Filesystem[23729]: [23773]: INFO:  Resource is stopped
> May 31 15:39:25 fig3 ResourceManager[23496]: [23787]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
> May 31 15:39:26 fig3 Filesystem[23800]: [23830]: INFO: Running start for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:39:26 fig3 kernel: kjournald starting.  Commit interval 5 seconds
> May 31 15:39:26 fig3 kernel: EXT3 FS on drbd0, internal journal
> May 31 15:39:26 fig3 kernel: EXT3-fs: mounted filesystem with ordered data 
> mode.
> May 31 15:39:26 fig3 Filesystem[23789]: [23845]: INFO:  Success
> May 31 15:39:26 fig3 ResourceManager[23496]: [23899]: info: Running 
> /etc/init.d/mysqld  start
> May 31 15:39:27 fig3 ResourceManager[23496]: [24024]: info: Running 
> /etc/init.d/ufmd  start
> May 31 15:39:32 fig3 OpenSM[24125]:  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:max_op_vls = 4  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached 
> Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:routing_engine = 
> minhop  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached 
> Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:lfts_file = 
> /opt/ufm/files/conf/opensm/lfts.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:root_guid_file = 
> /opt/ufm/files/conf/opensm/root_guid.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:cn_guid_file = 
> /opt/ufm/files/conf/opensm/cn_guid.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:ids_guid_file = 
> /opt/ufm/files/conf/opensm/ids_guid.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached 
> Option:guid_routing_order_file = 
> /opt/ufm/files/conf/opensm/guid_routing_order.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:sm_priority = 15  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:event_plugin_name 
> = osmufmpi  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:log_file = 
> /opt/ufm/files/log/opensm.log  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:log_max_size = 
> 4096  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:dump_files_dir = 
> /opt/ufm/files/log/  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos = TRUE  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos_policy_file = 
> /opt/ufm/files/conf/opensm/qos-policy.conf  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:32 fig3 OpenSM[24125]:  Loading Cached Option:prefix_routes_file 
> = /opt/ufm/files/conf/opensm/prefix-routes.conf  
> May 31 15:39:32 fig3 OpenSM[24128]: /opt/ufm/files/log/opensm.log log file 
> opened 
> May 31 15:39:32 fig3 OpenSM[24128]: OpenSM 3.3.2_974fc3f_2cbb47c  
> May 31 15:39:32 fig3 OpenSM[24128]: Entering DISCOVERING state  
> May 31 15:39:32 fig3 OpenSM[24128]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:39:32 fig3 OpenSM[24128]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:39:32 fig3 OpenSM[24128]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:39:32 fig3 OpenSM[24128]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:39:32 fig3 OpenSM[24128]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:39:32 fig3 opensm[24128]: Entering MASTER state  
> May 31 15:39:33 fig3 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:33 fig3 opensm[24128]: SUBNET UP  
> May 31 15:39:36 fig3 heartbeat: [23483]: info: all HA resource acquisition 
> completed (standby).
> May 31 15:39:36 fig3 heartbeat: [15428]: info: Standby resource acquisition 
> done [all].
> May 31 15:39:36 fig3 ufm_monitor: start monitoring...
> May 31 15:39:37 fig3 heartbeat: [15428]: info: remote resource transition 
> completed.
> May 31 15:39:40 fig3 dhcpd: Wrote 0 leases to leases file.
> May 31 15:39:40 fig3 dhcpd: 
> May 31 15:39:40 fig3 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1).
> May 31 15:39:40 fig3 dhcpd: ** Ignoring requests on ib0.8001.  If this is not 
> what
> May 31 15:39:40 fig3 dhcpd:    you want, please write a subnet declaration
> May 31 15:39:40 fig3 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:39:40 fig3 dhcpd:    to which interface ib0.8001 is attached. **
> May 31 15:39:40 fig3 dhcpd: 
> May 31 15:39:40 fig3 dhcpd: 
> May 31 15:39:40 fig3 dhcpd: No subnet declaration for eth0 (172.28.6.123).
> May 31 15:39:40 fig3 dhcpd: ** Ignoring requests on eth0.  If this is not what
> May 31 15:39:40 fig3 dhcpd:    you want, please write a subnet declaration
> May 31 15:39:40 fig3 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:39:40 fig3 dhcpd:    to which interface eth0 is attached. **
> May 31 15:39:40 fig3 dhcpd: 
> May 31 15:39:40 fig3 dhcpd: Sending on   Socket/fallback/fallback-net
> May 31 15:39:40 fig3 harc[24397]: [24403]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:39:40 fig3 hb_standby[24409]: [24415]: Going standby [all].
> May 31 15:39:40 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:39:41 fig3 heartbeat: [15428]: info: standby: fig1 can take our all 
> resources
> May 31 15:39:41 fig3 heartbeat: [24419]: info: give up all HA resources 
> (standby).
> May 31 15:39:41 fig3 ResourceManager[24434]: [24445]: info: Releasing 
> resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:39:41 fig3 ResourceManager[24434]: [24455]: info: Running 
> /etc/init.d/ufmd  stop
> May 31 15:39:41 fig3 opensm[24128]: Exiting SM  
> May 31 15:39:41 fig3 ResourceManager[24434]: [24500]: info: Running 
> /etc/init.d/mysqld  stop
> May 31 15:39:41 fig3 ufm_monitor: stop monitoring...
> May 31 15:39:43 fig3 ResourceManager[24434]: [24569]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop
> May 31 15:39:43 fig3 Filesystem[24582]: [24612]: INFO: Running stop for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:39:43 fig3 Filesystem[24582]: [24622]: INFO: Trying to unmount 
> /opt/ufm/files
> May 31 15:39:43 fig3 Filesystem[24582]: [24625]: INFO: unmounted 
> /opt/ufm/files successfully
> May 31 15:39:43 fig3 Filesystem[24571]: [24631]: INFO:  Success
> May 31 15:39:43 fig3 ResourceManager[24434]: [24646]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb stop
> May 31 15:39:43 fig3 kernel: drbd0: role( Primary -> Secondary ) 
> May 31 15:39:43 fig3 kernel: drbd0: Writing meta data super block now.
> May 31 15:39:44 fig3 ResourceManager[24434]: [24667]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop
> May 31 15:39:44 fig3 IPaddr[24698]: [24713]: INFO: ifconfig eth0:0 down
> May 31 15:39:44 fig3 avahi-daemon[3290]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:39:44 fig3 IPaddr[24669]: [24716]: INFO:  Success
> May 31 15:39:44 fig3 heartbeat: [24419]: info: all HA resource release 
> completed (standby).
> May 31 15:39:44 fig3 heartbeat: [15428]: info: Local standby process 
> completed [all].
> May 31 15:39:44 fig3 kernel: drbd0: peer( Secondary -> Primary ) 
> May 31 15:39:52 fig3 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:56 fig3 heartbeat: [15428]: WARN: 1 lost packet(s) for [fig1] 
> [1721:1723]
> May 31 15:39:56 fig3 heartbeat: [15428]: info: remote resource transition 
> completed.
> May 31 15:39:56 fig3 heartbeat: [15428]: info: No pkts missing from fig1!
> May 31 15:39:56 fig3 heartbeat: [15428]: info: Other node completed standby 
> takeover of all resources.
> May 31 15:40:06 fig3 heartbeat: [15428]: info: fig1 wants to go standby [all]
> May 31 15:40:09 fig3 kernel: drbd0: peer( Primary -> Secondary ) 
> May 31 15:40:09 fig3 heartbeat: [15428]: info: standby: acquire [all] 
> resources from fig1
> May 31 15:40:09 fig3 heartbeat: [24930]: info: acquire all HA resources 
> (standby).
> May 31 15:40:09 fig3 ResourceManager[24943]: [24954]: info: Acquiring 
> resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb 
> Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:40:09 fig3 IPaddr[24966]: [25009]: INFO:  Resource is stopped
> May 31 15:40:09 fig3 ResourceManager[24943]: [25025]: info: Running 
> /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start
> May 31 15:40:09 fig3 IPaddr[25056]: [25090]: INFO: Using calculated netmask 
> for 172.28.6.140: 255.255.255.0
> May 31 15:40:10 fig3 IPaddr[25056]: [25111]: INFO: eval ifconfig eth0:0 
> 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255
> May 31 15:40:10 fig3 avahi-daemon[3290]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:40:10 fig3 avahi-daemon[3290]: Withdrawing address record for 
> 172.28.6.140 on eth0.
> May 31 15:40:10 fig3 avahi-daemon[3290]: Registering new address record for 
> 172.28.6.140 on eth0.
> May 31 15:40:10 fig3 IPaddr[25027]: [25130]: INFO:  Success
> May 31 15:40:10 fig3 ResourceManager[24943]: [25159]: info: Running 
> /etc/ha.d/resource.d/drbddisk ufmdb start
> May 31 15:40:10 fig3 kernel: drbd0: role( Secondary -> Primary ) 
> May 31 15:40:10 fig3 kernel: drbd0: Writing meta data super block now.
> May 31 15:40:10 fig3 Filesystem[25176]: [25220]: INFO:  Resource is stopped
> May 31 15:40:10 fig3 ResourceManager[24943]: [25234]: info: Running 
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
> May 31 15:40:10 fig3 Filesystem[25247]: [25277]: INFO: Running start for 
> /dev/drbd0 on /opt/ufm/files
> May 31 15:40:10 fig3 kernel: kjournald starting.  Commit interval 5 seconds
> May 31 15:40:10 fig3 kernel: EXT3 FS on drbd0, internal journal
> May 31 15:40:10 fig3 kernel: EXT3-fs: mounted filesystem with ordered data 
> mode.
> May 31 15:40:10 fig3 Filesystem[25236]: [25292]: INFO:  Success
> May 31 15:40:10 fig3 ResourceManager[24943]: [25346]: info: Running 
> /etc/init.d/mysqld  start
> May 31 15:40:12 fig3 ResourceManager[24943]: [25471]: info: Running 
> /etc/init.d/ufmd  start
> May 31 15:40:16 fig3 OpenSM[25599]:  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:max_op_vls = 4  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached 
> Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:routing_engine = 
> minhop  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached 
> Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:lfts_file = 
> /opt/ufm/files/conf/opensm/lfts.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:root_guid_file = 
> /opt/ufm/files/conf/opensm/root_guid.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:cn_guid_file = 
> /opt/ufm/files/conf/opensm/cn_guid.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:ids_guid_file = 
> /opt/ufm/files/conf/opensm/ids_guid.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached 
> Option:guid_routing_order_file = 
> /opt/ufm/files/conf/opensm/guid_routing_order.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:sm_priority = 15  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:event_plugin_name 
> = osmufmpi  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:log_file = 
> /opt/ufm/files/log/opensm.log  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:log_max_size = 
> 4096  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:dump_files_dir = 
> /opt/ufm/files/log/  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:qos = TRUE  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:qos_policy_file = 
> /opt/ufm/files/conf/opensm/qos-policy.conf  
> May 31 15:40:16 fig3 OpenSM[25599]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:40:17 fig3 OpenSM[25599]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:40:17 fig3 OpenSM[25599]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:40:17 fig3 OpenSM[25599]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:40:17 fig3 OpenSM[25599]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:40:17 fig3 OpenSM[25599]:  Loading Cached Option:prefix_routes_file 
> = /opt/ufm/files/conf/opensm/prefix-routes.conf  
> May 31 15:40:17 fig3 OpenSM[25677]: /opt/ufm/files/log/opensm.log log file 
> opened 
> May 31 15:40:17 fig3 OpenSM[25677]: OpenSM 3.3.2_974fc3f_2cbb47c  
> May 31 15:40:17 fig3 OpenSM[25677]: Entering DISCOVERING state  
> May 31 15:40:17 fig3 OpenSM[25677]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:40:17 fig3 OpenSM[25677]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:40:17 fig3 OpenSM[25677]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:40:17 fig3 OpenSM[25677]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:40:17 fig3 OpenSM[25677]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:40:18 fig3 opensm[25677]: Entering MASTER state  
> May 31 15:40:18 fig3 kernel: ib0: multicast join failed for 
> ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:40:18 fig3 opensm[25677]: SUBNET UP  
> May 31 15:40:21 fig3 heartbeat: [24930]: info: all HA resource acquisition 
> completed (standby).
> May 31 15:40:21 fig3 heartbeat: [15428]: info: Standby resource acquisition 
> done [all].
> May 31 15:40:21 fig3 ufm_monitor: start monitoring...
> May 31 15:40:21 fig3 harc[25831]: [25842]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:21 fig3 hb_standby[25848]: [25855]: Going standby [all].
> May 31 15:40:21 fig3 heartbeat: [15428]: WARN: standby message [me] from fig3 
> ignored.  Other side is in flux.
> May 31 15:40:22 fig3 heartbeat: [15428]: info: remote resource transition 
> completed.
> May 31 15:40:22 fig3 heartbeat: [15428]: ERROR: Ignored standby message 
> 'other' from fig1 in state 0
> May 31 15:40:24 fig3 dhcpd: Wrote 0 leases to leases file.
> May 31 15:40:24 fig3 dhcpd: 
> May 31 15:40:24 fig3 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1).
> May 31 15:40:24 fig3 dhcpd: ** Ignoring requests on ib0.8001.  If this is not 
> what
> May 31 15:40:24 fig3 dhcpd:    you want, please write a subnet declaration
> May 31 15:40:24 fig3 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:40:24 fig3 dhcpd:    to which interface ib0.8001 is attached. **
> May 31 15:40:24 fig3 dhcpd: 
> May 31 15:40:24 fig3 dhcpd: 
> May 31 15:40:24 fig3 dhcpd: No subnet declaration for eth0 (172.28.6.123).
> May 31 15:40:24 fig3 dhcpd: ** Ignoring requests on eth0.  If this is not what
> May 31 15:40:24 fig3 dhcpd:    you want, please write a subnet declaration
> May 31 15:40:24 fig3 dhcpd:    in your dhcpd.conf file for the network segment
> May 31 15:40:24 fig3 dhcpd:    to which interface eth0 is attached. **
> May 31 15:40:24 fig3 dhcpd: 
> May 31 15:40:24 fig3 dhcpd: Sending on   Socket/fallback/fallback-net
> May 31 15:40:27 fig3 OpenSM[25677]:  Loading Cached Option:qos_max_vls = 8  
> May 31 15:40:27 fig3 OpenSM[25677]:  Loading Cached Option:qos_high_limit = 0 
>  
> May 31 15:40:27 fig3 OpenSM[25677]:  Loading Cached Option:qos_vlarb_high = 
> 0:32  
> May 31 15:40:27 fig3 OpenSM[25677]:  Loading Cached Option:qos_vlarb_low = 
> 1:224,2:64,3:32  
> May 31 15:40:27 fig3 OpenSM[25677]:  Loading Cached Option:qos_sl2vl = 
> 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15  
> May 31 15:40:43 fig3 harc[26109]: [26115]: info: Running 
> /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:43 fig3 hb_standby[26121]: [26127]: Going standby [all].
> May 31 15:40:43 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:40:53 fig3 heartbeat: [15428]: WARN: No reply to standby request.  
> Standby request cancelled.
> May 31 15:41:22 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:41:32 fig3 heartbeat: [15428]: WARN: No reply to standby request.  
> Standby request cancelled.
> May 31 15:43:21 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:43:31 fig3 heartbeat: [15428]: WARN: No reply to standby request.  
> Standby request cancelled.
> May 31 15:45:53 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:46:03 fig3 heartbeat: [15428]: WARN: No reply to standby request.  
> Standby request cancelled.
> May 31 15:48:24 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:48:34 fig3 heartbeat: [15428]: WARN: No reply to standby request.  
> Standby request cancelled.
> 
> 
> 
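For anyone working through logs like the ones quoted above, here is a minimal Python sketch that lists the standby-related heartbeat messages in order. It is not part of heartbeat; the names LINE_RE, EVENTS and standby_events are made up for illustration. It assumes the syslog layout shown in this thread, and it looks only at the first physical line of each entry, because the mail client wrapped the long ones.

```python
import re

# Syslog-style heartbeat line, optionally prefixed by a mail quote marker ("> ").
# Example: "> May 31 15:40:43 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]"
LINE_RE = re.compile(
    r"^(?:>\s*)?\w{3}\s+\d+\s+(?P<time>\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"heartbeat:\s+\[\d+\]:\s+(?P<level>\w+):\s+(?P<msg>.*)$"
)

# Substrings taken from the messages in this thread, mapped to a short label.
# The labels are an assumption of this sketch, not heartbeat terminology.
EVENTS = {
    "wants to go standby": "requested",
    "No reply to standby request": "cancelled",
    "standby message": "ignored",
}


def standby_events(lines):
    """Yield (time, logging_host, label) for each standby-related heartbeat line."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # wrapped continuation line or another daemon's output
        for needle, label in EVENTS.items():
            if needle in match.group("msg"):
                yield match.group("time"), match.group("host"), label
                break
```

Feeding it the fig3 log, for example with `list(standby_events(open(path)))` where `path` is wherever the messages file was saved, should make the pattern easier to see: one request ignored at 15:40:21, then requests from 15:40:43 onwards that are each cancelled ten seconds later.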

> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
