Hi,

On Wed, Jun 03, 2009 at 05:15:08PM +0300, Eli Dorfman (Voltaire) wrote:
> Resending
>
> -------- Original Message --------
> Subject: heartbeat: WARN: standby message [me] from fig3 ignored. Other side is in flux
> Date: Mon, 01 Jun 2009 14:04:24 +0300
> From: Eli Dorfman (Voltaire) <[email protected]>
> To: [email protected]
>
> Hi
>
> I have a setup with 2 servers fig1 and fig3 with heartbeat and drbd.
> After several attempts to failover (using /usr/lib64/heartbeat/hb_standby)
> between the 2 servers fig1 and fig3 it seems that failover fails - probably
> due to the following message:
> "heartbeat: WARN: standby message [me] from fig3 ignored. Other side is in flux"
> What does it mean?
> May 31 15:40:21 fig3 harc[25831]: [25842]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:21 fig3 hb_standby[25848]: [25855]: Going standby [all].
> May 31 15:40:21 fig3 heartbeat: [15428]: WARN: standby message [me] from fig3 ignored. Other side is in flux.

At this time fig1 still hasn't finished the transition:

> May 31 15:40:28 fig1 heartbeat: [18119]: info: remote resource transition completed.

> Is it a configuration problem?

Probably not.

> Attached the messages log files from both servers.

May 31 15:40:43 fig3 harc[26109]: [26115]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
May 31 15:40:50 fig1 harc[25346]: [25352]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover

So, both sides want to go standby, that's why fig3 is refusing to
accept new requests.

Thanks,

Dejan

> Thanks,
> Eli
>
>
> May 31 15:38:44 fig1 heartbeat: [18119]: WARN: 1 lost packet(s) for [fig3] [1624:1626]
> May 31 15:38:44 fig1 heartbeat: [18119]: info: remote resource transition completed.
> May 31 15:38:44 fig1 heartbeat: [18119]: info: No pkts missing from fig3!
> May 31 15:38:44 fig1 heartbeat: [18119]: info: Other node completed standby takeover of all resources.
> May 31 15:38:49 fig1 harc[22856]: [22862]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:38:49 fig1 heartbeat: [18119]: info: fig3 wants to go standby [all]
> May 31 15:38:55 fig1 kernel: drbd0: peer( Primary -> Secondary )
> May 31 15:38:55 fig1 heartbeat: [18119]: info: standby: acquire [all] resources from fig3
> May 31 15:38:55 fig1 heartbeat: [22869]: info: acquire all HA resources (standby).
> May 31 15:38:55 fig1 ResourceManager[22882]: [22893]: info: Acquiring > resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb > Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd > May 31 15:38:56 fig1 IPaddr[22905]: [22948]: INFO: Resource is stopped > May 31 15:38:56 fig1 ResourceManager[22882]: [22965]: info: Running > /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start > May 31 15:38:56 fig1 IPaddr[22996]: [23030]: INFO: Using calculated netmask > for 172.28.6.140: 255.255.255.0 > May 31 15:38:56 fig1 IPaddr[22996]: [23051]: INFO: eval ifconfig eth0:0 > 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255 > May 31 15:38:56 fig1 avahi-daemon[4509]: Registering new address record for > 172.28.6.140 on eth0. > May 31 15:38:56 fig1 avahi-daemon[4509]: Withdrawing address record for > 172.28.6.140 on eth0. > May 31 15:38:56 fig1 avahi-daemon[4509]: Registering new address record for > 172.28.6.140 on eth0. > May 31 15:38:56 fig1 IPaddr[22967]: [23070]: INFO: Success > May 31 15:38:56 fig1 ResourceManager[22882]: [23099]: info: Running > /etc/ha.d/resource.d/drbddisk ufmdb start > May 31 15:38:56 fig1 kernel: drbd0: role( Secondary -> Primary ) > May 31 15:38:56 fig1 kernel: drbd0: Writing meta data super block now. > May 31 15:38:57 fig1 Filesystem[23116]: [23162]: INFO: Resource is stopped > May 31 15:38:57 fig1 ResourceManager[22882]: [23176]: info: Running > /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start > May 31 15:38:57 fig1 Filesystem[23189]: [23219]: INFO: Running start for > /dev/drbd0 on /opt/ufm/files > May 31 15:38:57 fig1 kernel: kjournald starting. Commit interval 5 seconds > May 31 15:38:57 fig1 kernel: EXT3 FS on drbd0, internal journal > May 31 15:38:57 fig1 kernel: EXT3-fs: mounted filesystem with ordered data > mode. 
> May 31 15:38:57 fig1 Filesystem[23178]: [23234]: INFO: Success > May 31 15:38:57 fig1 ResourceManager[22882]: [23288]: info: Running > /etc/init.d/mysqld start > May 31 15:38:58 fig1 ResourceManager[22882]: [23411]: info: Running > /etc/init.d/ufmd start > May 31 15:39:03 fig1 OpenSM[23510]: > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:max_op_vls = 4 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached > Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:routing_engine = > minhop > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached > Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:lfts_file = > /opt/ufm/files/conf/opensm/lfts.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:root_guid_file = > /opt/ufm/files/conf/opensm/root_guid.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:cn_guid_file = > /opt/ufm/files/conf/opensm/cn_guid.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:ids_guid_file = > /opt/ufm/files/conf/opensm/ids_guid.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached > Option:guid_routing_order_file = > /opt/ufm/files/conf/opensm/guid_routing_order.conf > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:sm_priority = 15 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:event_plugin_name > = osmufmpi > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:log_file = > /opt/ufm/files/log/opensm.log > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:log_max_size = > 4096 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:dump_files_dir = > /opt/ufm/files/log/ > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:qos = TRUE > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:qos_policy_file = > /opt/ufm/files/conf/opensm/qos-policy.conf > May 31 15:39:03 fig1 
OpenSM[23510]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:03 fig1 OpenSM[23510]: Loading Cached Option:prefix_routes_file > = /opt/ufm/files/conf/opensm/prefix-routes.conf > May 31 15:39:03 fig1 OpenSM[23515]: /opt/ufm/files/log/opensm.log log file > opened > May 31 15:39:03 fig1 OpenSM[23515]: OpenSM 3.3.2_974fc3f_2cbb47c > May 31 15:39:03 fig1 OpenSM[23515]: Entering DISCOVERING state > May 31 15:39:03 fig1 OpenSM[23515]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:03 fig1 OpenSM[23515]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:03 fig1 OpenSM[23515]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:03 fig1 OpenSM[23515]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:39:03 fig1 OpenSM[23515]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:04 fig1 opensm[23515]: Entering MASTER state > May 31 15:39:04 fig1 kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11 > May 31 15:39:04 fig1 opensm[23515]: SUBNET UP > May 31 15:39:07 fig1 heartbeat: [22869]: info: all HA resource acquisition > completed (standby). > May 31 15:39:07 fig1 heartbeat: [18119]: info: Standby resource acquisition > done [all]. > May 31 15:39:07 fig1 ufm_monitor: start monitoring... > May 31 15:39:07 fig1 heartbeat: [18119]: info: remote resource transition > completed. > May 31 15:39:10 fig1 dhcpd: Wrote 0 leases to leases file. > May 31 15:39:10 fig1 dhcpd: > May 31 15:39:10 fig1 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1). 
> May 31 15:39:10 fig1 dhcpd: ** Ignoring requests on ib0.8001. If this is not > what > May 31 15:39:10 fig1 dhcpd: you want, please write a subnet declaration > May 31 15:39:10 fig1 dhcpd: in your dhcpd.conf file for the network segment > May 31 15:39:10 fig1 dhcpd: to which interface ib0.8001 is attached. ** > May 31 15:39:10 fig1 dhcpd: > May 31 15:39:10 fig1 dhcpd: > May 31 15:39:10 fig1 dhcpd: No subnet declaration for eth0 (172.28.6.121). > May 31 15:39:10 fig1 dhcpd: ** Ignoring requests on eth0. If this is not what > May 31 15:39:10 fig1 dhcpd: you want, please write a subnet declaration > May 31 15:39:10 fig1 dhcpd: in your dhcpd.conf file for the network segment > May 31 15:39:10 fig1 dhcpd: to which interface eth0 is attached. ** > May 31 15:39:10 fig1 dhcpd: > May 31 15:39:10 fig1 dhcpd: Sending on Socket/fallback/fallback-net > May 31 15:39:13 fig1 OpenSM[23515]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:13 fig1 OpenSM[23515]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:13 fig1 OpenSM[23515]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:13 fig1 OpenSM[23515]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:39:13 fig1 OpenSM[23515]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:27 fig1 heartbeat: [18119]: info: fig1 wants to go standby [all] > May 31 15:39:27 fig1 heartbeat: [18119]: info: standby: fig3 can take our all > resources > May 31 15:39:27 fig1 heartbeat: [23681]: info: give up all HA resources > (standby). 
> May 31 15:39:27 fig1 ResourceManager[23694]: [23705]: info: Releasing > resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb > Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd > May 31 15:39:28 fig1 ResourceManager[23694]: [23716]: info: Running > /etc/init.d/ufmd stop > May 31 15:39:28 fig1 opensm[23515]: Exiting SM > May 31 15:39:28 fig1 ResourceManager[23694]: [23766]: info: Running > /etc/init.d/mysqld stop > May 31 15:39:31 fig1 ResourceManager[23694]: [23836]: info: Running > /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop > May 31 15:39:31 fig1 Filesystem[23849]: [23879]: INFO: Running stop for > /dev/drbd0 on /opt/ufm/files > May 31 15:39:31 fig1 Filesystem[23849]: [23889]: INFO: Trying to unmount > /opt/ufm/files > May 31 15:39:31 fig1 Filesystem[23849]: [23892]: INFO: unmounted > /opt/ufm/files successfully > May 31 15:39:31 fig1 Filesystem[23838]: [23898]: INFO: Success > May 31 15:39:31 fig1 ResourceManager[23694]: [23913]: info: Running > /etc/ha.d/resource.d/drbddisk ufmdb stop > May 31 15:39:31 fig1 kernel: drbd0: role( Primary -> Secondary ) > May 31 15:39:31 fig1 kernel: drbd0: Writing meta data super block now. > May 31 15:39:31 fig1 ResourceManager[23694]: [23934]: info: Running > /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop > May 31 15:39:32 fig1 IPaddr[23965]: [23980]: INFO: ifconfig eth0:0 down > May 31 15:39:32 fig1 avahi-daemon[4509]: Withdrawing address record for > 172.28.6.140 on eth0. > May 31 15:39:32 fig1 IPaddr[23936]: [23983]: INFO: Success > May 31 15:39:32 fig1 heartbeat: [23681]: info: all HA resource release > completed (standby). > May 31 15:39:32 fig1 heartbeat: [18119]: info: Local standby process > completed [all]. > May 31 15:39:32 fig1 kernel: drbd0: peer( Secondary -> Primary ) > May 31 15:39:33 fig1 ufm_monitor: stop monitoring... 
> May 31 15:39:40 fig1 kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11 > May 31 15:39:43 fig1 heartbeat: [18119]: WARN: 1 lost packet(s) for [fig3] > [1693:1695] > May 31 15:39:43 fig1 heartbeat: [18119]: info: remote resource transition > completed. > May 31 15:39:43 fig1 heartbeat: [18119]: info: No pkts missing from fig3! > May 31 15:39:43 fig1 heartbeat: [18119]: info: Other node completed standby > takeover of all resources. > May 31 15:39:47 fig1 harc[24065]: [24071]: info: Running > /etc/ha.d/rc.d/hb_takeover hb_takeover > May 31 15:39:47 fig1 heartbeat: [18119]: info: fig3 wants to go standby [all] > May 31 15:39:51 fig1 kernel: drbd0: peer( Primary -> Secondary ) > May 31 15:39:51 fig1 heartbeat: [18119]: info: standby: acquire [all] > resources from fig3 > May 31 15:39:51 fig1 heartbeat: [24077]: info: acquire all HA resources > (standby). > May 31 15:39:51 fig1 ResourceManager[24090]: [24101]: info: Acquiring > resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb > Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd > May 31 15:39:51 fig1 IPaddr[24113]: [24156]: INFO: Resource is stopped > May 31 15:39:51 fig1 ResourceManager[24090]: [24172]: info: Running > /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start > May 31 15:39:51 fig1 IPaddr[24203]: [24237]: INFO: Using calculated netmask > for 172.28.6.140: 255.255.255.0 > May 31 15:39:51 fig1 IPaddr[24203]: [24258]: INFO: eval ifconfig eth0:0 > 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255 > May 31 15:39:51 fig1 avahi-daemon[4509]: Registering new address record for > 172.28.6.140 on eth0. > May 31 15:39:51 fig1 avahi-daemon[4509]: Withdrawing address record for > 172.28.6.140 on eth0. > May 31 15:39:51 fig1 avahi-daemon[4509]: Registering new address record for > 172.28.6.140 on eth0. 
> May 31 15:39:51 fig1 IPaddr[24174]: [24277]: INFO: Success > May 31 15:39:52 fig1 ResourceManager[24090]: [24306]: info: Running > /etc/ha.d/resource.d/drbddisk ufmdb start > May 31 15:39:52 fig1 kernel: drbd0: role( Secondary -> Primary ) > May 31 15:39:52 fig1 kernel: drbd0: Writing meta data super block now. > May 31 15:39:52 fig1 Filesystem[24323]: [24367]: INFO: Resource is stopped > May 31 15:39:52 fig1 ResourceManager[24090]: [24381]: info: Running > /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start > May 31 15:39:52 fig1 Filesystem[24394]: [24424]: INFO: Running start for > /dev/drbd0 on /opt/ufm/files > May 31 15:39:52 fig1 kernel: kjournald starting. Commit interval 5 seconds > May 31 15:39:52 fig1 kernel: EXT3 FS on drbd0, internal journal > May 31 15:39:52 fig1 kernel: EXT3-fs: mounted filesystem with ordered data > mode. > May 31 15:39:52 fig1 Filesystem[24383]: [24439]: INFO: Success > May 31 15:39:52 fig1 ResourceManager[24090]: [24493]: info: Running > /etc/init.d/mysqld start > May 31 15:39:53 fig1 ResourceManager[24090]: [24614]: info: Running > /etc/init.d/ufmd start > May 31 15:39:58 fig1 OpenSM[24710]: > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:max_op_vls = 4 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached > Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:routing_engine = > minhop > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached > Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:lfts_file = > /opt/ufm/files/conf/opensm/lfts.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:root_guid_file = > /opt/ufm/files/conf/opensm/root_guid.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:cn_guid_file = > /opt/ufm/files/conf/opensm/cn_guid.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached 
Option:ids_guid_file = > /opt/ufm/files/conf/opensm/ids_guid.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached > Option:guid_routing_order_file = > /opt/ufm/files/conf/opensm/guid_routing_order.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:sm_priority = 15 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:event_plugin_name > = osmufmpi > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:log_file = > /opt/ufm/files/log/opensm.log > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:log_max_size = > 4096 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:dump_files_dir = > /opt/ufm/files/log/ > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos = TRUE > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos_policy_file = > /opt/ufm/files/conf/opensm/qos-policy.conf > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:58 fig1 OpenSM[24710]: Loading Cached Option:prefix_routes_file > = /opt/ufm/files/conf/opensm/prefix-routes.conf > May 31 15:39:58 fig1 OpenSM[24718]: /opt/ufm/files/log/opensm.log log file > opened > May 31 15:39:58 fig1 OpenSM[24718]: OpenSM 3.3.2_974fc3f_2cbb47c > May 31 15:39:58 fig1 OpenSM[24718]: Entering DISCOVERING state > May 31 15:39:58 fig1 OpenSM[24718]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:58 fig1 OpenSM[24718]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:58 fig1 OpenSM[24718]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:58 fig1 OpenSM[24718]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 
15:39:59 fig1 OpenSM[24718]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:59 fig1 opensm[24718]: Entering MASTER state > May 31 15:39:59 fig1 kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11 > May 31 15:39:59 fig1 opensm[24718]: SUBNET UP > May 31 15:40:02 fig1 heartbeat: [24077]: info: all HA resource acquisition > completed (standby). > May 31 15:40:02 fig1 heartbeat: [18119]: info: Standby resource acquisition > done [all]. > May 31 15:40:02 fig1 ufm_monitor: start monitoring... > May 31 15:40:03 fig1 heartbeat: [18119]: info: remote resource transition > completed. > May 31 15:40:05 fig1 dhcpd: Wrote 0 leases to leases file. > May 31 15:40:05 fig1 dhcpd: > May 31 15:40:05 fig1 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1). > May 31 15:40:05 fig1 dhcpd: ** Ignoring requests on ib0.8001. If this is not > what > May 31 15:40:05 fig1 dhcpd: you want, please write a subnet declaration > May 31 15:40:05 fig1 dhcpd: in your dhcpd.conf file for the network segment > May 31 15:40:05 fig1 dhcpd: to which interface ib0.8001 is attached. ** > May 31 15:40:05 fig1 dhcpd: > May 31 15:40:05 fig1 dhcpd: > May 31 15:40:05 fig1 dhcpd: No subnet declaration for eth0 (172.28.6.121). > May 31 15:40:05 fig1 dhcpd: ** Ignoring requests on eth0. If this is not what > May 31 15:40:05 fig1 dhcpd: you want, please write a subnet declaration > May 31 15:40:05 fig1 dhcpd: in your dhcpd.conf file for the network segment > May 31 15:40:05 fig1 dhcpd: to which interface eth0 is attached. 
** > May 31 15:40:05 fig1 dhcpd: > May 31 15:40:05 fig1 dhcpd: Sending on Socket/fallback/fallback-net > May 31 15:40:08 fig1 OpenSM[24718]: Loading Cached Option:qos_max_vls = 8 > May 31 15:40:08 fig1 OpenSM[24718]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:40:08 fig1 OpenSM[24718]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:40:08 fig1 OpenSM[24718]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:40:08 fig1 OpenSM[24718]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:40:12 fig1 heartbeat: [18119]: info: fig1 wants to go standby [all] > May 31 15:40:13 fig1 heartbeat: [18119]: info: standby: fig3 can take our all > resources > May 31 15:40:13 fig1 heartbeat: [24878]: info: give up all HA resources > (standby). > May 31 15:40:13 fig1 ResourceManager[24891]: [24902]: info: Releasing > resource group: fig1 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb > Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd > May 31 15:40:13 fig1 ResourceManager[24891]: [24912]: info: Running > /etc/init.d/ufmd stop > May 31 15:40:13 fig1 opensm[24718]: Exiting SM > May 31 15:40:13 fig1 ResourceManager[24891]: [24957]: info: Running > /etc/init.d/mysqld stop > May 31 15:40:15 fig1 ResourceManager[24891]: [25025]: info: Running > /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop > May 31 15:40:15 fig1 Filesystem[25038]: [25068]: INFO: Running stop for > /dev/drbd0 on /opt/ufm/files > May 31 15:40:15 fig1 Filesystem[25038]: [25078]: INFO: Trying to unmount > /opt/ufm/files > May 31 15:40:16 fig1 Filesystem[25038]: [25081]: INFO: unmounted > /opt/ufm/files successfully > May 31 15:40:16 fig1 Filesystem[25027]: [25087]: INFO: Success > May 31 15:40:16 fig1 ResourceManager[24891]: [25102]: info: Running > /etc/ha.d/resource.d/drbddisk ufmdb stop > May 31 15:40:16 fig1 kernel: drbd0: role( Primary -> Secondary ) > May 31 15:40:16 fig1 kernel: drbd0: Writing meta data super 
block now. > May 31 15:40:16 fig1 ResourceManager[24891]: [25123]: info: Running > /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop > May 31 15:40:16 fig1 IPaddr[25154]: [25169]: INFO: ifconfig eth0:0 down > May 31 15:40:16 fig1 avahi-daemon[4509]: Withdrawing address record for > 172.28.6.140 on eth0. > May 31 15:40:16 fig1 IPaddr[25125]: [25172]: INFO: Success > May 31 15:40:16 fig1 heartbeat: [24878]: info: all HA resource release > completed (standby). > May 31 15:40:16 fig1 heartbeat: [18119]: info: Local standby process > completed [all]. > May 31 15:40:17 fig1 kernel: drbd0: peer( Secondary -> Primary ) > May 31 15:40:18 fig1 ufm_monitor: stop monitoring... > May 31 15:40:25 fig1 kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11 > May 31 15:40:25 fig1 harc[25254]: [25260]: info: Running > /etc/ha.d/rc.d/hb_takeover hb_takeover > May 31 15:40:28 fig1 heartbeat: [18119]: WARN: 1 lost packet(s) for [fig3] > [1747:1749] > May 31 15:40:28 fig1 heartbeat: [18119]: info: remote resource transition > completed. > May 31 15:40:28 fig1 heartbeat: [18119]: info: No pkts missing from fig3! > May 31 15:40:28 fig1 heartbeat: [18119]: info: Other node completed standby > takeover of all resources. > May 31 15:40:28 fig1 heartbeat: [18119]: info: fig3 wants to go standby [all] > May 31 15:40:29 fig1 heartbeat: [18119]: info: remote resource transition > completed. 
> May 31 15:40:50 fig1 harc[25346]: [25352]: info: Running > /etc/ha.d/rc.d/hb_takeover hb_takeover > May 31 15:40:50 fig1 heartbeat: [18119]: WARN: Standby in progress- new > request from fig3 ignored [3578 seconds left] > May 31 15:41:29 fig1 heartbeat: [18119]: WARN: Standby in progress- new > request from fig3 ignored [3540 seconds left] > May 31 15:43:28 fig1 heartbeat: [18119]: WARN: Standby in progress- new > request from fig3 ignored [3420 seconds left] > May 31 15:46:00 fig1 heartbeat: [18119]: WARN: Standby in progress- new > request from fig3 ignored [3269 seconds left] > May 31 15:48:32 fig1 heartbeat: [18119]: WARN: Standby in progress- new > request from fig3 ignored [3117 seconds left] > > > > > May 31 15:38:39 fig3 dhcpd: Sending on Socket/fallback/fallback-net > May 31 15:38:42 fig3 harc[22947]: [22953]: info: Running > /etc/ha.d/rc.d/hb_takeover hb_takeover > May 31 15:38:42 fig3 hb_standby[22959]: [22965]: Going standby [all]. > May 31 15:38:42 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all] > May 31 15:38:42 fig3 OpenSM[22667]: Loading Cached Option:qos_max_vls = 8 > May 31 15:38:42 fig3 OpenSM[22667]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:38:42 fig3 OpenSM[22667]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:38:42 fig3 OpenSM[22667]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:38:42 fig3 OpenSM[22667]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:38:43 fig3 heartbeat: [15428]: info: standby: fig1 can take our all > resources > May 31 15:38:43 fig3 heartbeat: [22970]: info: give up all HA resources > (standby). 
> May 31 15:38:43 fig3 ResourceManager[22983]: [22994]: info: Releasing > resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb > Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd > May 31 15:38:43 fig3 ResourceManager[22983]: [23004]: info: Running > /etc/init.d/ufmd stop > May 31 15:38:43 fig3 opensm[22667]: Exiting SM > May 31 15:38:43 fig3 ResourceManager[22983]: [23049]: info: Running > /etc/init.d/mysqld stop > May 31 15:38:47 fig3 ufm_monitor: stop monitoring... > May 31 15:38:47 fig3 ResourceManager[22983]: [23122]: info: Running > /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop > May 31 15:38:47 fig3 Filesystem[23135]: [23165]: INFO: Running stop for > /dev/drbd0 on /opt/ufm/files > May 31 15:38:47 fig3 Filesystem[23135]: [23175]: INFO: Trying to unmount > /opt/ufm/files > May 31 15:38:47 fig3 Filesystem[23135]: [23178]: INFO: unmounted > /opt/ufm/files successfully > May 31 15:38:47 fig3 Filesystem[23124]: [23184]: INFO: Success > May 31 15:38:47 fig3 ResourceManager[22983]: [23199]: info: Running > /etc/ha.d/resource.d/drbddisk ufmdb stop > May 31 15:38:48 fig3 kernel: drbd0: role( Primary -> Secondary ) > May 31 15:38:48 fig3 kernel: drbd0: Writing meta data super block now. > May 31 15:38:48 fig3 ResourceManager[22983]: [23220]: info: Running > /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop > May 31 15:38:48 fig3 IPaddr[23251]: [23266]: INFO: ifconfig eth0:0 down > May 31 15:38:48 fig3 avahi-daemon[3290]: Withdrawing address record for > 172.28.6.140 on eth0. > May 31 15:38:48 fig3 IPaddr[23222]: [23269]: INFO: Success > May 31 15:38:48 fig3 heartbeat: [22970]: info: all HA resource release > completed (standby). > May 31 15:38:48 fig3 heartbeat: [15428]: info: Local standby process > completed [all]. 
> May 31 15:38:49 fig3 kernel: drbd0: peer( Secondary -> Primary ) > May 31 15:38:57 fig3 kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11 > May 31 15:39:00 fig3 heartbeat: [15428]: WARN: 1 lost packet(s) for [fig1] > [1655:1657] > May 31 15:39:00 fig3 heartbeat: [15428]: info: remote resource transition > completed. > May 31 15:39:00 fig3 heartbeat: [15428]: info: No pkts missing from fig1! > May 31 15:39:00 fig3 heartbeat: [15428]: info: Other node completed standby > takeover of all resources. > May 31 15:39:20 fig3 heartbeat: [15428]: info: fig1 wants to go standby [all] > May 31 15:39:24 fig3 kernel: drbd0: peer( Primary -> Secondary ) > May 31 15:39:25 fig3 heartbeat: [15428]: info: standby: acquire [all] > resources from fig1 > May 31 15:39:25 fig3 heartbeat: [23483]: info: acquire all HA resources > (standby). > May 31 15:39:25 fig3 ResourceManager[23496]: [23507]: info: Acquiring > resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb > Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd > May 31 15:39:25 fig3 IPaddr[23519]: [23562]: INFO: Resource is stopped > May 31 15:39:25 fig3 ResourceManager[23496]: [23578]: info: Running > /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start > May 31 15:39:25 fig3 IPaddr[23609]: [23643]: INFO: Using calculated netmask > for 172.28.6.140: 255.255.255.0 > May 31 15:39:25 fig3 IPaddr[23609]: [23664]: INFO: eval ifconfig eth0:0 > 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255 > May 31 15:39:25 fig3 avahi-daemon[3290]: Registering new address record for > 172.28.6.140 on eth0. > May 31 15:39:25 fig3 avahi-daemon[3290]: Withdrawing address record for > 172.28.6.140 on eth0. > May 31 15:39:25 fig3 avahi-daemon[3290]: Registering new address record for > 172.28.6.140 on eth0. 
> May 31 15:39:25 fig3 IPaddr[23580]: [23683]: INFO: Success > May 31 15:39:25 fig3 ResourceManager[23496]: [23712]: info: Running > /etc/ha.d/resource.d/drbddisk ufmdb start > May 31 15:39:25 fig3 kernel: drbd0: role( Secondary -> Primary ) > May 31 15:39:25 fig3 kernel: drbd0: Writing meta data super block now. > May 31 15:39:25 fig3 Filesystem[23729]: [23773]: INFO: Resource is stopped > May 31 15:39:25 fig3 ResourceManager[23496]: [23787]: info: Running > /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start > May 31 15:39:26 fig3 Filesystem[23800]: [23830]: INFO: Running start for > /dev/drbd0 on /opt/ufm/files > May 31 15:39:26 fig3 kernel: kjournald starting. Commit interval 5 seconds > May 31 15:39:26 fig3 kernel: EXT3 FS on drbd0, internal journal > May 31 15:39:26 fig3 kernel: EXT3-fs: mounted filesystem with ordered data > mode. > May 31 15:39:26 fig3 Filesystem[23789]: [23845]: INFO: Success > May 31 15:39:26 fig3 ResourceManager[23496]: [23899]: info: Running > /etc/init.d/mysqld start > May 31 15:39:27 fig3 ResourceManager[23496]: [24024]: info: Running > /etc/init.d/ufmd start > May 31 15:39:32 fig3 OpenSM[24125]: > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:max_op_vls = 4 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached > Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:routing_engine = > minhop > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached > Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:lfts_file = > /opt/ufm/files/conf/opensm/lfts.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:root_guid_file = > /opt/ufm/files/conf/opensm/root_guid.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:cn_guid_file = > /opt/ufm/files/conf/opensm/cn_guid.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached 
Option:ids_guid_file = > /opt/ufm/files/conf/opensm/ids_guid.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached > Option:guid_routing_order_file = > /opt/ufm/files/conf/opensm/guid_routing_order.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:sm_priority = 15 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:event_plugin_name > = osmufmpi > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:log_file = > /opt/ufm/files/log/opensm.log > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:log_max_size = > 4096 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:dump_files_dir = > /opt/ufm/files/log/ > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos = TRUE > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos_policy_file = > /opt/ufm/files/conf/opensm/qos-policy.conf > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:32 fig3 OpenSM[24125]: Loading Cached Option:prefix_routes_file > = /opt/ufm/files/conf/opensm/prefix-routes.conf > May 31 15:39:32 fig3 OpenSM[24128]: /opt/ufm/files/log/opensm.log log file > opened > May 31 15:39:32 fig3 OpenSM[24128]: OpenSM 3.3.2_974fc3f_2cbb47c > May 31 15:39:32 fig3 OpenSM[24128]: Entering DISCOVERING state > May 31 15:39:32 fig3 OpenSM[24128]: Loading Cached Option:qos_max_vls = 8 > May 31 15:39:32 fig3 OpenSM[24128]: Loading Cached Option:qos_high_limit = 0 > > May 31 15:39:32 fig3 OpenSM[24128]: Loading Cached Option:qos_vlarb_high = > 0:32 > May 31 15:39:32 fig3 OpenSM[24128]: Loading Cached Option:qos_vlarb_low = > 1:224,2:64,3:32 > May 31 
15:39:32 fig3 OpenSM[24128]: Loading Cached Option:qos_sl2vl = > 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15 > May 31 15:39:32 fig3 opensm[24128]: Entering MASTER state > May 31 15:39:33 fig3 kernel: ib0: multicast join failed for > ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11 > May 31 15:39:33 fig3 opensm[24128]: SUBNET UP > May 31 15:39:36 fig3 heartbeat: [23483]: info: all HA resource acquisition > completed (standby). > May 31 15:39:36 fig3 heartbeat: [15428]: info: Standby resource acquisition > done [all]. > May 31 15:39:36 fig3 ufm_monitor: start monitoring... > May 31 15:39:37 fig3 heartbeat: [15428]: info: remote resource transition > completed. > May 31 15:39:40 fig3 dhcpd: Wrote 0 leases to leases file. > May 31 15:39:40 fig3 dhcpd: > May 31 15:39:40 fig3 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1). > May 31 15:39:40 fig3 dhcpd: ** Ignoring requests on ib0.8001. If this is not > what > May 31 15:39:40 fig3 dhcpd: you want, please write a subnet declaration > May 31 15:39:40 fig3 dhcpd: in your dhcpd.conf file for the network segment > May 31 15:39:40 fig3 dhcpd: to which interface ib0.8001 is attached. ** > May 31 15:39:40 fig3 dhcpd: > May 31 15:39:40 fig3 dhcpd: > May 31 15:39:40 fig3 dhcpd: No subnet declaration for eth0 (172.28.6.123). > May 31 15:39:40 fig3 dhcpd: ** Ignoring requests on eth0. If this is not what > May 31 15:39:40 fig3 dhcpd: you want, please write a subnet declaration > May 31 15:39:40 fig3 dhcpd: in your dhcpd.conf file for the network segment > May 31 15:39:40 fig3 dhcpd: to which interface eth0 is attached. ** > May 31 15:39:40 fig3 dhcpd: > May 31 15:39:40 fig3 dhcpd: Sending on Socket/fallback/fallback-net > May 31 15:39:40 fig3 harc[24397]: [24403]: info: Running > /etc/ha.d/rc.d/hb_takeover hb_takeover > May 31 15:39:40 fig3 hb_standby[24409]: [24415]: Going standby [all]. 
> May 31 15:39:40 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:39:41 fig3 heartbeat: [15428]: info: standby: fig1 can take our all resources
> May 31 15:39:41 fig3 heartbeat: [24419]: info: give up all HA resources (standby).
> May 31 15:39:41 fig3 ResourceManager[24434]: [24445]: info: Releasing resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:39:41 fig3 ResourceManager[24434]: [24455]: info: Running /etc/init.d/ufmd stop
> May 31 15:39:41 fig3 opensm[24128]: Exiting SM
> May 31 15:39:41 fig3 ResourceManager[24434]: [24500]: info: Running /etc/init.d/mysqld stop
> May 31 15:39:41 fig3 ufm_monitor: stop monitoring...
> May 31 15:39:43 fig3 ResourceManager[24434]: [24569]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 stop
> May 31 15:39:43 fig3 Filesystem[24582]: [24612]: INFO: Running stop for /dev/drbd0 on /opt/ufm/files
> May 31 15:39:43 fig3 Filesystem[24582]: [24622]: INFO: Trying to unmount /opt/ufm/files
> May 31 15:39:43 fig3 Filesystem[24582]: [24625]: INFO: unmounted /opt/ufm/files successfully
> May 31 15:39:43 fig3 Filesystem[24571]: [24631]: INFO: Success
> May 31 15:39:43 fig3 ResourceManager[24434]: [24646]: info: Running /etc/ha.d/resource.d/drbddisk ufmdb stop
> May 31 15:39:43 fig3 kernel: drbd0: role( Primary -> Secondary )
> May 31 15:39:43 fig3 kernel: drbd0: Writing meta data super block now.
> May 31 15:39:44 fig3 ResourceManager[24434]: [24667]: info: Running /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 stop
> May 31 15:39:44 fig3 IPaddr[24698]: [24713]: INFO: ifconfig eth0:0 down
> May 31 15:39:44 fig3 avahi-daemon[3290]: Withdrawing address record for 172.28.6.140 on eth0.
> May 31 15:39:44 fig3 IPaddr[24669]: [24716]: INFO: Success
> May 31 15:39:44 fig3 heartbeat: [24419]: info: all HA resource release completed (standby).
> May 31 15:39:44 fig3 heartbeat: [15428]: info: Local standby process completed [all].
> May 31 15:39:44 fig3 kernel: drbd0: peer( Secondary -> Primary )
> May 31 15:39:52 fig3 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:39:56 fig3 heartbeat: [15428]: WARN: 1 lost packet(s) for [fig1] [1721:1723]
> May 31 15:39:56 fig3 heartbeat: [15428]: info: remote resource transition completed.
> May 31 15:39:56 fig3 heartbeat: [15428]: info: No pkts missing from fig1!
> May 31 15:39:56 fig3 heartbeat: [15428]: info: Other node completed standby takeover of all resources.
> May 31 15:40:06 fig3 heartbeat: [15428]: info: fig1 wants to go standby [all]
> May 31 15:40:09 fig3 kernel: drbd0: peer( Primary -> Secondary )
> May 31 15:40:09 fig3 heartbeat: [15428]: info: standby: acquire [all] resources from fig1
> May 31 15:40:09 fig3 heartbeat: [24930]: info: acquire all HA resources (standby).
> May 31 15:40:09 fig3 ResourceManager[24943]: [24954]: info: Acquiring resource group: fig3 172.28.6.140/24/eth0/172.28.6.255 drbddisk::ufmdb Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
> May 31 15:40:09 fig3 IPaddr[24966]: [25009]: INFO: Resource is stopped
> May 31 15:40:09 fig3 ResourceManager[24943]: [25025]: info: Running /etc/ha.d/resource.d/IPaddr 172.28.6.140/24/eth0/172.28.6.255 start
> May 31 15:40:09 fig3 IPaddr[25056]: [25090]: INFO: Using calculated netmask for 172.28.6.140: 255.255.255.0
> May 31 15:40:10 fig3 IPaddr[25056]: [25111]: INFO: eval ifconfig eth0:0 172.28.6.140 netmask 255.255.255.0 broadcast 172.28.6.255
> May 31 15:40:10 fig3 avahi-daemon[3290]: Registering new address record for 172.28.6.140 on eth0.
> May 31 15:40:10 fig3 avahi-daemon[3290]: Withdrawing address record for 172.28.6.140 on eth0.
> May 31 15:40:10 fig3 avahi-daemon[3290]: Registering new address record for 172.28.6.140 on eth0.
> May 31 15:40:10 fig3 IPaddr[25027]: [25130]: INFO: Success
> May 31 15:40:10 fig3 ResourceManager[24943]: [25159]: info: Running /etc/ha.d/resource.d/drbddisk ufmdb start
> May 31 15:40:10 fig3 kernel: drbd0: role( Secondary -> Primary )
> May 31 15:40:10 fig3 kernel: drbd0: Writing meta data super block now.
> May 31 15:40:10 fig3 Filesystem[25176]: [25220]: INFO: Resource is stopped
> May 31 15:40:10 fig3 ResourceManager[24943]: [25234]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
> May 31 15:40:10 fig3 Filesystem[25247]: [25277]: INFO: Running start for /dev/drbd0 on /opt/ufm/files
> May 31 15:40:10 fig3 kernel: kjournald starting. Commit interval 5 seconds
> May 31 15:40:10 fig3 kernel: EXT3 FS on drbd0, internal journal
> May 31 15:40:10 fig3 kernel: EXT3-fs: mounted filesystem with ordered data mode.
> May 31 15:40:10 fig3 Filesystem[25236]: [25292]: INFO: Success
> May 31 15:40:10 fig3 ResourceManager[24943]: [25346]: info: Running /etc/init.d/mysqld start
> May 31 15:40:12 fig3 ResourceManager[24943]: [25471]: info: Running /etc/init.d/ufmd start
> May 31 15:40:16 fig3 OpenSM[25599]:
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:max_op_vls = 4
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:partition_config_file = /opt/ufm/files/conf/opensm/partitions.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:routing_engine = minhop
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:lid_matrix_dump_file = /opt/ufm/files/conf/opensm/lid_matrix.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:lfts_file = /opt/ufm/files/conf/opensm/lfts.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:root_guid_file = /opt/ufm/files/conf/opensm/root_guid.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:cn_guid_file = /opt/ufm/files/conf/opensm/cn_guid.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:ids_guid_file = /opt/ufm/files/conf/opensm/ids_guid.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:guid_routing_order_file = /opt/ufm/files/conf/opensm/guid_routing_order.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:sm_priority = 15
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:event_plugin_name = osmufmpi
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:log_file = /opt/ufm/files/log/opensm.log
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:log_max_size = 4096
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:dump_files_dir = /opt/ufm/files/log/
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:qos = TRUE
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:qos_policy_file = /opt/ufm/files/conf/opensm/qos-policy.conf
> May 31 15:40:16 fig3 OpenSM[25599]: Loading Cached Option:qos_max_vls = 8
> May 31 15:40:17 fig3 OpenSM[25599]: Loading Cached Option:qos_high_limit = 0
> May 31 15:40:17 fig3 OpenSM[25599]: Loading Cached Option:qos_vlarb_high = 0:32
> May 31 15:40:17 fig3 OpenSM[25599]: Loading Cached Option:qos_vlarb_low = 1:224,2:64,3:32
> May 31 15:40:17 fig3 OpenSM[25599]: Loading Cached Option:qos_sl2vl = 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15
> May 31 15:40:17 fig3 OpenSM[25599]: Loading Cached Option:prefix_routes_file = /opt/ufm/files/conf/opensm/prefix-routes.conf
> May 31 15:40:17 fig3 OpenSM[25677]: /opt/ufm/files/log/opensm.log log file opened
> May 31 15:40:17 fig3 OpenSM[25677]: OpenSM 3.3.2_974fc3f_2cbb47c
> May 31 15:40:17 fig3 OpenSM[25677]: Entering DISCOVERING state
> May 31 15:40:17 fig3 OpenSM[25677]: Loading Cached Option:qos_max_vls = 8
> May 31 15:40:17 fig3 OpenSM[25677]: Loading Cached Option:qos_high_limit = 0
> May 31 15:40:17 fig3 OpenSM[25677]: Loading Cached Option:qos_vlarb_high = 0:32
> May 31 15:40:17 fig3 OpenSM[25677]: Loading Cached Option:qos_vlarb_low = 1:224,2:64,3:32
> May 31 15:40:17 fig3 OpenSM[25677]: Loading Cached Option:qos_sl2vl = 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15
> May 31 15:40:18 fig3 opensm[25677]: Entering MASTER state
> May 31 15:40:18 fig3 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -11
> May 31 15:40:18 fig3 opensm[25677]: SUBNET UP
> May 31 15:40:21 fig3 heartbeat: [24930]: info: all HA resource acquisition completed (standby).
> May 31 15:40:21 fig3 heartbeat: [15428]: info: Standby resource acquisition done [all].
> May 31 15:40:21 fig3 ufm_monitor: start monitoring...
> May 31 15:40:21 fig3 harc[25831]: [25842]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:21 fig3 hb_standby[25848]: [25855]: Going standby [all].
> May 31 15:40:21 fig3 heartbeat: [15428]: WARN: standby message [me] from fig3 ignored. Other side is in flux.
> May 31 15:40:22 fig3 heartbeat: [15428]: info: remote resource transition completed.
> May 31 15:40:22 fig3 heartbeat: [15428]: ERROR: Ignored standby message 'other' from fig1 in state 0
> May 31 15:40:24 fig3 dhcpd: Wrote 0 leases to leases file.
> May 31 15:40:24 fig3 dhcpd:
> May 31 15:40:24 fig3 dhcpd: No subnet declaration for ib0.8001 (1.1.1.1).
> May 31 15:40:24 fig3 dhcpd: ** Ignoring requests on ib0.8001. If this is not what
> May 31 15:40:24 fig3 dhcpd: you want, please write a subnet declaration
> May 31 15:40:24 fig3 dhcpd: in your dhcpd.conf file for the network segment
> May 31 15:40:24 fig3 dhcpd: to which interface ib0.8001 is attached. **
> May 31 15:40:24 fig3 dhcpd:
> May 31 15:40:24 fig3 dhcpd:
> May 31 15:40:24 fig3 dhcpd: No subnet declaration for eth0 (172.28.6.123).
> May 31 15:40:24 fig3 dhcpd: ** Ignoring requests on eth0. If this is not what
> May 31 15:40:24 fig3 dhcpd: you want, please write a subnet declaration
> May 31 15:40:24 fig3 dhcpd: in your dhcpd.conf file for the network segment
> May 31 15:40:24 fig3 dhcpd: to which interface eth0 is attached. **
> May 31 15:40:24 fig3 dhcpd:
> May 31 15:40:24 fig3 dhcpd: Sending on Socket/fallback/fallback-net
> May 31 15:40:27 fig3 OpenSM[25677]: Loading Cached Option:qos_max_vls = 8
> May 31 15:40:27 fig3 OpenSM[25677]: Loading Cached Option:qos_high_limit = 0
> May 31 15:40:27 fig3 OpenSM[25677]: Loading Cached Option:qos_vlarb_high = 0:32
> May 31 15:40:27 fig3 OpenSM[25677]: Loading Cached Option:qos_vlarb_low = 1:224,2:64,3:32
> May 31 15:40:27 fig3 OpenSM[25677]: Loading Cached Option:qos_sl2vl = 0,1,2,3,4,5,6,7,15,15,15,15,15,15,15,15
> May 31 15:40:43 fig3 harc[26109]: [26115]: info: Running /etc/ha.d/rc.d/hb_takeover hb_takeover
> May 31 15:40:43 fig3 hb_standby[26121]: [26127]: Going standby [all].
> May 31 15:40:43 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:40:53 fig3 heartbeat: [15428]: WARN: No reply to standby request. Standby request cancelled.
> May 31 15:41:22 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:41:32 fig3 heartbeat: [15428]: WARN: No reply to standby request. Standby request cancelled.
> May 31 15:43:21 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:43:31 fig3 heartbeat: [15428]: WARN: No reply to standby request. Standby request cancelled.
> May 31 15:45:53 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:46:03 fig3 heartbeat: [15428]: WARN: No reply to standby request. Standby request cancelled.
> May 31 15:48:24 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
> May 31 15:48:34 fig3 heartbeat: [15428]: WARN: No reply to standby request. Standby request cancelled.
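
Most of the log above is OpenSM and dhcpd noise, so the standby handshake is easier to follow once the heartbeat lines are pulled out on their own. Below is a minimal sketch in Python, not part of heartbeat, that keeps only the heartbeat daemon lines mentioning "standby". The names LINE_RE and standby_events are made up for this example, and the pattern assumes the classic syslog layout seen in these logs, with each entry on one line.

```python
import re

# Assumed layout (taken from the log lines in this thread):
#   May 31 15:40:43 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]
# LINE_RE is a hypothetical helper pattern, not something heartbeat provides.
LINE_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d) (?P<host>\S+) "
    r"heartbeat: \[\d+\]: (?P<level>\w+): (?P<msg>.*)"
)


def standby_events(lines):
    """Yield (timestamp, host, level, message) for heartbeat lines about standby."""
    for line in lines:
        # search() rather than match(), so a leading "> " quote marker is tolerated.
        m = LINE_RE.search(line)
        if m and "standby" in m.group("msg").lower():
            yield m.group("ts"), m.group("host"), m.group("level"), m.group("msg")


# A few entries copied from the fig3 log above, one entry per line.
SAMPLE = [
    "May 31 15:40:24 fig3 dhcpd: Wrote 0 leases to leases file.",
    "May 31 15:40:43 fig3 heartbeat: [15428]: info: fig3 wants to go standby [all]",
    "May 31 15:40:53 fig3 heartbeat: [15428]: WARN: No reply to standby request. Standby request cancelled.",
    "May 31 15:40:22 fig3 heartbeat: [15428]: info: remote resource transition completed.",
]

if __name__ == "__main__":
    for ts, host, level, msg in standby_events(SAMPLE):
        print(ts, host, level, msg)
```

To run it over a real file, pass an open file object instead of SAMPLE, for example standby_events(open(path)) with path pointing at the messages log of each node. Comparing the output from both nodes by timestamp shows where the two standby requests overlap.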
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
