Re: [Linux-HA] Problem with connectivity loss

Laurent Yin Thu, 04 Sep 2008 13:56:09 -0700

Hi Chase.

I use Ubuntu Server 8.04.

Here's my ha.cf:

############
###### HA.CF
############

#/etc/ha.d/ha.cf
#
bcast eth1

baud    19200
### UNCOMMENT AND PUT THE OTHER NODE'S IP HERE ###
#ucast eth0 PEER_IP
#####


debugfile /var/log/ha.debug
logfile    /var/log/ha.log
logfacility    local0
crm yes
#time between heart beats
keepalive    5

deadtime    15

warntime    6

initdead    20

#Name must be the one returned by 'uname -n'
node    machine1
node    machine2

### UNCOMMENT AND PUT THE GATEWAY IP TO DETECT CONNECTIVITY LOSS
#ping    GATEWAY_IP
#####

# to detect connectivity loss
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd


# no auto failback. I'm not sure it does anything on v2, you'll have to set
# resource stickiness
auto_failback off

############
###### HA.CF
############


And here are the files I use to configure my CIB. I add each one with
"cibadmin -o resources -C -x file_name" (or "-o constraints" for the
constraint files).

#################
###### DRBDRESOURCE
#################

<master_slave id="ms-drbd0">
  <meta_attributes id="ma-ms-drbd0">
     <attributes>
       <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
       <nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
       <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
       <nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
       <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
       <nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
       <nvpair id="ma-ms-drbd0-7" name="target_role" value="#default"/>
    </attributes>
  </meta_attributes>
  <primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
    <instance_attributes id="ia-drbd0">
      <attributes>
        <nvpair id="ia-drbd0-1" name="drbd_resource" value="mysql"/>
      </attributes>
    </instance_attributes>
    <operations>
      <op id="op-drbd0-1" name="monitor" interval="20s" timeout="10s" role="Master"/>
      <op id="op-drbd0-2" name="monitor" interval="45s" timeout="10s" role="Slave"/>
    </operations>
  </primitive>
</master_slave>

#################
###### DRBDRESOURCE
#################


#################
###### MYSQLGROUP
#################

<group id="mysqlgroup">
 <meta_attributes id="ma-mysqlgroup">
  <attributes>
   <nvpair name="resource_stickiness" id="ma-mysqlgroup-1" value="9999"/>
   <nvpair id="ma-mysqlgroup-2" name="target_role" value="started"/>
  </attributes>
 </meta_attributes>
 <primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
  <instance_attributes id="ia-fs0">
   <attributes>
    <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
    <nvpair id="ia-fs0-2" name="directory" value="/replicated"/>
    <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
   </attributes>
  </instance_attributes>
  <operations>
   <op id="op-filesystem-monitor" interval="20s" name="monitor" timeout="10s"/>
  </operations>
</primitive>
<primitive class="lsb" type="mysql" id="mysqlserver">
 <operations>
  <op id="op-monitor-mysql" name="monitor" interval="10s" timeout="5s"/>
 </operations>
</primitive>
<primitive id="mysql-vip" class="ocf" type="IPaddr2" provider="heartbeat">
 <instance_attributes id="ia-mysql-vip">
  <attributes>
   <nvpair id="ia-mysql-vip-ip" name="ip" value="192.168.21.10"/>
   <nvpair id="ia-virtual-ip-2" name="broadcast" value="192.168.203.255"/>
   <nvpair id="ia-mysql-vip-nic" name="nic" value="eth0"/>
   <nvpair id="ia-mysql-vip-netmask" name="cidr_netmask" value="255.255.255.0"/>
  </attributes>
 </instance_attributes>
 <operations>
  <op id="op-monitor-vip" name="monitor" interval="10s" timeout="3s"/>
 </operations>
</primitive>
<primitive id="R_MailTo" class="ocf" type="MailTo" provider="heartbeat">
 <instance_attributes>
  <attributes>
   <nvpair id="44b0bd1a-3795-4a20-aaab-58df706bc39b" name="email" value="[EMAIL PROTECTED]"/>
   <nvpair id="4d763860-6a5d-425b-8a13-986f4ede82dc" name="subject" value="My_Mysql_Server"/>
  </attributes>
 </instance_attributes>
</primitive>
</group>

#################
###### MYSQLGROUP
#################
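
A note on R_MailTo: in the earlier part of this thread (quoted below), its
stop failed when connectivity was lost, and Andrew suggested setting
on_fail=ignore on the resource's stop action. A minimal sketch of what that
might look like follows. It is not part of the group above; the op id and the
timeout value are made up for illustration, and the attribute name should be
checked against the DTD of your Heartbeat version.

```xml
<primitive id="R_MailTo" class="ocf" type="MailTo" provider="heartbeat">
 <operations>
  <!-- Hypothetical op id and timeout. on_fail="ignore" is the setting
       suggested in the thread: a failed stop is treated as if it had
       not failed. -->
  <op id="op-mailto-stop" name="stop" timeout="30s" on_fail="ignore"/>
 </operations>
 <!-- instance_attributes unchanged from the group definition above -->
</primitive>
```

Ignoring a failed stop means the cluster carries on stopping the rest of the
group even if the mail could not be sent, which is only reasonable for a
resource like MailTo that holds no data.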


#################
###### COLOC_CONST
#################

<!-- Mount file system only on node which is master -->
<rsc_colocation id="start_mysql_on_drbd_master" to="ms-drbd0"
to_role="master" from="mysqlgroup" score="INFINITY"/>

#################
###### COLOC_CONST
#################


#################
###### ORDER_CONST
#################

<rsc_order id="drbd_before_mysql_group" from="mysqlgroup" action="start"
type="after" to="ms-drbd0" to_action="promote"/>

#################
###### ORDER_CONST
#################


#################
# CONNECTIVITY_CONST
#################

<rsc_location id="my_resource:connected" rsc="mysqlgroup">
  <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
    <expression id="my_resource:connected:expr:undefined"
      attribute="pingd" operation="not_defined"/>
    <expression id="my_resource:connected:expr:zero"
      attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>

#################
# CONNECTIVITY_CONST
#################


Putting the "ping GATEWAY_IP" and "respawn ..." lines in your ha.cf and adding
the CONNECTIVITY_CONST constraint to your CIB should do the trick. I just
followed the tutorial linked in my first post step by step and it worked.
I used to ping www.google.com, but that produced warnings in ha.log or
ha.debug, so I switched to the gateway IP: it's the nodes' only way to reach
the outside world in any case.
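
In the quoted thread below, Andrew also suggested a similar pingd constraint
for DRBD. None of the files above contains one, so the following is only a
sketch of how it might look, modelled on CONNECTIVITY_CONST. The ids are
invented, and role="master" on the rule (so that only the master role is kept
off a disconnected node) is an assumption to check against the DTD of your
Heartbeat version.

```xml
<!-- Sketch only: ids are invented; role="master" is an assumption -->
<rsc_location id="ms-drbd0:connected" rsc="ms-drbd0">
  <rule id="ms-drbd0:connected:rule" role="master" score="-INFINITY" boolean_op="or">
    <expression id="ms-drbd0:connected:expr:undefined"
      attribute="pingd" operation="not_defined"/>
    <expression id="ms-drbd0:connected:expr:zero"
      attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>
```

It would be loaded like the other constraints, with
"cibadmin -o constraints -C -x file_name".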

I managed to get a 35-second failover time when unplugging the cable.

I don't know what version of heartbeat I'm using; I'll check tomorrow and
let you know.

Laurent

On Thu, Sep 4, 2008 at 5:21 PM, Chase Simms <[EMAIL PROTECTED]> wrote:

> Laurent,
>
> Would you mind sharing your ha.cf and your cib.xml.  I've been fighting
> the same problem for weeks.  I was about to give up when I found your post.
>  Everything works for me except network failover.  I've tried running using
> a constraint to run Pingd with MySQL and used the clone method from the
> tutorials.  I would love to see a config I know works.  What flavor of Linux
> are you using?  I'm using CentOS and the heartbeat from their repositories.
>
> Thank you,
> Chase
>
> >>> "Laurent Yin" <[EMAIL PROTECTED]> 8/27/2008 10:22 AM >>>
> thanks!
> I will try the "on fail ignore". The other issue was not really an issue
> because it finally worked fine when I decided to completely erase the CIB
> and reconfigure constraints and resources without adding the mail resource.
> Maybe it was a problem due to the fact that I used "crm_resource" to remove
> this one resource specifically, I don't know...
>
> I had "solved" the mail problem by changing the MailTo RA: it launches the
> mail in another process so it doesn't have to wait for the timeout, and
> masks the error.
> The advantage is that I don't have to wait for the timeout - which was quite
> long if I remember well - before it goes on releasing resources, so
> failover runs faster.
>
> The drawback is that I have to change the MailTo RA...
>
> Is there any way to emulate this behaviour by setting on_fail=ignore?
>
> On Mon, Aug 25, 2008 at 12:21 PM, Andrew Beekhof <[EMAIL PROTECTED]>
> wrote:
>
> > On Tue, Aug 12, 2008 at 12:30, Laurent Yin <[EMAIL PROTECTED]>
> > wrote:
> > > Hello,
> > >
> > > I set up a DRBD-Mysql cluster with a master slave set DRBD and a mysql
> > > resource group containing :
> > > -a Filesystem
> > > -a mysql (5.1)
> > > -a virtual IP Address (IPaddr2)
> > > -a MailTo RA
> > >
> > > I have two constraints :
> > > - one colocation constraint which says DRBD must be master on the
> > > machine running mysqlgroup
> > > - one ordering constraint which says mysqlgroup must be launched after
> > > DRBD
> > >
> > > It works fine and fails over smoothly on machine poweroff and the like.
> > >
> > > Now I would like it to be tolerant of network loss, e.g. if I unplug the
> > > network cable between the master node and the router, I want it to detect
> > > that connectivity is lost.
> > > For that purpose, I added two ping nodes to my ha.cf and a respawn with
> > > pingd.
> > >
> > > ## in HA.CF
> > > ping    www.google.com
> > > ping    www.yahoo.com
> > >
> > > respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd
> > > ## END OF in HA.CF
> > >
> > > I also added a constraint as done on the site
> > > http://www.linux-ha.org/pingd in the section "Only Run my_resource on
> > > Nodes With Access to at Least One Ping Node".
> > >
> > > ## CONSTRAINT ##
> > > <rsc_location id="my_resource:connected" rsc="mysqlgroup">
> > >  <rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
> > >    <expression id="my_resource:connected:expr:undefined"
> > >      attribute="pingd" operation="not_defined"/>
> > >    <expression id="my_resource:connected:expr:zero"
> > >      attribute="pingd" operation="lte" value="0"/>
> > >  </rule>
> > > </rsc_location>
> > > ## END OF CONSTRAINT ##
> > >
> > >
> > > I have two problems with this configuration.
> > > 1 ) When I unplug the network cable of the machine running mysql, after
> > > detecting that there is no connectivity, it tries to stop the group,
> > > beginning with my last resource which is MailTo. But, as there is no
> > > connectivity, it fails to stop, and therefore the whole group remains
> > > running. What can I do about this?
> >
> > fix the RA or set on_fail=ignore for the resource's stop action
> >
> > >
> > > 2 ) When I remove the MailTo RA (just for testing purposes, to see what
> > > happens, but this is not an acceptable solution), it manages to stop the
> > > mysqlgroup, but it doesn't get started on the other node. I assume that
> > > it is because DRBD is still master on this node. How can I tell Heartbeat
> > > to switch master/slave in DRBD when connectivity is lost?
> > > Or is there another solution with constraints maybe?
> >
> > create a similar pingd constraint for drbd as you used for the group
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
>
>
>
> --
> This is the end ... beautiful friend ...
>
> This is the end .... my only friend, the end ...
>
>
>
>



-- 
This is the end ... beautiful friend ...

This is the end .... my only friend, the end ...
