Re: [Pacemaker] Two node DRBD cluster will not automatically failover to the secondary

2010-02-03 Thread Tom Pride
Hi Shravan,

Thank you very much for your reply.  I know it was quite a while ago that I
posted my question to the mailing list, but I've been working on other
things and have only just had the chance to come back to this.

You say that I need to set up stonith resources along with setting
stonith-enabled = true.  Well, I know how to change the stonith-enabled
setting, but I have no clue how to go about setting up the appropriate
stonith resources to prevent DRBD from getting into a split-brain
situation.  The documentation provided on the DRBD website about setting up
a 2-node cluster with Pacemaker doesn't tell you to enable stonith or
configure stonith resources.  It does cover the resource fencing options
within /etc/drbd.conf, which I have configured as follows:

resource r0 {
  disk {
    fencing resource-only;
  }
  handlers {
    fence-peer /usr/lib/drbd/crm-fence-peer.sh;
    after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;
  }
}


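(For what it's worth, with fencing resource-only the crm-fence-peer.sh handler
does not power anything off: when the replication link is lost it just adds a
temporary location constraint to the CIB that pins the Master role to the node
that still has up-to-date data, roughly of this shape in crm syntax; the
constraint id and node name here are only illustrative:

location drbd-fence-by-handler-ms_drbd_activemq ms_drbd_activemq \
        rule $role="Master" -inf: #uname ne mq001.back.live.cwwtf.local

crm-unfence-peer.sh removes that constraint again after resync.  That is
resource-level fencing only; it does not cover a node that has been powered
off outright, which is what stonith is for.)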
I've searched the internet high and low for example pacemaker configs that
show you how to configure stonith resources for DRBD, but I can't find
anything useful.

This howto that I found
(http://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1)
spells out how to configure a cluster and even states: "STONITH is disabled
in this configuration though it is highly-recommended in any production
environment to eliminate the risk of divergent data."  But, infuriatingly,
it doesn't tell you how.

Could you please give me some pointers or some helpful examples or perhaps
point me to someone or something that can give me a hand in this area?

Many Thanks
Tom
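
For illustration, a minimal sketch of what such stonith resources could look
like in crm shell, assuming the servers have IPMI-capable management boards
and using the external/ipmi plugin from cluster-glue (the IPMI addresses and
credentials below are placeholders, and the right plugin depends on the
hardware; external/riloe, external/drac5 and others exist as well):

primitive stonith-mq001 stonith:external/ipmi \
        params hostname=mq001.back.live.cwwtf.local ipaddr=10.0.0.1 \
               userid=admin passwd=secret interface=lan \
        op monitor interval=60s
primitive stonith-mq002 stonith:external/ipmi \
        params hostname=mq002.back.live.cwwtf.local ipaddr=10.0.0.2 \
               userid=admin passwd=secret interface=lan \
        op monitor interval=60s
location l-stonith-mq001 stonith-mq001 -inf: mq001.back.live.cwwtf.local
location l-stonith-mq002 stonith-mq002 -inf: mq002.back.live.cwwtf.local
property stonith-enabled=true

The location constraints keep each fencing device off the node it is meant to
shoot, and stonith-enabled=true lets Pacemaker fence a dead master instead of
waiting for it, which is what allows the survivor to promote DRBD after a
hard power-off.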


On Thu, Dec 17, 2009 at 2:14 PM, Shravan Mishra shravan.mis...@gmail.com wrote:

 Hi,

 For stateful resources like DRBD you will have to set up stonith resources
 for them to function properly, or at all.
 stonith-enabled is true by default.

 Sincerely
 Shravan

 On Thu, Dec 17, 2009 at 6:29 AM, Tom Pride tom.pr...@gmail.com wrote:

 Hi there,

 I have set up a two-node DRBD cluster with Pacemaker using the instructions
 provided on the drbd.org website:
 http://www.drbd.org/users-guide-emb/ch-pacemaker.html  The cluster works
 perfectly and I can migrate the resources back and forth between the two
 nodes without a problem.  However, if I try simulating a complete server
 failure of the master node by powering off the server, Pacemaker does not
 then automatically bring up the remaining node as the master.  I need some
 help to find out what configuration changes I need to make in order for my
 cluster to fail over automatically.

 The cluster is built on 2 Redhat EL 5.3 servers running the following
 software versions:
 drbd-8.3.6-1
 pacemaker-1.0.5-4.1
 openais-0.80.5-15.1

 Below I have listed the drbd.conf, openais.conf and the output of crm
 configuration show.  If someone could take a look at these for me and
 provide any suggestions/modifications I would be most grateful.

 Thanks,
 Tom

 /etc/drbd.conf

 global {
   usage-count no;
 }
 common {
   protocol C;
 }
 resource r0 {
   disk {
 fencing resource-only;
   }
   handlers {
 fence-peer /usr/lib/drbd/crm-fence-peer.sh;
 after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;
   }
   syncer {
 rate 40M;
   }
   on mq001.back.live.cwwtf.local {
 device    /dev/drbd1;
 disk  /dev/cciss/c0d0p1;
 address   172.23.8.69:7789;
 meta-disk internal;
   }
   on mq002.back.live.cwwtf.local {
 device    /dev/drbd1;
 disk  /dev/cciss/c0d0p1;
 address   172.23.8.70:7789;
 meta-disk internal;
   }
 }


 r...@mq001:~# cat /etc/ais/openais.conf
 totem {
   version: 2
   token: 3000
   token_retransmits_before_loss_const: 10
   join: 60
   consensus: 1500
   vsftype: none
   max_messages: 20
   clear_node_high_bit: yes
   secauth: on
   threads: 0
   rrp_mode: passive
   interface {
 ringnumber: 0
 bindnetaddr: 172.59.60.0
 mcastaddr: 239.94.1.1
 mcastport: 5405
   }
   interface {
 ringnumber: 1
 bindnetaddr: 172.23.8.0
 mcastaddr: 239.94.2.1
 mcastport: 5405
   }
 }
 logging {
   to_stderr: yes
   debug: on
   timestamp: on
   to_file: no
   to_syslog: yes
   syslog_facility: daemon
 }
 amf {
   mode: disabled
 }
 service {
   ver:   0
   name:  pacemaker
   use_mgmtd: yes
 }
 aisexec {
   user:   root
   group:  root
 }


 r...@mq001:~# crm configure show
 node mq001.back.live.cwwtf.local
 node mq002.back.live.cwwtf.local
 primitive activemq-emp lsb:bbc-activemq-emp
 primitive activemq-forge-services lsb:bbc-activemq-forge-services
 primitive activemq-social lsb:activemq-social
 primitive drbd_activemq ocf:linbit:drbd \
 params drbd_resource=r0 \
 op monitor interval=15s
 primitive fs_activemq ocf:heartbeat:Filesystem \
 params device=/dev/drbd1 directory=/drbd fstype=ext3
 primitive ip_activemq ocf:heartbeat:IPaddr2 \
 params ip=172.23.8.71 nic=eth0
 group activemq fs_activemq ip_activemq activemq-forge-services
 

Re: [Pacemaker] Two node DRBD cluster will not automatically failover to the secondary

2009-12-17 Thread Shravan Mishra
Hi,

For stateful resources like DRBD you will have to set up stonith resources
for them to function properly, or at all.
stonith-enabled is true by default.
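
If in doubt about what the cluster currently has, something along these lines
will show and set it (a sketch; crm_verify will also warn when stonith is
enabled but no stonith resources have been defined):

crm configure show | grep stonith
crm configure property stonith-enabled=true
crm_verify -L -V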

Sincerely
Shravan

On Thu, Dec 17, 2009 at 6:29 AM, Tom Pride tom.pr...@gmail.com wrote:

 Hi there,

 I have set up a two-node DRBD cluster with Pacemaker using the instructions
 provided on the drbd.org website:
 http://www.drbd.org/users-guide-emb/ch-pacemaker.html  The cluster works
 perfectly and I can migrate the resources back and forth between the two
 nodes without a problem.  However, if I try simulating a complete server
 failure of the master node by powering off the server, Pacemaker does not
 then automatically bring up the remaining node as the master.  I need some
 help to find out what configuration changes I need to make in order for my
 cluster to fail over automatically.

 The cluster is built on 2 Redhat EL 5.3 servers running the following
 software versions:
 drbd-8.3.6-1
 pacemaker-1.0.5-4.1
 openais-0.80.5-15.1

 Below I have listed the drbd.conf, openais.conf and the output of crm
 configuration show.  If someone could take a look at these for me and
 provide any suggestions/modifications I would be most grateful.

 Thanks,
 Tom

 /etc/drbd.conf

 global {
   usage-count no;
 }
 common {
   protocol C;
 }
 resource r0 {
   disk {
 fencing resource-only;
   }
   handlers {
 fence-peer /usr/lib/drbd/crm-fence-peer.sh;
 after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;
   }
   syncer {
 rate 40M;
   }
   on mq001.back.live.cwwtf.local {
 device    /dev/drbd1;
 disk  /dev/cciss/c0d0p1;
 address   172.23.8.69:7789;
 meta-disk internal;
   }
   on mq002.back.live.cwwtf.local {
 device    /dev/drbd1;
 disk  /dev/cciss/c0d0p1;
 address   172.23.8.70:7789;
 meta-disk internal;
   }
 }


 r...@mq001:~# cat /etc/ais/openais.conf
 totem {
   version: 2
   token: 3000
   token_retransmits_before_loss_const: 10
   join: 60
   consensus: 1500
   vsftype: none
   max_messages: 20
   clear_node_high_bit: yes
   secauth: on
   threads: 0
   rrp_mode: passive
   interface {
 ringnumber: 0
 bindnetaddr: 172.59.60.0
 mcastaddr: 239.94.1.1
 mcastport: 5405
   }
   interface {
 ringnumber: 1
 bindnetaddr: 172.23.8.0
 mcastaddr: 239.94.2.1
 mcastport: 5405
   }
 }
 logging {
   to_stderr: yes
   debug: on
   timestamp: on
   to_file: no
   to_syslog: yes
   syslog_facility: daemon
 }
 amf {
   mode: disabled
 }
 service {
   ver:   0
   name:  pacemaker
   use_mgmtd: yes
 }
 aisexec {
   user:   root
   group:  root
 }


 r...@mq001:~# crm configure show
 node mq001.back.live.cwwtf.local
 node mq002.back.live.cwwtf.local
 primitive activemq-emp lsb:bbc-activemq-emp
 primitive activemq-forge-services lsb:bbc-activemq-forge-services
 primitive activemq-social lsb:activemq-social
 primitive drbd_activemq ocf:linbit:drbd \
 params drbd_resource=r0 \
 op monitor interval=15s
 primitive fs_activemq ocf:heartbeat:Filesystem \
 params device=/dev/drbd1 directory=/drbd fstype=ext3
 primitive ip_activemq ocf:heartbeat:IPaddr2 \
 params ip=172.23.8.71 nic=eth0
 group activemq fs_activemq ip_activemq activemq-forge-services activemq-emp activemq-social
 ms ms_drbd_activemq drbd_activemq \
 meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
 colocation activemq_on_drbd inf: activemq ms_drbd_activemq:Master
 order activemq_after_drbd inf: ms_drbd_activemq:promote activemq:start
 property $id=cib-bootstrap-options \
 dc-version=1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 \
 cluster-infrastructure=openais \
 expected-quorum-votes=2 \
 no-quorum-policy=ignore \
 last-lrm-refresh=1260809203



Re: [Pacemaker] Two node DRBD cluster will not automatically failover to the secondary

2009-12-17 Thread Adam Gandelman
Tom Pride wrote:
 Hi there,

 I have set up a two-node DRBD cluster with Pacemaker using the
 instructions provided on the drbd.org website:
 http://www.drbd.org/users-guide-emb/ch-pacemaker.html  The cluster
 works perfectly and I can migrate the resources back and forth between
 the two nodes without a problem.  However, if I try simulating a
 complete server failure of the master node by powering off the server,
 Pacemaker does not then automatically bring up the remaining node as
 the master.  I need some help to find out what configuration changes I
 need to make in order for my cluster to fail over automatically.
Your config looks okay at first glance.  To test, try disabling the second
interface in openais.conf and running with only one link, and see if that
changes the behavior.  If no luck, can you post the log files?
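
If it helps, this is roughly what the totem section would look like cut down
to a single ring (assuming the 172.23.8.0 network is the one to keep; with a
single interface, rrp_mode can simply be dropped or set to none):

totem {
  version: 2
  token: 3000
  token_retransmits_before_loss_const: 10
  join: 60
  consensus: 1500
  vsftype: none
  max_messages: 20
  clear_node_high_bit: yes
  secauth: on
  threads: 0
  interface {
    ringnumber: 0
    bindnetaddr: 172.23.8.0
    mcastaddr: 239.94.2.1
    mcastport: 5405
  }
}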

-- 
: Adam Gandelman
: LINBIT | Your Way to High Availability
: Telephone: 503-573-1262 ext. 203
: Sales: 1-877-4-LINBIT / 1-877-454-6248
:
: 7959 SW Cirrus Dr.
: Beaverton, OR 97008
:
: http://www.linbit.com 

