Re: [Linux-HA] Questions about Pacemaker, Corosync, DRBD and MySQL

2014-01-17 Thread Marcy.D.Cortes
Sorry, I can't speak to RH since we run SLES.  There have been some very recent 
fixes for it on s390x.
I can't remember which component had the issues - I think Pacemaker, not 
Corosync - but basically it couldn't shut down the DC without the 
cluster getting totally confused. It was using little endian in places where it 
needed to be big!  SUSE put out those fixes on Dec 27.
You should check that you don't have a problem shutting down the DC node and 
contact RH if you do.
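
If it helps, one quick way to check (a sketch - crm_mon's output format varies 
by version, and the service names on RHEL differ from the SLES ones shown):

    # find out which node is the current DC
    crm_mon -1 | grep "Current DC"
    # on that node, stop the cluster stack, then watch the remaining
    # nodes with crm_mon to see whether they notice it leaving cleanly
    rcopenais stop    # SLES; on RHEL use the corosync/pacemaker init scripts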

Hope that saves you from the same fun :)


Marcy


-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of mike
Sent: Friday, January 17, 2014 1:58 PM
To: linux-ha@lists.linux-ha.org
Subject: [Linux-HA] Questions about Pacemaker, Corosync, DRBD and MySQL

Hi guys,

I've got a running HA Cluster with shared storage on RHEL 5 on s390x. It works 
fine but I've had to do some VM modifications underneath the covers to 
adequately implement a safe STONITH package due to RACF policies and so on.

I'm starting to experiment with RHEL 6, and I think a great solution for me 
would be to implement DRBD with Pacemaker and Corosync, since I believe it would 
free me from the VM modifications that I'd really like to avoid. To be honest, 
I'm having a bit of a difficult time understanding how DRBD works. All of the 
examples show how to set up the cluster and then create a partition on both 
nodes for DRBD, but I'm lacking a deep understanding of how this actually works, 
so I'm a bit confused.

I currently have the cluster built and the MySQL resource enabled and working. 
Regarding the DRBD partition that I create on each node - will this partition 
hold a database? For instance, I have 3 databases created
-  wiki, drupal and testDB. Should I create a separate partition on each node 
for each one of these databases and add them to the DRBD config? In effect 
I would then have 3 separate filesystems: /var/lib/wiki, /var/lib/drupal and 
/var/lib/testDB. Or is this DRBD partition a sort of messaging partition that 
sits outside the database directories themselves and handles the transaction 
messaging to the actual databases?
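
(For reference, a minimal drbd.res of the kind those examples use. DRBD 
replicates a whole block device rather than individual databases or 
transactions, so a single replicated device backing MySQL's entire datadir is 
the usual layout. Device and host names here are hypothetical:

    resource r0 {
            device    /dev/drbd0;
            disk      /dev/vg0/mysql;
            meta-disk internal;
            on nodea {
                    address 10.0.0.1:7789;
            }
            on nodeb {
                    address 10.0.0.2:7789;
            }
    }

A filesystem made on /dev/drbd0 and mounted at /var/lib/mysql on whichever node 
is primary would then carry all three databases on the one replicated 
filesystem.)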

Know what I mean?

-tks
Mike


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] FW: Can't seem to shutdown the DC

2013-09-13 Thread Marcy.D.Cortes
Thanks Lars,  support request is 10855509041.   I improved the situation a bit 
by adding a dc-deadline of 2min, but it still won't shut down correctly (the 
other nodes still think it is online).
Also upgraded it to SP3 with no change in results.

Marcy

-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Lars Marowsky-Bree
Sent: Thursday, September 12, 2013 12:15 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] FW: Can't seem to shutdown the DC

On 2013-09-12T18:14:04, marcy.d.cor...@wellsfargo.com wrote:

 
 Hello list,
 
 Using SUSE SLES 11 SP2.
 
 I have 4 servers in a cluster running cLVM + OCFS2.
 
 If I try to shut down the one that is the DC using openais stop, strange 
 things happen, resulting in a really messed-up cluster.
 On one occasion, another server decided it was the DC while the other 2 still 
 thought the original DC was online and still the DC.
 Often it results in fencing and lots of reboots.
 If I try to put the DC into standby mode, I get this:

Your configuration for Pacemaker looks OK. My expectation would be that you are 
suffering from some networking issue.

Can you kindly open a support request with Novell Technical Services?


Regards,
Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, 
HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



[Linux-HA] FW: Can't seem to shutdown the DC

2013-09-12 Thread Marcy.D.Cortes

Hello list,

Using SUSE SLES 11 SP2.

I have 4 servers in a cluster running cLVM + OCFS2.

If I try to shut down the one that is the DC using openais stop, strange 
things happen, resulting in a really messed-up cluster.
On one occasion, another server decided it was the DC while the other 2 still 
thought the original DC was online and still the DC.
Often it results in fencing and lots of reboots.
If I try to put the DC into standby mode, I get this:

cpzea01a0017:~ # crm node standby cpzea01a0017
Error setting standby=on (section=nodes, set=null): Remote node did not respond
Error performing operation: Remote node did not respond

Is there some special way to take it down?
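
For reference, the low-level equivalent of that crm command (a sketch - option 
spellings vary across Pacemaker versions, and since it talks to the same DC it 
may well fail the same way):

    crm_attribute --node cpzea01a0017 --name standby --value on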

node cpzea01a0015 \
        attributes standby=off
node cpzea01a0017 \
        attributes standby=off
node cpzea02a0015 \
        attributes standby=off
node cpzea02a0017 \
        attributes standby=off
primitive clvm ocf:lvm2:clvmd \
        params daemon_timeout=30
primitive dlm ocf:pacemaker:controld \
        op monitor interval=60 timeout=60
primitive o2cb ocf:ocfs2:o2cb \
        op monitor interval=60 timeout=60
primitive ocfs2-1 ocf:heartbeat:Filesystem \
        params device=/dev/sharedg/lvol1 directory=/app/data/index fstype=ocfs2 options=acl \
        op monitor interval=20 timeout=40
primitive stonith_sbd stonith:external/sbd \
        meta target-role=Started \
        op monitor interval=15 timeout=15 start-delay=15 \
        params sbd_device=/dev/disk/by-path/ccw-0.0.7000-part1
primitive vg1 ocf:heartbeat:LVM \
        params volgrpname=sharedg \
        op monitor interval=60 timeout=60
group base-group dlm o2cb clvm vg1 ocfs2-1
clone base-clone base-group \
        meta interleave=true target-role=Started
property $id=cib-bootstrap-options \
        dc-version=1.1.7-77eeb099a504ceda05d648ed161ef8b1582c7daf \
        cluster-infrastructure=openais \
        expected-quorum-votes=4 \
        stonith-enabled=true \
        stonith-timeout=72s \
        no-quorum-policy=freeze


Marcy





