Re: [Linux-HA] FW: Can't seem to shutdown the DC

2013-09-13 Thread Marcy.D.Cortes
Thanks Lars, the support request is 10855509041. I improved the situation a bit 
by adding a dc-deadtime of 2min, but it still won't shut down correctly (the 
other nodes still think it is online).
I also upgraded it to SP3, with no change in results.
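
For reference, assuming the setting mentioned above is Pacemaker's dc-deadtime cluster option, the change would appear in the crm configuration roughly as follows. This is only a sketch: the other properties are copied from the configuration posted earlier in this thread, and the position of the new line is illustrative.

```
# dc-deadtime=2min is an assumption; the exact line was not posted
property $id=cib-bootstrap-options \
    dc-version=1.1.7-77eeb099a504ceda05d648ed161ef8b1582c7daf \
    cluster-infrastructure=openais \
    expected-quorum-votes=4 \
    stonith-enabled=true \
    stonith-timeout=72s \
    no-quorum-policy=freeze \
    dc-deadtime=2min
```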

Marcy

-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Lars Marowsky-Bree
Sent: Thursday, September 12, 2013 12:15 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] FW: Can't seem to shutdown the DC

On 2013-09-12T18:14:04, marcy.d.cor...@wellsfargo.com wrote:

 
> Hello list,
>
> Using SUSE SLES 11 SP2.
>
> I have 4 servers in a cluster running cLVM + OCFS2.
>
> If I try to shut down the one that is the DC using openais stop, strange
> things happen, resulting in a really messed up cluster.
> On one occasion, another server decided he was the DC and the other 2 still
> thought the original DC was online and still the DC.
> Often it results in fencing and lots of reboots.
> If I try to put the DC into standby mode, I get this

Your configuration for Pacemaker looks OK. My expectation would be that you are 
suffering some networking issue.

Can you kindly open a support request with Novell Technical Services?


Regards,
Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 
21284 (AG Nürnberg)
Experience is the name everyone gives to their mistakes. -- Oscar Wilde

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] FW: Can't seem to shutdown the DC

2013-09-12 Thread Marcy.D.Cortes

Hello list,

Using SUSE SLES 11 SP2.

I have 4 servers in a cluster running cLVM + OCFS2.

If I try to shut down the one that is the DC using openais stop, strange 
things happen, resulting in a really messed up cluster.
On one occasion, another server decided he was the DC and the other 2 still 
thought the original DC was online and still the DC.
Often it results in fencing and lots of reboots.
If I try to put the DC into standby mode, I get this:

cpzea01a0017:~ # crm node standby cpzea01a0017
Error setting standby=on (section=nodes, set=null): Remote node did not respond
Error performing operation: Remote node did not respond

Is there some special way to take it down?

node cpzea01a0015 \
    attributes standby=off
node cpzea01a0017 \
    attributes standby=off
node cpzea02a0015 \
    attributes standby=off
node cpzea02a0017 \
    attributes standby=off
primitive clvm ocf:lvm2:clvmd \
    params daemon_timeout=30
primitive dlm ocf:pacemaker:controld \
    op monitor interval=60 timeout=60
primitive o2cb ocf:ocfs2:o2cb \
    op monitor interval=60 timeout=60
primitive ocfs2-1 ocf:heartbeat:Filesystem \
    params device=/dev/sharedg/lvol1 directory=/app/data/index fstype=ocfs2 options=acl \
    op monitor interval=20 timeout=40
primitive stonith_sbd stonith:external/sbd \
    meta target-role=Started \
    op monitor interval=15 timeout=15 start-delay=15 \
    params sbd_device=/dev/disk/by-path/ccw-0.0.7000-part1
primitive vg1 ocf:heartbeat:LVM \
    params volgrpname=sharedg \
    op monitor interval=60 timeout=60
group base-group dlm o2cb clvm vg1 ocfs2-1
clone base-clone base-group \
    meta interleave=true target-role=Started
property $id=cib-bootstrap-options \
    dc-version=1.1.7-77eeb099a504ceda05d648ed161ef8b1582c7daf \
    cluster-infrastructure=openais \
    expected-quorum-votes=4 \
    stonith-enabled=true \
    stonith-timeout=72s \
    no-quorum-policy=freeze


Marcy








Re: [Linux-HA] FW: Can't seem to shutdown the DC

2013-09-12 Thread Lars Marowsky-Bree
On 2013-09-12T18:14:04, marcy.d.cor...@wellsfargo.com wrote:

 
> Hello list,
>
> Using SUSE SLES 11 SP2.
>
> I have 4 servers in a cluster running cLVM + OCFS2.
>
> If I try to shut down the one that is the DC using openais stop, strange
> things happen, resulting in a really messed up cluster.
> On one occasion, another server decided he was the DC and the other 2 still
> thought the original DC was online and still the DC.
> Often it results in fencing and lots of reboots.
> If I try to put the DC into standby mode, I get this

Your configuration for Pacemaker looks OK. My expectation would be that
you are suffering some networking issue.

Can you kindly open a support request with Novell Technical Services?


Regards,
Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 
21284 (AG Nürnberg)
Experience is the name everyone gives to their mistakes. -- Oscar Wilde
