Please check whether your DRBD is configured to call the fence-handler:
https://drbd.linbit.com/users-guide/s-pacemaker-fencing.html
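A minimal sketch of what that looks like in drbd.conf (the resource name is a placeholder; the handler scripts are the ones shipped with drbd-utils):

resource r0 {
  disk {
    fencing resource-only;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}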
2015-10-08 17:16 GMT+02:00 priyanka :
> Hi,
>
> We are trying to build a HA setup for our servers using DRBD + Corosync +
> pacemaker stack.
>
> Attached is the configuration f
Have you configured stonith?
2015-11-16 14:43 GMT+01:00 Richard Korsten :
> Hello Cluster guru's.
>
> I'm having a bit of trouble with a cluster of ours. After an outage of 1
> node it went into a split brain situation where both nodes aren't talking to
> each other. Both say the other node is offl
I'm not sure, how can I check it?
>
> Greetings Richard
>
> On Mon 16 Nov 2015 at 14:58, emmanuel segura wrote:
>>
>> you configured the stonith?
>>
>> 2015-11-16 14:43 GMT+01:00 Richard Korsten :
>> > Hello Cluster guru's.
>> >
>
On Mon 16 Nov 2015 at 15:09, emmanuel segura wrote:
>>
>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08.html
>> and
>> https://github.com/ClusterLabs/pacemaker/blob/master/doc/pcs-crmsh-quick-ref.md
>> anyway you can use pcs c
Using a group is simpler.
Example:
group mygroup resource1 resource2 resource3
order o_drbd_before_services inf: ms_drbd_export:promote mygroup:start
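You will usually also want a colocation so the group follows the DRBD master, e.g. (a sketch reusing the names above):
colocation c_services_on_drbd inf: mygroup ms_drbd_export:Master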
2015-11-20 15:45 GMT+01:00 Andrei Borzenkov :
> 20.11.2015 16:38, haseni...@gmx.de wrote:
>
>> Hi,
>> I want to start several services after the
I think the XML of your VM needs to be available on both nodes, but you
are using a failover resource, Filesystem_CDrive1; Pacemaker runs the
monitor on both nodes to check whether a resource is running on
multiple nodes.
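A minimal sketch of copying the definition over (hypothetical domain name 'myvm' and path; adjust to your setup):
virsh dumpxml myvm > /etc/pacemaker/myvm.xml
scp /etc/pacemaker/myvm.xml other-node:/etc/pacemaker/
Then point the VirtualDomain resource's config parameter at that file on both nodes.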
2015-12-04 18:06 GMT+01:00 Ken Gaillot :
> On 12/04/2015 10:22 AM, Klechomir wrote:
>>
om both nodes.
> The live migration is always successful.
>
>
>
> On 4.12.2015 19:30, emmanuel segura wrote:
>>
>> I think the xml of your vm need to available on both nodes, but your
>> using a failover resource Filesystem_CDrive1, because pacemaker
>> monitor re
I'm not an sbd expert, but I'll try to describe one of these warnings:
sbd: WARN: Pacemaker state outdated (age: 4)
In the sbd source code, "./src/sbd-md.c"
You can use on-fail in the stop operation, and for your other questions
you can use colocation + order, or better, a group. For
example: group mygroup resource1 resource2
When resource1's monitor fails, resource2 is restarted too.
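A sketch in crm syntax (the resource names and the Dummy agent are placeholders):
primitive resource1 ocf:heartbeat:Dummy \
  op monitor interval=30s \
  op stop interval=0 timeout=60s on-fail=fence
group mygroup resource1 resource2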
2016-01-11 17:09 GMT+01:00 John Gogu :
> Dear all,
> I have follow
Please share your cluster config and say whether your fencing is working.
2016-01-19 3:47 GMT+01:00 :
> One of my clusters is having a problem. It's no longer able to set up its
> GFS2 mounts. I've narrowed the problem down a bit. Here's the output when I
> try to start the DLM daemon (Normally this i
tebin.com/eAiq2yJ9
>
> Another cluster is running fine with an identical configuration.
>
> On 2016-01-19 03:49, emmanuel segura wrote:
>>
>> please share your cluster config and say if your fencing is working.
>>
>> 2016-01-19 3:47 GMT+01:00 :
>>>
>
You need to be sure that your redis resource has master/slave support,
and I think this colocation needs to be inverted, from:
colocation resource_location1 inf: redis_clone:Master kamailio
to
colocation resource_location1 inf: kamailio redis_clone:Master
You need an order too:
order resource_order1 inf:
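A typical complete form would be (a sketch, assuming the resource names above):
order resource_order1 inf: redis_clone:promote kamailio:start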
Use fencing, and after you have configured the fencing you need to use
iptables to test your cluster; with iptables you can block ports 5404
and 5405.
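For example, on one node (corosync uses these UDP ports by default):
iptables -A INPUT -p udp --dport 5404:5405 -j DROP
iptables -A OUTPUT -p udp --dport 5404:5405 -j DROP
and remove the rules with -D when the test is done.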
2016-02-14 14:09 GMT+01:00 Debabrata Pani :
> Hi,
> We ran into some problems when we pull down the ethernet interface using
> “ifconfig eth0 down”
>
>
Use maintenance mode.
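For example, with pcs (as used on CentOS 7):
pcs property set maintenance-mode=true
# run yum upgrade here
pcs property set maintenance-mode=false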
2016-02-19 16:43 GMT+01:00 Richard Stevenson :
> Hi,
>
> I'm having trouble updating pacemaker on a small 3 node cluster. All nodes
> are running Centos 7, and I'm upgrading via a simple `yum upgrade`. Whenever
> I attempt to do this the node is fenced when yum attempts to clean u
If you need help, the first thing that you need to do is show your cluster logs.
2016-03-05 15:17 GMT+01:00 Thorsten Stremetzne :
> Hello all,
>
> I have built an HA setup for a OpenVPN server.
> In my setup there are two hosts, running Ubuntu Linux, pacemaker &
> chorosync. Also both hosts have a
I think you should pass the parameters to the stonith agent; anyway,
show your config.
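A sketch in crm syntax for SLE HA (hostnames, addresses and credentials are placeholders):
primitive stonith-node1 stonith:external/ipmi \
  params hostname=node1 ipaddr=10.0.0.1 userid=admin passwd=secret interface=lanplus \
  op monitor interval=60s
location l-stonith-node1 stonith-node1 -inf: node1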
2016-03-09 5:29 GMT+01:00 vija ar :
> I have configured SLEHA cluster on cisco ucs boxes with ipmi configured, i
> have tested IPMI using impitool, however ipmitool to function neatly i have
> to pass parameter -
Try using on-fail on the single resource.
2016-03-25 0:22 GMT+01:00 Adam Spiers :
> Sam Gardner wrote:
>> I'm having some trouble on a few of my clusters in which the DRBD Slave
>> resource does not want to come up after a reboot until I manually run
>> resource cleanup.
>>
>> Setting 'start-fail
monitor interval=13s role=Master
> (DRBDSlave-monitor-interval-13s)
>
>
> --
> Sam Gardner
> Trustwave | SMART SECURITY ON DEMAND
>
>
> On 3/25/16, 2:46 AM, "emmanuel segura" wrote:
>
>>try to use on-fail for single resource.
>>
>>2016-
You need to use pcs to do everything: pcs cluster setup and pcs
cluster start. Try the Red Hat docs for more information.
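For example, on RHEL/CentOS 7 (pcs 0.9 syntax; node names are placeholders):
pcs cluster auth node1 node2 -u hacluster
pcs cluster setup --name mycluster node1 node2
pcs cluster start --all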
2016-04-27 17:28 GMT+02:00 Sriram :
> Dear All,
>
> I m trying to use pacemaker and corosync for the clustering requirement that
> came up recently.
> We have cross compi
Use fencing and the DRBD fencing handler.
2016-05-04 14:46 GMT+02:00 Rafał Sanocki :
> Resources shuld move to second node when any interface is down.
>
>
>
>
> On 2016-05-04 at 14:41, Ulrich Windl wrote:
>
> Rafal Sanocki wrote on 04.05.2016 at 14:14
> in
>>
>> Nachricht <78d882b1-a407-
Hi,
But doesn't the latest LVM version take care of the alignment?
2016-05-27 18:37 GMT+02:00 Ken Gaillot :
> On 05/27/2016 12:58 AM, Ulrich Windl wrote:
>> Hi!
>>
>> Thanks for this info. We actually run the "noop" scheduler for the SAN
>> storage (as per menufacturer's recommendation), because
Have you configured stonith and the DRBD fencing handler?
2016-07-07 16:43 GMT+02:00 Carlos Xavier :
> Hi.
> We had a Pacemaker cluster running OCFS2 filesystem over a DRBD device and we
> completely lost one of the hosts.
> Now I need some help to recover the data on the remaining machine.
> I w
dlm_tool dump ?
2016-07-07 18:57 GMT+02:00 Carlos Xavier :
> Tank you for the fast reply
>
>>
>> have you configured the stonith and drbd stonith handler?
>>
>
> Yes. they were configured.
> The cluster was running fine for more than 4 years, until we loose one host
> by power supply failure.
> N
Using pcs resource unmanage leaves the resource's monitor active; I
usually set the monitor interval=0 :)
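For example, with pcs (the resource name is a placeholder; as the follow-up below notes, enabled=false on the op is another option):
pcs resource unmanage myresource
pcs resource update myresource op monitor interval=0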
2016-07-11 10:43 GMT+02:00 Tomas Jelinek :
> Dne 9.7.2016 v 06:39 jaspal singla napsal(a):
>>
>> Hello Everyone,
>>
>> I need little help, if anyone can give some pointers, it would help me a
Does enabled=false work with every Pacemaker version?
2016-07-13 16:48 GMT+02:00 Ken Gaillot :
> On 07/13/2016 05:50 AM, emmanuel segura wrote:
>> using pcs resource unmanage leave the monitoring resource actived, I
>> usually set the monitor interval=0 :)
>
> Yep :)
>
Maybe you need interleave=true on your clones.
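For example, in crm syntax (the clone and primitive names are placeholders):
clone cl_dlm p_dlm meta interleave=true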
2016-07-15 8:32 GMT+02:00 Ulrich Windl :
TEG AMJG wrote on 14.07.2016 at 23:47 in message
> :
>> Dear list
>>
>> I am quite new to PaceMaker and i am configuring a two node active/active
>> cluster which consist basically on something like th
Why don't you use the resource agent for o2cb? This LSB script is meant
to be used with the OCFS2 legacy mode.
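A sketch of the clone setup (assuming the o2cb OCF agent shipped with ocfs2-tools, typically ocf:ocfs2:o2cb; the provider name may differ on your distribution):
primitive p_o2cb ocf:ocfs2:o2cb op monitor interval=10
clone cl_o2cb p_o2cb meta interleave=true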
2016-08-02 12:39 GMT+02:00 Kyle O'Donnell :
> er forgot
>
> primitive p_o2cb lsb:o2cb \
> op monitor interval="10" timeout="30" \
> op start interval="0" timeout="120" \
>
ar with all that stuff.
>
> Thanks for any help,
>
> Thomas Hluchnik
>
>
> On Tuesday 02 August 2016 15:28:17, emmanuel segura wrote:
>> why you don't use the resource agent for using o2cb? This script for
>> begin used with ocfs legacy mode.
>>
Does your LVM filter include the DRBD devices /dev/drbdX?
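For example, in /etc/lvm/lvm.conf (a sketch that accepts only the DRBD devices; merge it with your existing filter as needed):
filter = [ "a|/dev/drbd.*|", "r|.*|" ]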
2016-08-10 21:38 GMT+02:00 Darren Kinley :
> Hi,
>
> I have an LVM logical volume and used DRBD to replicate it to another
> server.
> The /dev/drbd0 has PV/VG/LVs which are mostly working.
> I have colocation and order constraints that bring u
filter = [ "r|/dev/.*/by-path/.*|", "r|/dev/.*/by-id/.*|",
> "r|/dev/fd.*|", "r|/dev/cdrom|", "a/.*/" ]
>
> -Original Message-
> From: emmanuel segura [mailto:emi2f...@gmail.com]
> Sent: Wednesday, August 10, 2016 2:33 PM
le
> lvmetad in /var/log/messages.
>
> I had disabled write_cache_state=0 but it appears that use_lvmetad=0 and
> disabling
> the service were also needed. You also need to rebuild your initrd ram disk.
>
> Darren
>
> -----Original Message-
> From: emmanuel segura
You can have 4 Oracle instances configured in Pacemaker (failover
mode), using 4 resource groups laid out this way:
resourcegroup -> filesystem -> vip -> oracleinstance -> oracle_listener
If you want to use LVM with your Oracle instances for space management:
resourcegroup -> lvm_resource -> filesystem -> vip -> oracleinstance -> oracle_listener
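A sketch in crm syntax (SIDs, devices and mount points are placeholders):
primitive p_lvm1 ocf:heartbeat:LVM params volgrpname=vg_ora1
primitive p_fs1 ocf:heartbeat:Filesystem params device=/dev/vg_ora1/lv_ora1 directory=/u01 fstype=ext4
primitive p_vip1 ocf:heartbeat:IPaddr2 params ip=192.168.1.10
primitive p_db1 ocf:heartbeat:oracle params sid=ORA1
primitive p_lsn1 ocf:heartbeat:oralsnr params sid=ORA1
group g_ora1 p_lvm1 p_fs1 p_vip1 p_db1 p_lsn1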
But I don't see how anyone can help you if you only describe
your problem and don't show any logs or cluster configuration.
2016-09-15 13:27 GMT+02:00 Nurit Vilosny :
> Hi,
>
> I am working in a 3 node HA cluster with a resource group. I am seeing a
> weird behavior – whenever I shutdown on
{ocfs2}->{dlm}->{fencing}->{timeout}
2016-10-10 16:46 GMT+02:00 Ulrich Windl :
> Hi!
>
> I observed an interesting thing: In a three node cluster (SLES11 SP4) with
> cLVM and OCFS2 on top, one node was fenced as the OCFS2 filesystem was
> somehow busy on unmount. We have (for paranoid reasons ma
Why don't you use Oracle RAC with ASM?
2016-10-07 18:46 GMT+02:00 Chad Cravens :
> Hello:
>
> I'm working on a project where the client is using Oracle ASM (volume
> manager) for database storage. I have implemented a cluster before using
> LVM with ext4 and understand there are resource agents (
If you want to reduce the multipath switching time when one
controller goes down:
https://www.redhat.com/archives/dm-devel/2009-April/msg00266.html
2016-10-13 10:27 GMT+02:00 Ulrich Windl :
Eric Ren wrote on 13.10.2016 at 09:31 in message
> :
>> Hi,
>>
>> On 10/10/2016 10:46 PM, Ulrich W
I have been using this layout: iscsi_disks -> lvm_volume ->
drbd_on_top_of_lvm -> filesystem
To resize: first add one new iSCSI device to every cluster node ->
then add that device to the volume group on every cluster node ->
then resize the logical volume on every cluster node: now every
cluster node has the same logical volume size.
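A sketch of the commands (device and resource names are placeholders):
pvcreate /dev/sdX                       # the new iSCSI disk, on every node
vgextend vg_drbd /dev/sdX               # on every node
lvextend -L +10G /dev/vg_drbd/lv_drbd   # on every node
drbdadm resize r0                       # once both backing devices are bigger
resize2fs /dev/drbd0                    # grow the filesystem on the active node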
The only thing that I can say is: sbd is a realtime process.
2016-12-08 11:47 GMT+01:00 Jehan-Guillaume de Rorthais :
> Hello,
>
> While setting this various parameters, I couldn't find documentation and
> details about them. Bellow some questions.
>
> Considering the watchdog module used on a serv
But what if sbd fails to reset
the timer multiple times (eg. because of excessive load, swap storm etc)?
If I remember correctly, sbd locks its memory with mlock and runs with SCHED_RR;
this way, when the server is swapping, sbd doesn't stall.
2016-12-09 8:11 GMT+01:00 Ulrich Windl :
>>>> emmanue
Sorry,
but what do you mean when you say you migrated the VM outside of the
cluster? To a server outside of your cluster?
2017-01-17 9:27 GMT+01:00 Oscar Segarra :
> Hi,
>
> I have configured a two node cluster whewe run 4 kvm guests on.
>
> The hosts are:
> vdicnode01
> vdicnode02
>
> And I have cre
n node1 (vdicnode01-priv)
>>virsh list
> ==
> vdicdb01 started
>
> On node2 (vdicnode02-priv)
>>virsh list
> ==
> vdicdb02 started
> vdicdb01 started
>
> If I query cluster pcs status, cluster thinks resource vm-vdicdb01 is only
Please, if you need help, the first thing is to show your cluster configuration.
2017-01-30 23:15 GMT+01:00 Jihed M'selmi :
> I tried to install two resources: a resource for oracle database and oracle
> listener: but the pcmk can't install the resource (red hat 7.3) usint hte
> ocf:heartbeat:oracle
:heartbeat:oracle and ocf:heartbeat:orclsnr.
>
> Thanks
>
> Jihed M’SELMI
> Mobile: +21658433664
> http://about.me/jihed.mselmi
>
> On Tue, Jan 31, 2017 at 12:16 AM, emmanuel segura
> wrote:
>>
>> please, if you need help, the first thing is show, your cluster
Run modprobe softdog if you don't have an external (hardware) watchdog.
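For example, to have it loaded at boot as well (the usual systemd convention):
modprobe softdog
echo softdog > /etc/modules-load.d/softdog.conf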
2017-02-13 18:34 GMT+01:00 :
> I am working to get an active/active cluster running.
> I have Windows 10 running 2 Fedora 25 Virtualbox VMs.
> VMs named node1, and node2.
>
> I created a vdi disk and set it to shared.
> I formatted it to gfs
I missed that: the same device for the partition and for sbd :( Really bad idea.
2017-02-13 19:04 GMT+01:00 Klaus Wenninger :
> On 02/13/2017 06:34 PM, dur...@mgtsciences.com wrote:
>> I am working to get an active/active cluster running.
>> I have Windows 10 running 2 Fedora 25 Virtualbox VMs.
>> VMs na
The first place you need to look is the Oracle log.
2017-02-22 8:43 GMT+01:00 Ulrich Windl :
Chad Cravens wrote on 22.02.2017 at 02:44 in
> message
> :
>> Hello fellow Cluster Geeks!
>>
>> I'm having an issue with the standard Oracle RA script that I can't
>> understand why this is hap
I think not. In /usr/lib/ocf/resource.d/heartbeat/oralsnr:
the start function, oralsnr_start, runs "output=`echo lsnrctl start $listener
| runasdba`"
the stop function, oralsnr_stop, runs "output=`echo lsnrctl stop $listener | runasdba`"
where the listener variable is the resource agent parameter given by
Pacemaker:
Was your cluster in maintenance mode?
2017-03-03 13:59 GMT+01:00 Ulrich Windl :
> Hello!
>
> After Update and reboot of 2nd of three nodes (SLES11 SP4) I see a
> "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying" message
> when I expected the node to joint the cluster. What c
I think it's a good idea to put your cluster in maintenance mode when
you do an update.
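For example, in crm syntax on SLES:
crm configure property maintenance-mode=true
# update and reboot here
crm configure property maintenance-mode=false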
2017-03-03 15:11 GMT+01:00 Ulrich Windl :
>>>> emmanuel segura wrote on 03.03.2017 at 14:22 in
> message
> :
>> your cluster was in maintenance state?
>
> No, it wasn't?
Use something like standby?
2017-03-03 16:02 GMT+01:00 Ulrich Windl :
>>>> emmanuel segura wrote on 03.03.2017 at 15:35 in
> message
> :
>> I think is a good idea to put your cluster in maintenance mode, when
>> you do an update.
>
> You should know that I
It means that you tell the cluster not to perform any actions, because you
are doing an intervention.
2017-03-06 9:14 GMT+01:00 Ulrich Windl :
>>>> emmanuel segura wrote on 03.03.2017 at 17:43 in
> message
> :
>> use something like standby?
>
> Hi!
>
> W
Please give more information, and if you are using LVM, share your LVM
cluster information and the cluster config too.
2017-06-15 9:22 GMT+02:00 :
> Hi. We need to clear an old EMC storage and the only thing that's left
> there is the shared disk of our Pacemaker cluster.
>
>
>
> Version:
>
> RHE
You can go ahead without updates; anyway, if you don't want to pay for support,
use CentOS or another distro.
2017-06-16 10:14 GMT+02:00 Eric Robinson :
>
>
> Ø You could test it for free, you just need to register
>
> Ø to https://scc.suse.com/login
>
> Ø After that, you have an access for 60 days t
I don't know what happens if the SSL certificate expires, but looking in
/usr/lib/pcsd/ssl.rb I found this function:
def generate_cert_key_pair(server_name)
name = "/C=US/ST=MN/L=Minneapolis/O=pcsd/OU=pcsd/CN=#{server_name}"
ca = OpenSSL::X509::Name.parse(name)
key = OpenSSL::PKey::RSA.new(2048)
I think it's a good idea if you first show your config and cluster logs,
because I have never seen any limitation on running active/active in Pacemaker.
2017-07-06 21:52 GMT+02:00 Jesse P. Johnson :
> ALL,
>
>
> I have setup an active/passive cluster using Pacemaker, CLVM, and GFS2 for
> Oracle12c. I can fail o
You need to configure cluster fencing and the DRBD fencing handler; this
way, the cluster can recover without manual intervention.
2017-07-12 11:33 GMT+02:00 ArekW :
> Hi,
> Can in be fixed that the drbd is entering split brain after cluster
> node recovery? After few tests I saw drbd recovered b
Yes: if you are using DRBD in master/slave mode, first promote the resource to
master and then start the VM on that node; if you use DRBD in multi-master mode, only
start the VM once DRBD is started.
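For the master/slave case, a sketch of the constraints in crm syntax (resource names are placeholders):
order o_drbd_before_vm inf: ms_drbd:promote p_vm:start
colocation c_vm_on_drbd inf: p_vm ms_drbd:Master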
Use a SAN, with multipath.
2017-07-18 16:34 GMT+02:00 Lentes, Bernd :
>
>
> - On Jul 17, 2017, at 11:51 AM, Be
I have never tried to set a virtual IP on an interface without an IP, because
the VIP is a secondary IP that switches between nodes, not a primary IP.
2017-09-05 15:41 GMT+02:00 Octavian Ciobanu :
> Hello all,
>
> I've encountered an issue with IP cloning.
>
> Based the "Pacemaker 1.1 Clusters from Scratch
, another node can take over the failed node’s
> "request bucket". Otherwise, requests intended for the failed node would be
> discarded."
>
> To have this functionality do I must have a static IP set on the
> interfaces ?
>
>
>
> On Tue, Sep 5, 2017 at
"I put a node in maintenance mode"?
Do you mean you put the cluster in maintenance mode?
2017-10-16 19:24 GMT+02:00 Lentes, Bernd :
> Hi,
>
> i have the following behavior: I put a node in maintenance mode,
> afterwards stop corosync on that node with /etc/init.d/openais stop.
> This node is immedi
You need to configure stonith and the DRBD fencing handler.
2017-12-19 8:19 GMT+01:00 Прокопов Павел :
> Hello!
>
> pacemaker pingd with ms drbd = double masters short time when disconnected
> networks.
>
> My crm config:
>
> node 168885811: pp-pacemaker1.heliosoft.ru
> node 168885812: pp-pacemake
The start function needs to start the resource when monitor doesn't return
success.
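A minimal sketch of that pattern in an OCF shell agent (the actual command that starts/promotes the namenode is a placeholder you have to fill in):
namenode_start() {
    if namenode_monitor; then
        return $OCF_SUCCESS        # already running, nothing to do
    fi
    # <run the command that starts/promotes the namenode here>
    namenode_monitor
}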
2018-04-12 23:38 GMT+02:00 Bishoy Mikhael :
> Hi All,
>
> I'm trying to create a resource agent to promote a standby HDFS namenode
> to active when the virtual IP failover to another node.
>
> I've taken the skelet
The first thing that you need to configure is stonith, because you have
this constraint: "constraint order promote DrbdResClone then start HALVM".
To recover and promote DRBD to master when you crash a node, configure
the DRBD fencing handler.
Pacemaker executes the monitor on both nodes, so see this:
https://oss.clusterlabs.org/pipermail/pacemaker/2013-July/019224.html
2018-04-25 10:58 GMT+02:00 范国腾 :
> Hi,
>
>
>
> Our lab has two resource: (1) PAF (master/slave)(2) VIP (bind to the
> master PAF node). The configuration is in the attachment.
>
> Each node has two network card: One(enp0s8)
But I think using ifdown isn't the correct way to test the cluster; this
topic has been discussed many times.
2018-04-26 9:53 GMT+02:00 范国腾 :
> 1. There is no failure in initial status. sds1 is master
>
>
>
> 2. ifdown the sds1 VIP network card.
>
> 3. ifup the sds1 VIP network card and then ifdown sds
Sorry, but I think it's easier to help you if you provide more
information about your problem.
2015-08-03 14:14 GMT+02:00 Vijay Partha :
> Hi
>
> When i start cman it hangs in joining fence domain.
>
> this is my message log.
>
>
> Aug 3 14:12:16 vmx-occ-005 dlm_controld[2112]: daemon cpg_join
> Aug 3 14:36:11 vmx-occ-004 dlm_controld[2383]: daemon cpg_join error
> retrying
> Aug 3 14:36:14 vmx-occ-004 fenced[2359]: daemon cpg_join error retrying
> Aug 3 14:36:18 vmx-occ-004 gfs_controld[2458]: daemon cpg_join error
> retrying
> Aug 3 14:36:18 vmx-occ-004 /usr/sbin/gm
Sorry, but from my point of view, the agent first checks if the
resource is running; for example, you can see that in
/usr/lib/ocf/resource.d/heartbeat/Filesystem
The logic is:
Filesystem::start(parameters passed to the
agent) -> Filesystem_start (the function called from start in the case which
Please share your cluster logs && config and so on.
2015-08-23 20:16 GMT+02:00 Digimer :
> On 23/08/15 02:13 PM, Jorge Fábregas wrote:
>> Hi,
>>
>> I'm still doing some tests on SLES 11 SP4 & I was trying to run
>> "mkfs.ocfs2" against a logical volume (with all infrastructure
>> ready: cLVM & DL