Hi!
> Try "corosync-objctl runtime.totem.pg.mrp.srp.members". You should see
> something like:
Actually (honestly!) this command does not return anything. With corosync-objctl
I can see a whole lot of objects of the type/class runtime.totem.pg.mrp, but
none of type "members".
The corosync
not desirable.
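For reference, on a corosync 1.x node that does expose those runtime keys, the listing looks roughly like this (node IDs and IPs below are placeholders, not taken from our cluster):

corosync-objctl runtime.totem.pg.mrp.srp.members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(10.10.10.202)
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(10.10.10.203)
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined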
Cheers and thanks again for your support,
Andreas
-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On behalf of Stallmann, Andreas
Sent: Friday, 29 April 2011 10:39
To: General Linux-HA mailing list
Subject
Hi!
> Just on a punt... There's not a (partial) firewall running on app02 is there?
No, no iptables running anywhere and no layer 3 switches around which could do
any filtering.
How do you debug corosync? Every command I find to debug corosync shows that
everything is all right. Still, both n
Hi!
> If the resource ends up on the non-preferred node, those settings will cause
> it to have
> an equal score on both nodes, so it should stay put.
> If you want to verify, try "ptest -Ls" to see what scores each resource has.
Great, that's the command I was looking for!
Before the failover t
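For anyone finding this in the archives: ptest -Ls prints allocation scores per resource and node, roughly in this shape (resource and node names below are from my setup, the scores are only illustrative):

ptest -Ls
group_color: nag_grp allocation score on app01: 100
group_color: nag_grp allocation score on app02: 0
native_color: sharedIP allocation score on app01: 100
native_color: sharedIP allocation score on app02: -INFINITY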
Hi!
I configured my nodes *not* to auto failback after a defective node comes back
online. This worked nicely for a while, but now it doesn't (and, honestly, I do
not know what was changed in the meantime).
What we do: We disconnect the two (virtual) interfaces of our node mgmt01
(running on v
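For context, the usual Pacemaker way to prevent automatic failback is resource stickiness, roughly like this (the value 100 is arbitrary, it just has to outweigh any location preference; per-resource meta attributes work as well):

crm configure rsc_defaults resource-stickiness="100"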
Hi!
In one of my clusters I disconnect one of the nodes (say app01) from the
network. App02 takes over the resources as it should. Nice.
When I reconnect app01 to the network, crm_mon on app01 continues to report
app02 as "offline" and crm_mon on app02 does the same for app01. Still, no
errors ar
Hi Andrew,
> According to your configuration, it can be up to 60s before we'll detect a
> change in external connectivity.
> That's plenty of time for the cluster to start resources.
> Maybe shortening the monitor interval will help you.
TNX for the suggestion, I'll try that. Any suggestions on r
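To make that concrete, shortening the interval just means changing the monitor op on the ping/pingd primitive to something like

        op monitor interval="10s" timeout="60s"

(the values here are placeholders), at the cost of a bit more background pinging.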
Hi Lars,
Hi Lars!
> You are exercising complete cluster communication loss.
> Which is cluster split brain.
Correct, yes.
> If you are specifically exercising cluster split brain, why are you surprised
> that you get exactly that?
Because ping(d) is supposed to keep resources from starting on
Hi!
I've two cluster-nodes, both running pingd (as a clone), to keep resources
from starting on nodes which have no obvious connection to the network. The
ping-nodes are:
-appl01 (10.10.10.202)
-appl02 (10.10.10.203)
-Default GW (10.10.10.254)
Before shutting down
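For reference, a typical configuration for such a pingd clone looks roughly like this (resource names, multiplier and timings here are illustrative, not a literal copy of our CIB):

primitive pingd_res ocf:pacemaker:pingd \
        params host_list="10.10.10.202 10.10.10.203 10.10.10.254" \
               multiplier="1000" dampen="5s" \
        op monitor interval="15s" timeout="20s"
clone pingd_clone pingd_res \
        meta globally-unique="false"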
Hi!
I tried to compare a value returned by ping(d) to a value given in a location
constraint:
location only-if-connected nag_grp \
rule $id="only-if-connected-rule" -inf: not_defined pingd or pingd lte 2000
I thought lte stands for "[l]ess [t]h[e]n". That's obviously wrong, because
whe
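For what it's worth, assuming three ping nodes and the usual multiplier of 1000 on the pingd clone, the rule evaluates roughly like this:

        all three ping nodes reachable  -> pingd = 3000 -> rule false, nag_grp may run here
        one ping node unreachable       -> pingd = 2000 -> "pingd lte 2000" true, score -INFINITY
        pingd attribute not (yet) set   -> "not_defined pingd" true, score -INFINITY

(lte is "less than or equal to"; the strictly-less-than operator would be lt.)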
Hi!
We've got a pretty straightforward and easy configuration:
Corosync 1.2.1 / Pacemaker 2.0.0 on OpenSuSE 11.3 running DRBD (M/S), Ping
(clone), and a resource-group, containing a shared IP, tomcat and mysql (where
the datafiles of mysql reside on the DRBD). The cluster consists of two virtua
Hi there,
I asked the same question some time ago and received no suitable answer so far.
DRBD [1] does not do "proper" replication across three nodes; it's basically
still a two-node RAID-1 with a third node, which doesn't really take part in
the cluster but receives replication data as a kind of "b
Hi!
Just yesterday I made some changes to the vmware-stonith-script, so that it's
possible to shutdown/start/reset nodes on vmware hosts, even if the cluster
nodes are spread over several vmware hosts (where the vmware hosts are not
clustered themselves and thus aren't reachable over the same IP
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On behalf of Dejan Muhamedagic
>> I tried "crm -f filename" and "crm > the changes line-by-line imediately, which can lead to undesireable
>> sideeffects (because some primitives start at once, where I acutal
Hi there,
is it possible to exchange a complete CIB with another CIB?
The background is that we have to roll out the same cluster in different
customer environments with different IPs / networks.
Instead of manipulating the CIB by hand via CRM, I'd rather replace
placeholders in a "template ci
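Just to sketch the direction I have in mind (file names and the placeholder token below are made up): keep a template, substitute the site-specific values, then load the result in one go, e.g.

sed -e 's/@SHARED_IP@/10.10.10.200/g' cib-template.crm > cib-site.crm
crm configure load replace cib-site.crm

or, for a complete CIB in XML form, something like "cibadmin --replace --xml-file cib-site.xml".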
Hi Andrew,
>> If "suicide" is no supported fencing option, why is it still included with
>> stonith?
> Left over from heartbeat v1 days I guess.
> Could also be a testing-only device like ssh.
www.clusterlabs.org tells me you're the Pacemaker project leader. Would you,
by chance, know who main
Hi there,
I'm currently trying to set up a three node active/passive/passive storage
cluster.
My first thought was to use DRBD. The fact that drbd wouldn't want to come up
led me to the fact that DRBD, as of now, supports three-node clusters only
by means of a "stacked resource".
My ques
Hi!
I've concentrated both your answers into one mail; I hope that's all right for you.
> >For now, I need an interim solution, which is, as of now, stonith via
> >suicide.
> Doesn't work as suicide is not considered reliable - by definition the
> remaining nodes have no way to verify that the fencin
By the way:
stonith -t suicide -T off mgmt03
works nicely. Thus the command itself is working.
Cheers folks, and thanks again (in advance) for your help,
Andreas
Hi again!
I tried to think my setup through again, but I'm still not coming to any
sensible conclusion.
The stonith:suicide resource was set up as a clone resource, because that's
how it's done in all the examples I found. Well - I didn't find a single
example on "suicide", but that's at lea
Hi!
First: I set up my configuration anew, and it works. I didn't change that much,
just set the monitor-action differently from before.
Instead of:
> webserver_ressource ocf:heartbeat:apache \
> params httpd="/usr/sbin/httpd2-prefork" \
> op start interval="0" timeout="40s" \
>
Hi!
TNX for your answer. We will switch to sbd after the shared storage has been
set up.
For now, I need an interim solution, which is, as of now, stonith via suicide.
My configuration doesn't work, though.
I tried:
~~Output from crm configure show~~
primitive suicide_res stonith:
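Roughly, the shape of the configuration I'm talking about (re-typed here from memory, so the operation values are only approximate):

primitive suicide_res stonith:suicide \
        op monitor interval="60s" timeout="30s"
clone suicide_clone suicide_res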
Hi!
I still have problems getting apache up and running via pacemaker.
To do some debugging, I tried to figure out how and when the script
/usr/lib/ocf/resource.d/heartbeat/apache is called.
Strangely, it doesn't seem to be called with the "start" parameter at all.
Date: Thu Feb 24 11:01:47
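For manual testing, the agent can also be invoked by hand with the OCF environment set, outside of pacemaker (the paths below assume the standard locations on my boxes, adjust as needed):

OCF_ROOT=/usr/lib/ocf \
OCF_RESKEY_httpd="/usr/sbin/httpd2-prefork" \
/usr/lib/ocf/resource.d/heartbeat/apache start; echo $?

ocf-tester from the resource-agents package does much the same in a more systematic way.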
Hi there,
I'm afraid I'm asking a question that several other people have asked before.
Believe me, I think I've tried everything from the posts I've found so far.
I'm currently trying to get my apache webserver to be started by pacemaker.
Here's the config:
primitive sharedIP ocf:heartbeat:IPaddr2 \
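For reference, the pattern I'm aiming at for the apache primitive is roughly this (parameters are illustrative, not a verbatim copy of my config):

primitive webserver ocf:heartbeat:apache \
        params configfile="/etc/apache2/httpd.conf" \
               httpd="/usr/sbin/httpd2-prefork" \
               statusurl="http://localhost/server-status" \
        op start interval="0" timeout="40s" \
        op monitor interval="10s" timeout="20s"

(The monitor action of ocf:heartbeat:apache needs mod_status / the server-status URL to be reachable, which is a common stumbling block.)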
Hi there!
...
> Please no-one try a loop-mounted image file on NFS ;-) Even though in theory
> it may work, if you mount -o sync ...
> *Outch*
...
> Does this help?
> http://www.linux-ha.org/w/index.php?title=SBD_Fencing&diff=481&oldid=97
Yes, this helps... somehow. Well, I should use iSCSI to s
Hi!
>> - (3) rules out sbd, as this method requires access to a physical device
>> that offers the shared storage. Am I right? The manual explicitly says that
>> sbd may not even be used on a DRBD partition. Question: Is there a way to
>> insert the sbd header on a mounted drive instead of a
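(For reference: the sbd header goes onto a raw shared block device, e.g. a small iSCSI LUN, roughly like this; the device path is a placeholder:

sbd -d /dev/disk/by-id/scsi-SHAREDLUN create
sbd -d /dev/disk/by-id/scsi-SHAREDLUN dump

It is written to the device directly, not into a filesystem, which is why a mounted drive does not help here.)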
Hello!
I'm currently looking for a suitable stonith solution for our environment:
1. We have three cluster nodes running OpenSuSE 10.3 with corosync and
pacemaker.
2. The nodes reside on two VMware ESXi servers (v. 4.1.0) in two locations,
where one VMware Server hosts two, the other hosts on
Hi!
Another issue adding to the problem described before:
For testing purposes, we set in ha.cf:
auto_failback off
Still, when our old primary comes back, it takes over the resources!
Gnagnagnagna...!
Please, make it stop!
*sigh*
It seems that the setting has no consequence at all!
Thanks
Hi there,
we're still in deep sh** with heartbeat and drbd in a split-brain
scenario.
We have the following set up:
- A two node active/passive cluster (heartbeat 2.1.3 without crm)
- Dopd with drbd-peer-outdater (the newest ones, patched).
- Ipfail
Still, if we disconnect one host from the netw
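For reference, the dopd wiring consists of two small pieces, roughly as follows (from memory; the handler name and the path of drbd-peer-outdater differ slightly between drbd versions):

In ha.cf:
        respawn hacluster /usr/lib/heartbeat/dopd
        apiauth dopd gid=haclient uid=hacluster

In drbd.conf, per resource:
        disk {
                fencing resource-only;
        }
        handlers {
                outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
        }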
Hi there!
Thank you Dominik. dopd works just fine in heartbeat 2.1.3-21.1
together with drbd 8.2.5-3 and solved our problem.
Kind regards,
Andreas
--
CONET Solutions GmbH
Andreas Stallmann, Senior Consultant
Hi there!
I have set up a two-node heartbeat cluster running
apache and drbd.
Everything went fine, till we tested a "split brain"
scenario. In this case, when we detach both network
cables from one host, we get a two-primary situation.
I read in the thread "methods of dealing with network failo