Hi people
currently doing another conntrackd project and therefore using the
code once again (yippee :)). Found a minor issue:
When the active host is fenced and returns to the cluster, it does not
request the current connection tracking states. Therefore state
information might be lost. This patch fixes that. Any comments?
Node level failure is detected on the communications layer, i.e. heartbeat
or corosync. That software runs with realtime priority, so it keeps
working just fine (use tcpdump on the healthy node to verify). So
pacemaker on the healthy node does not know that the other node has a
problem and
There did not have to be a negative location constraint up to now,
because the cluster took care of that.
Only because it didn't work correctly.
Okay.
Actually, this is a wanted setup. It happened that VM configs were
changed in ways that led to a VM not being startable any more. For
On 06/27/2011 11:09 AM, Dejan Muhamedagic wrote:
Hi Dominik,
On Fri, Jun 24, 2011 at 03:50:40PM +0200, Dominik Klein wrote:
Hi Dejan,
this way, the cluster never learns that it can't start a resource on
that node.
This resource depends on shared storage. So, the cluster won't
try
With the agent before the mentioned patch, during probe of a newly
configured resource, the cluster would have learned that the VM is not
available on one of the nodes (ERR_INSTALLED), so it would never start
the resource there.
This is exactly the problem with shared storage setups, where
I'm not sure my fix is correct.
According to
https://github.com/ClusterLabs/resource-agents/commit/96ff8e9ad3d4beca7e063beef156f3b838a798e1#heartbeat/VirtualDomain
this is a regression which was introduced in April '11.
So the fix should be the other way around: Introduce a parameter that
This fixes the issue described yesterday.
Comments?
Regards
Dominik
exporting patch:
# HG changeset patch
# User Dominik Klein dominik.kl...@gmail.com
# Date 1308909599 -7200
# Node ID 2b1615aaca2c90f2f4ab93eb443e5902906fb28a
# Parent 7a11934b142d1daf42a04fbaa0391a3ac47cee4c
RA VirtualDomain
Hi Dejan,
this way, the cluster never learns that it can't start a resource on
that node.
I don't consider this a solution.
Regards
Dominik
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
Hi
code snippet from
http://hg.linux-ha.org/agents/raw-file/7a11934b142d/heartbeat/VirtualDomain
(which I believe is the current version)
VirtualDomain_Validate_All() {
[snip]
    if [ ! -r $OCF_RESKEY_config ]; then
        if ocf_is_probe; then
            ocf_log info Configuration file
through some crm commands or similar
(crm_attributes, etc.)
but they are not taken into account until the 60s have passed.
Alain
Dominik Klein wrote:
Just write it to the xml on all nodes?
On 05/10/2011 01:23 PM, Alain.Moulle wrote:
Sorry I meant directly with is_managed=false of course
On 05/11/2011 10:24 AM, Alain.Moulle wrote:
Hi Dominik,
I just tried again:
service corosync stop on both nodes node1 node2
remove the cib.xml on node2
vi cib.xml on node1
set the property maintenance-mode=true
(nvpair id=cib-bootstrap-options-maintenance-mode
Alain
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
It waits $dampen before changes are pushed to the cib, so that
occasional ICMP hiccups do not produce an unintended failover.
At least that's my understanding.
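A configuration sketch of the idea above, assuming the crm shell and the ocf:pacemaker:ping agent (resource names, address, and values here are examples, not from the thread):

```shell
# Hypothetical example: a cloned ping resource. dampen="5s" delays pushing
# attribute changes to the CIB, so a single lost ICMP reply does not
# immediately trigger a failover.
crm configure primitive p_ping ocf:pacemaker:ping \
    params host_list="192.168.1.1" dampen="5s" multiplier="100" \
    op monitor interval="10s"
crm configure clone cl_ping p_ping
```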
Regards
Dominik
On 04/29/2011 09:22 AM, Ulrich Windl wrote:
Hi,
I think the description for dampen in OCF:pacemaker:ping
correcto
wow.
again! :)
Am I too paranoid?
I don't think you are. Some non-root user practically being able to remove
any file is certainly a valid concern.
Thing is: I needed an RA that configured a cronjob. Florian suggested
writing the symlink RA instead, which could manage symlinks. Apparently
there was an IRC discussion
Mornin Dejan,
The reason was that libglue2 and cluster-glue were not installed from
the clusterlabs repository, as the rest of the packages were, but
instead they were pulled from the original opensuse repository in an
older version.
This is what I found in pacemaker.spec.in in the repository:
Requires(pre): cluster-glue = 1.0.6
The 1.0.10 rpm from clusterlabs for opensuse 11.2 just says
cluster-glue afaict:
rpm -qR pacemaker
cluster-glue
resource-agents
python = 2.4
libpacemaker3 = 1.0.10-1.4
libesmtp
net-snmp
Hi
as some of you might have seen on the pacemaker list, I tried to install
a 3 node cluster and there were ipc issues reported by the cib and
therefore the cluster could not start correctly.
Hi Dejan
You could also try apcmastersnmp.
I got that to work with APC devices which did not work with the telnet-based
one. As long as they didn't change MIBs (which I don't know whether they
have). Might be worth a shot.
Regards
Dominik
On 02/25/2011 02:24 AM, Avestan wrote:
Hello Dejan,
As I am
Thanks for inclusion.
While looking through the pushed changes, I spotted two meta-data typos.
See trivial patch.
Regards
Dominik
Applied and pushed with two minor edits. Thanks a lot!
Cheers,
Florian
--- conntrackd.orig 2011-02-14 11:43:22.0 +0100
+++ conntrackd 2011-02-14
@@
# An OCF RA for conntrackd
# http://conntrack-tools.netfilter.org/
#
-# Copyright (c) 2010 Dominik Klein
+# Copyright (c) 2011 Dominik Klein
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
@@ -25,11
Maybe you applied the s/100/$slavescore/ patch someone sent a couple of
weeks ago. I used the last version from the thread New stateful RA:
conntrackd, dated October 27th, 3:29 pm.
Anyway, here's my version.
Regards
Dominik
On 02/11/2011 01:36 PM, Florian Haas wrote:
On 2011-02-11 09:48, Dominik Klein
Not yet. That's why I wrote soon-ish ;)
Any release coming up you want to include this in?
any news on this?
Cheers,
Florian
Hi
thanks for testing and feedback.
On 01/27/2011 01:37 PM, Marjan Blatnik wrote:
Conntrackd RA from Dominik Klein works. We can now successfully
migrate/fail over from one node to the other.
At the beginning, we had problems with failover. After a reboot/failure, the
slave was not synced
On the mailing list we found a couple of
options although we only fully evaluated the RA produced by Dominik
Klein as it appears to be more feature complete than the alternative.
For a full description of his RA please see his original thread[2].
So far throughout testing we have been very pleased
Or, put differently: is it really necessary for us to track the supposed
state, or can we query it from the service somehow?
From the submitted RA:
# You can't query conntrackd whether it is master or slave. It can
# be both at the same time.
# This RA creates a statefile
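The statefile approach could be sketched like this (a hedged illustration, not the RA's actual code; the path and helper names are invented):

```shell
# Hypothetical sketch: conntrackd cannot be asked whether it is master,
# so the RA records the promoted state itself in a statefile.
STATEFILE="${STATEFILE:-/var/run/conntrackd.master}"

set_master() { touch "$STATEFILE"; }       # called on promote
set_slave()  { rm -f "$STATEFILE"; }       # called on demote/stop
is_master()  { [ -e "$STATEFILE" ]; }      # used by monitor
```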
# http://conntrack-tools.netfilter.org/
#
# Copyright (c) 2010 Dominik Klein
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed
--
IN-telegence GmbH
Oskar-Jäger-Str. 125
50825 Köln
Registergericht AG Köln - HRB 34038
USt-ID DE210882245
Geschäftsführende Gesellschafter: Christian Plätke und Holger Jansen
#!/bin/bash
#
#
# An OCF RA for conntrackd
# http://conntrack-tools.netfilter.org/
#
# Copyright (c) 2010 Dominik Klein
budgets for this.
Anyway, again, thanks for your advice. I'm going to do some research on
them.
Tony Gan wrote:
Hi,
For a two-node cluster, what are the best STONITH devices?
Currently I am using Dell's iDrac for STONITH device. It works pretty well.
However the biggest problem with iDRAC or any other lights-out device is
that it shares a power supply with the host machine.
Once an
Aclhk Aclhk wrote:
On the same lan, there are already two heartbeat node 136pri and 137sec.
I set up another 2 nodes with heartbeat. They keep receiving the following
messages:
heartbeat[9931]: 2010/01/19_10:53:01 WARN: string2msg_ll: node [136pri]
failed authentication
heartbeat[9931]:
Andrew Beekhof wrote:
On Tue, Jan 12, 2010 at 10:43 AM, Raoul Bhatia [IPAX] r.bha...@ipax.at
wrote:
On 01/12/2010 10:39 AM, Florian Haas wrote:
Why not simply set that for root at boot? (it rhymes too :)
because i do not like the idea that each and every process gets
elevated limits by
I'd suggest an approach like Florian's from the Virtualdomain RA. Here's
a quote, guess you get the idea.
shutdown_timeout=$((($OCF_RESKEY_CRM_meta_timeout/1000)-5))
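A minimal, runnable sketch of that arithmetic (the 90000 ms default here is an assumption for illustration, not from the RA):

```shell
# The CRM passes the operation timeout in milliseconds; convert to seconds
# and keep a 5-second safety margin for the rest of the stop action.
: "${OCF_RESKEY_CRM_meta_timeout:=90000}"
shutdown_timeout=$(((OCF_RESKEY_CRM_meta_timeout / 1000) - 5))
echo "$shutdown_timeout"   # with 90000 ms: 85
```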
Regards
Dominik
Dejan Muhamedagic wrote:
Hi Hideo-san,
On Mon, Nov 30, 2009 at 11:00:05AM +0900, renayama19661...@ybb.ne.jp
Tomasz Chmielewski wrote:
Dejan Muhamedagic wrote:
Hi,
On Sun, Nov 15, 2009 at 09:09:53PM +0100, Tomasz Chmielewski wrote:
I have two nodes, node_1 and node_2.
node_2 was down, but is now up.
How can I execute a custom script on node_1 when it detects that node_2
is back?
That's not
Kenneth Simbron wrote:
Hi,
Is there a way to restrict some resources to work only on specific nodes and
other resources on another nodes?
http://clusterlabs.org/mediawiki/images/f/fb/Configuration_Explained.pdf
Read up on location constraints.
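For example, location constraints in crm shell syntax (resource and node names here are made up):

```shell
# Hypothetical: prefer node1 for r_web, and forbid node3 entirely.
crm configure location loc-web-prefer-node1 r_web 100: node1
crm configure location loc-web-never-node3 r_web -inf: node3
```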
Regards
Dominik
Dejan Muhamedagic wrote:
Hi Florian,
On Wed, Sep 16, 2009 at 08:25:30AM +0200, Florian Haas wrote:
Lars, Dejan,
as discussed on #linux-ha yesterday, I've pushed a small changeset to
the Filesystem RA that implements a monitor operation which checks
whether I/O on the mounted filesystem is
Ivan Gromov wrote:
Hi, all
How can I get group members?
I use crm_resource -x -t group -r group_Name. Can I get the members without
the xml part?
What about
crm configure show group-name ?
Regards
Dominik
___
Linux-HA mailing list
Tobias Appel wrote:
Hi,
I have a very weird error with heartbeat version 2.14.
I have two IPMI resources for my two nodes. The configuration is posted
here: http://pastebin.com/m52c1809c
node1 is named nagios1
node2 is named nagios2
now I have ipmi_nagios1 (which should run on
Alain.Moulle wrote:
Hello Andrew,
Could you explain why this functionality is no longer available (the
configuration lines remain in ha.cf)?
ipfail was replaced by pingd in v2. That was in the very first version
of v2 afaik.
And how should we proceed to avoid split-brain cases in a two-nodes
Tobias Appel wrote:
Hi,
I need a command to see if a resource is started or not. Somehow my IPMI
resource does not always start, especially on one node (for example if I
reboot the node, or have a failover). There is no error and nothing, it
just does nothing at all.
Usually I have to
Tobias Appel wrote:
On 08/05/2009 10:30 AM, Dominik Klein wrote:
Tobias Appel wrote:
So all I need is a command line tool to check wether a resource is
currently started or not. I tried to check the resources with the
failcount command, but it's always 0. And the crm_resource command is
used
Alain.Moulle wrote:
Thanks Andrew,
1. So my understanding is that in a cluster of more than 2 nodes, if
two nodes have failed, have_quorum is set to 0 by the cluster software
and the behaviour is chosen by the administrator with the no-quorum-policy
parameter. So the question is now: what is
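Setting that policy could look like this (a sketch assuming the crm shell; possible values include ignore, stop, freeze, and suicide):

```shell
# Hypothetical: stop all resources when the partition loses quorum.
crm configure property no-quorum-policy=stop
```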
You can try to compose the output of cibadmin -Q -o
crm_config|resources|constraints to something usable for you.
looks like I have to run the command once for each type and then
concatenate the results.
That's sort of what I meant to say. Sorry for being unclear.
Regards
Dominik
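The run-once-per-section-and-concatenate step could be sketched as follows (assumes a running cluster with cibadmin in PATH; the output filename is made up):

```shell
# Dump each configuration section in turn and concatenate the output.
for section in crm_config resources constraints; do
    cibadmin -Q -o "$section"
done > cluster-config-dump.txt
```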
Gavin Hamill wrote:
Hi :)
I'm using the Lenny packages http://people.debian.org/~madkiss/ha/ and
have been enjoying success with pacemaker + heartbeat (I've used a
heartbeat v1 config for years without problems).
I have a few IPaddr2 primitives in groups, but I'd like to understand
how I
Is there a query/config dump setting that will dump the running config to the
command line without the status attributes?
cibadmin -Q -o configuration
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
Dave Augustus wrote:
On Mon, 2009-07-27 at 15:09 +0200, Dominik Klein wrote:
Is there a query/config dump setting that will dump the running config to
the command line without the status attributes?
cibadmin -Q -o configuration
What a quick reply!
However I get:
Call cib_query
Ehlers, Kolja wrote:
Yeah it supports SSH but if I log in using SSH there is just a menu to
configure the card. Since I can enter only 2 digits at that
prompt
1- Control
2- Diagnostics
3- Configuration
4- Detailed Status
5- About UPS
ESC- Back, ENTER-
...@lists.linux-ha.org] Im Auftrag von Dominik Klein
Gesendet: Freitag, 10. Juli 2009 08:27
An: General Linux-HA mailing list
Betreff: Re: [Linux-HA] Stonith with APC Smart UPS1000 +Network ManagementCard
Ehlers, Kolja wrote:
Yeah it supports SSH but if I log in using SSH there is just a menu
Steinhauer Juergen wrote:
Hi guys!
In my cluster setup, I have 6 IP addresses which should be started in
parallel for speed purpose, and two apps, depending on the six addresses.
What would be the best way to configure this?
Putting all IPs in a group will start them one after another.
c smith wrote:
Hi-
I currently implement DRBD with Pacemaker. The DRBD resource is configured
as a multi-state Master-slave resource in which node1 is the default master
and node2 is the default slave. I am putting together a backup system that
will run some automated scheduled tasks on
c smith wrote:
Dominik-
Thanks for the reply.. I'm aware that the documents advise against it, but
surely there must be a way. I was just looking at the new DRBD 8.3.2. It
includes a fencing handler script that, upon failure of a DRBD master, adds
a -INFINITY location constraint into the
David Hoskinson wrote:
Thanks. Got it going again. However my amavisd service fails with an
unknown exec error. It's the only one that won't work, and isn't related to
the group question. I have it set up the same as postfix, dovecot, etc.
primitive amavisd lsb:amavisd op monitor
The default value for stonith-enabled is true. If you however do not
have a stonith device, that will give you an endless loop of
unsuccessfully trying to shoot the other node before doing anything else
to the resources the dead node was running.
try
crm configure property stonith-enabled=false
Koen Verwimp wrote:
Hi!
I have defined a resources called rg_alfresco_ip . This resource consists of
a OCF script (AlfrescoIP). This is script is a copy of IPAddr but with a
customized status/monitoring procedure.
group id= rg_alfresco_ip
primitive class=ocf
darren.mans...@opengi.co.uk wrote:
Hello everyone. Long post, sorry.
I've been trying to get SLES11 with Pacemaker 1.0 / OpenAIS working for
most of this week without success so far. I thought I may as well bundle
my problems into one mail to see if anyone can offer any advice.
Trivial. See attached patch.
Regards
Dominik
exporting patch:
# HG changeset patch
# User Dominik Klein d...@in-telegence.net
# Date 1240578752 -7200
# Node ID 2d97904c385cc9b4779286001611bd748f48589d
# Parent 60cc2d6eee88ff6c2dedf7b539b9ee018efda6da
Low: RA mysql: Correctly remove eventually
fsalas wrote:
Hi, I'm quite new to clustering and HeartBeat, but as far as I can tell, very
nice packages.
Well, here is my problem, I'm willing to setup a cluster for an small
enterprise that will have several services located in virtual machines, to
make it simpler, let's say we have four
Noah Miller wrote:
Hi -
Is it possible to restart a clustered service (v2 cluster) without its
dependent services also stopping and starting?
When the constraint score is advisory (0), dependencies should not be
restarted, but then they are not really dependencies in the sense of
the word.
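An advisory ordering sketch in crm shell syntax (resource names are examples):

```shell
# Score 0 makes the order advisory: rsc_B is started after rsc_A when both
# start anyway, but restarting rsc_A does not force a restart of rsc_B.
crm configure order o-a-then-b 0: rsc_A rsc_B
```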
Joe Bill wrote:
Stopping the Heartbeat daemon (service heartbeat stop)
does not stop the DRBD daemon even if it is one of
the resources.
- Heartbeat and DRBD are 2 different products/packages
- Like most services, DRBD doesn't need Heartbeat to run. You can set up and
run DRBD
- The DRBD daemons provide the communication interface
for each network volume and are therefore an integral
part of the volume management. Without the DRBD daemons,
you (manually) and Heartbeat (automagically) could not
handle the DRBD volumes.
Just to avoid confusion: There is no such thing
darren.mans...@opengi.co.uk wrote:
Hi. Can anyone recommend any good books about HA with regards to the
latest incarnations such as Pacemaker etc? I understand enough about the
CRM and heartbeat 2 to get by but lots of the stuff on this list still
goes over my head.
Thanks.
Darren
Jerome Yanga wrote:
Stopping the Heartbeat daemon (service heartbeat stop) does not stop the DRBD
daemon even if it is one of the resources.
# service heartbeat stop
Stopping High-Availability services:
[ OK ]
# service drbd
I know. But this attribute does not exist in my setup. Pacemaker version
1.0.1-1. Is this a feature of 1.0.2?
1.0.1 is 4 months old. The RA was updated with those features 3 months
ago. So basically, yes. You could still update the single RA from the
mercurial repository though.
Regards
So here's an update. Michael Schwartzkopff pointed out a bug regarding
groups. That has been fixed now and the appropriate values should be
shown. Thanks!
There's not been a lot of feedback, is it because nobody uses the script
or does it just work for you?
Regards
Dominik
Dominik Klein wrote
florian.engelm...@bt.com wrote:
Hello,
I spent the whole afternoon to search for a good heartbeat v2
documentation, but it looks like this is somehow difficult. Maybe
someone in here can help me?
Anyway I have a short question about stickiness. I only know about sun
cluster but I have to
Michael Schwartzkopff wrote:
Hi,
I am testing the pingd from the provider pacemaker. As Dominik told me, there
is no need to define ping nodes in the ha.cf any more. OK so far.
As I see pingd tries to reach all pingnodes of the hostlist attribute every
10
seconds. Is it possible to
Michael Schwartzkopff wrote:
On Tuesday, 31 March 2009 15:27:47, Dominik Klein wrote:
Michael Schwartzkopff wrote:
Hi,
I am testing the pingd from the provider pacemaker. As Dominik told me,
there is no need to define ping nodes in the ha.cf any more. OK so far.
As I see pingd tries
Juha Heinanen wrote:
Juha Heinanen writes:
the real problem is that start of mysql server by pacemaker stops
altogether after a few manual stops (/etc/init.d/mysql stop).
i think i figured this out. when pacemaker needed to start my
mysql-server resource three times on node lenny1,
Les Mikesell wrote:
My first HA setup is for a squid proxy where all I need is to move an IP
address to a backup server if the primary fails (and the cache can just
rebuild on its own). This seems to work, but will only fail if the
machine goes down completely or the primary IP is
Juha Heinanen wrote:
Dominik Klein writes:
Heartbeat in v1 mode (haresources configuration) cannot do any resource
level monitoring itself. You'd need to do that externally by any
means.
yes, in v2 mode i have managed to make pacemaker to monitor resources,
for example, like
Michael Schwartzkopff wrote:
Hi,
In the metadata of the pengine I found the attribute maintenance-mode. I did
not find any documentation about it. The long description also just says:
Should the cluster ... Does anybody know what this option does?
Thanks.
It disables resource management when
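Toggling it could look like this (a sketch assuming the crm shell):

```shell
# Hypothetical: while maintenance-mode is true, the cluster stops
# starting/stopping/monitoring resources until it is switched back.
crm configure property maintenance-mode=true
# ... do maintenance work ...
crm configure property maintenance-mode=false
```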
crmd metadata tells me that expected-quorum-votes
are used to calculate quorum in openais based clusters. Its default value is
2. Do I have to change this value if I have 3 or more nodes in a OpenAIS
based
cluster?
No. It is automatically adjusted by the cluster.
Regards
Dominik
You cannot use drbd in heartbeat the way you configured it.
Please refer to http://wiki.linux-ha.org/DRBD/HowTov2 and (if that
wasn't made clear enough on the page) make sure the first thing you do
is upgrade your cluster software. Read here on how to do that:
http://clusterlabs.org/wiki/Install
Dominik Klein wrote:
You cannot use drbd in heartbeat the way you configured it.
Please refer to http://wiki.linux-ha.org/DRBD/HowTov2
Sorry, copy/paste error. I meant to say
http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0
Is there some documentation available for openais? I can't even find a
good description of what it does or why you would use it. Also, will
this help with my 2nd question: having a few spares for a large number
of servers? While my objective with the squid cache is to proxy
everything
Dejan Muhamedagic wrote:
Hi,
On Wed, Mar 18, 2009 at 11:37:27AM -0700, Neil Katin wrote:
Dejan Muhamedagic wrote:
Hi,
On Tue, Mar 17, 2009 at 11:56:04AM +0530, Arun G wrote:
Hi,
I observed below error message when I upgraded drbd to drbd-8.3.0 in
heartbeat 2.1.4 cluster on
Michael Schwartzkopff wrote:
Hi,
As far as I know pingnodes have to be configured in heartbeat. heartbeat
pings
the nodes and updates the CIB.
Where can I configure pingnodes, when I use OpenAIS as the cluster stack?
Create a pingd clone resource in the CIB. It's the preferred way of
Hi
Jerome Yanga wrote:
Dominik,
As usual, you are right on the money. I should have caught that myself.
Thank you for catching that for me. What happened was that I used a
different server to compile DRBD and I had assumed that Nomen and Rubic (my
test nodes) were on the same kernel.
Hi
Jerome Yanga wrote:
Hi! I am having issues with getting DRBD to work with Pacemaker. I can get
Pacemaker and DRBD run individually but not DRBD managed by Pacemaker. I
tried following the instruction in the site below but the resources will not
go online.
Hi
I made the necessary changes to the showscores script to work with
pacemaker 1.0.2.
Please test and report problems. Has been reported to work by some
people and should go into the repository soon. Still, I'd like more
people to test and confirm.
Important changes:
* correctly fetch
showscores gives me:
~# ./showscores.sh
Resource   Score   Node   Stickiness   #Fail   Fail-Stickiness
50 0
on 50
Tears ! wrote:
Dear members!
I have first time Install heartbeat on Slackware 12.2. I have
enable debugging in ha.cf
Here is the some debug message i want to describe here.
Feb 14 23:01:15 haServer1 heartbeat: [15131]: WARN: Core dumps could be lost
if multiple dumps occur.
Feb 14
v2.1 to 2.9
but must have missed this bit.
user land and kernel module all report the same version.
I am on my way into the office now and I will apply the changes once
there
thanks again
Jason
2009/2/12 Dominik Klein d...@in-telegence.net
Right, this one looks better.
I'll refer
Hi
heartbeat in v1 mode does not do resource monitoring by itself. So if
you did not set up any custom resource monitoring, you can just stop
your application in whatever way you normally do that and re-start it
whenever you like.
v1 clusters will not notice. They only see node state changes.
if you can improve things.
Regards
Dominik
exporting patch:
# HG changeset patch
# User Dominik Klein d...@in-telegence.net
# Date 1234350091 -3600
# Node ID 04533b37813c8be009814f52de7b14ff65bf9862
# Parent 90ff997faa7288248ac57583b0c03df4c8e41bda
RA: anything. Implement most of lmbs suggestions
Gerd König wrote:
Hi Dominik,
thanks for answering quickly, but there were no dependencies found:
#zypper search openipmi
* Reading installed packages [100%]
No possible dependencies found.
Do I need some additional software repositories ?
I don't think so. The packages should
Zemke, Kai wrote:
Hi,
I'm running a two node failover cluster. Yesterday the cluster tried to
manage a state transition. In the log files I found the following entries:
heartbeat[6905]: 2009/02/10_21:45:55 WARN: node nagios-drbd2: is dead
heartbeat[6905]: 2009/02/10_21:45:55
archive :)
Regards
Dominik
Thanks
Jason
2009/2/11 Dominik Klein d...@in-telegence.net
Hi Jason
any chance you started drbd at boot or the drbd device was active at the
time you started the cluster resource? If so, read the introduction of
the howto again and correct your setup.
Jason
The archive only contains info for one node and the logfile is empty.
Did you use appropriate -f time and does ssh work between the nodes?
So far, nothing obvious to me except for the order between your FS and
DRBD lacking the role definition, but that's not what your problem is
about (yet *g*).
Right, this one looks better.
I'll refer to nodes as 1001 and 1002.
1002 is your DC.
You have stonith enabled, but no stonith devices. Disable stonith or get
and configure a stonith device (_please_ dont use ssh).
1002 ha-log lines 926:939, node 1002 wants to shoot 1001, but cannot (l
978).
Jason Fitzpatrick wrote:
Hi All
I am having a hell of a time trying to get heartbeat to fail over my DRBD
harddisk and am hoping for some help.
I have a 2 node cluster, heartbeat is working as I am able to fail over IP
Addresses and services successfully, but when I try to fail over my DRBD
make it primary on either node
Thanks
Jason
2009/2/10 Jason Fitzpatrick jayfitzpatr...@gmail.com
Thanks,
This was the latest version in the Fedora Repos, I will upgrade and see
what happens
Jason
2009/2/10 Dominik Klein d...@in-telegence.net
Jason Fitzpatrick wrote:
Hi All
I am
Gerd König wrote:
Hello list,
I wanted to start with heartbeat using the latest sources for
OpenSuse10.3 64bit.
I've downloaded these rpm's:
heartbeat-2.99.2-6.1.x86_64.rpm
heartbeat-common-2.99.2-6.1.x86_64.rpm
heartbeat-debuginfo-2.99.2-6.1.x86_64.rpm
It is OCF_ERR_GENERIC, not OCF_ERROR_GENERIC. Read
/usr/lib/ocf/resource.d/heartbeat/.ocf-returncodes
You can also use ocf-tester to test your ocf script.
Regards
Dominik
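A runnable illustration of the point (my_monitor and the hard-coded values are a stand-in; real agents source the shipped .ocf-returncodes/.ocf-shellfuncs files for these definitions):

```shell
# The OCF return codes are plain shell variables; the correct name is
# OCF_ERR_GENERIC (value 1), not OCF_ERROR_GENERIC.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

my_monitor() {
    # pretend the health check failed
    return "$OCF_ERR_GENERIC"
}

my_monitor
rc=$?
echo "$rc"   # prints 1
```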
lakshmipadmaja maddali wrote:
Hi all,
I have a strange issue, that ocf_error_generic is being
ignored at times.