Re: [Linux-HA] Q on http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active

2011-12-16 Thread Dominik Klein
On 12/15/2011 11:19 AM, Ulrich Windl wrote: > Hi! > > I have a problem with some client-server software (I don't want to > name it here) where client and server both need an entry for inetd > (xinetd). It's also possible that client and server are running on > one machine. > > For a cluster soluti

Re: [Linux-HA] Antw: Re: Forkbomb not initiating failover

2011-08-29 Thread Dominik Klein
On 08/29/2011 09:51 AM, Dominik Klein wrote: > Node level failure is detected on the communications layer, i.e. heartbeat > or corosync. That software is run with realtime priority. So it keeps > working just fine (use tcpdump on the healthy node to verify). So > pacemaker on the healt

Re: [Linux-HA] Antw: Re: Forkbomb not initiating failover

2011-08-29 Thread Dominik Klein
Node level failure is detected on the communications layer, i.e. heartbeat or corosync. That software is run with realtime priority. So it keeps working just fine (use tcpdump on the healthy node to verify). So pacemaker on the healthy node does not know that the other node has a problem and there

Re: [Linux-HA] How to start Pacemaker in unmanaged mode ?

2011-05-11 Thread Dominik Klein
On 05/11/2011 10:24 AM, Alain.Moulle wrote: > Hi Dominik, > I just have tried again : > "service corosync stop" on both nodes node1 & node2 > remove the cib.xml on node2 > vi cib.xml on node1 > set the property maintenance-mode=true > ( name="maintenance-mode" value="true"/>) > wq! > star

Re: [Linux-HA] How to start Pacemaker in unmanaged mode ?

2011-05-10 Thread Dominik Klein
; Pacemaker a lot ! > We always have to go through some crm commands or similar > (crm_attributes, etc.) > but they are not taken into account before the 60s have elapsed. > > Alain > > Dominik Klein wrote: >> Just write it to the xml on all nodes? >> >> On 0

Re: [Linux-HA] How to start Pacemaker in unmanaged mode ?

2011-05-10 Thread Dominik Klein
Just write it to the xml on all nodes? On 05/10/2011 01:23 PM, Alain.Moulle wrote: > Sorry I meant "directly with is_managed=false" of course ! > Alain ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux

Re: [Linux-HA] ocf:pacemaker:ping: dampen

2011-04-29 Thread Dominik Klein
> correcto wow. again! :) ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] ocf:pacemaker:ping: dampen

2011-04-29 Thread Dominik Klein
It waits $dampen before changes are pushed to the CIB, so that occasional ICMP hiccups do not cause an unintended failover. At least that's my understanding. Regards Dominik On 04/29/2011 09:22 AM, Ulrich Windl wrote: > Hi, > > I think the description for "dampen" in OCF:pacemaker:pi
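The dampening idea in the reply above can be pictured with a toy model. This is only a sketch, not the ping agent's or attrd's actual code: the class name `DampenedAttribute` and its methods are hypothetical, and the model simply assumes that a changed value is written out once it has stayed unchanged for `dampen` seconds.

```python
class DampenedAttribute:
    """Toy model: a new value reaches the 'CIB' only after it has been
    stable for `dampen` seconds. All names here are illustrative."""

    def __init__(self, initial, dampen):
        self.committed = initial      # value the cluster currently sees
        self.pending = initial        # latest observed value
        self.pending_since = None     # time the pending value first appeared
        self.dampen = dampen

    def observe(self, value, now):
        """Record a measurement taken at time `now` (in seconds)."""
        if value == self.committed:
            # Back to the known value: a short hiccup, drop the pending change.
            self.pending = value
            self.pending_since = None
        elif value != self.pending or self.pending_since is None:
            # A new candidate value starts its own waiting period.
            self.pending = value
            self.pending_since = now
        self._flush(now)
        return self.committed

    def _flush(self, now):
        # Commit the pending value once it has survived the dampen window.
        if self.pending_since is not None and now - self.pending_since >= self.dampen:
            self.committed = self.pending
            self.pending_since = None
```

In this model a single lost ping that recovers within the window never changes the committed value, while a sustained outage is committed once `dampen` seconds have passed.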

Re: [Linux-HA] stonith + APC Masterswitch (AP9225 + AP9616)

2011-02-25 Thread Dominik Klein
You could also try apcmastersnmp. Got that to work with apc devices which did not work with the telnet thing. As long as they didn't change mibs (which I don't know whether they have). Might be worth a shot. Regards Dominik On 02/25/2011 02:24 AM, Avestan wrote: > > Hello Dejan, > > As I am t

Re: [Linux-HA] Need suggestion on STONITH device

2010-04-07 Thread Dominik Klein
> > But yes, probably need additional budgets for this. > > Anyway, again, thanks for your advice. I'm going to do some research on > them. > > > > On Thu, Apr 1, 2010 at 6:38 AM, Dominik Klein wrote: > >> Tony Gan wrote: >>> Hi, >>> For a t

Re: [Linux-HA] Need suggestion on STONITH device

2010-04-01 Thread Dominik Klein
Tony Gan wrote: > Hi, > For a two-node cluster, what are the best STONITH devices? > > Currently I am using Dell's iDrac as the STONITH device. It works pretty well. > However the biggest problem for iDrac or any other lights-out devices is > that they share power supply with host machines. > > Onc

Re: [Linux-HA] messages from existing hearbeat on the same lan

2010-01-19 Thread Dominik Klein
Aclhk Aclhk wrote: > On the same LAN, there are already two heartbeat nodes 136pri and 137sec. > > I set up another 2 nodes with heartbeat. They keep receiving the following > messages: > > heartbeat[9931]: 2010/01/19_10:53:01 WARN: string2msg_ll: node [136pri] > failed authentication > heartbeat

Re: [Linux-HA] heartbeat - execute a script on a running node when the other node is back?

2009-11-16 Thread Dominik Klein
Tomasz Chmielewski wrote: > Dejan Muhamedagic wrote: >> Hi, >> >> On Sun, Nov 15, 2009 at 09:09:53PM +0100, Tomasz Chmielewski wrote: >>> I have two nodes, node_1 and node_2. >>> >>> node_2 was down, but is now up. >>> >>> >>> How can I execute a custom script on node_1 when it detects that node_2

Re: [Linux-HA] Restrict resources to specific nodes only

2009-09-28 Thread Dominik Klein
Kenneth Simbron wrote: > Hi, > > Is there a way to restrict some resources to work only on specific nodes and > other resources on other nodes? http://clusterlabs.org/mediawiki/images/f/fb/Configuration_Explained.pdf Read up on location constraints. Regards Dominik
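As a rough illustration of the location-constraint approach suggested in the reply, the sketch below generates crm shell style `location` lines that ban a resource from every node it should not run on. The function name, the constraint IDs, and the node and resource names are all hypothetical, and the `location <id> <rsc> -inf: <node>` form is my assumption about the crm shell syntax; check it against the linked document before use.

```python
def location_constraints(resource, all_nodes, allowed_nodes):
    """Return crm-shell style location lines that ban `resource` from
    every node not listed in `allowed_nodes` (names are made up)."""
    lines = []
    for node in all_nodes:
        if node not in allowed_nodes:
            # A -inf score means the resource must never run on this node.
            lines.append(f"location ban-{resource}-{node} {resource} -inf: {node}")
    return lines


if __name__ == "__main__":
    for line in location_constraints("web", ["n1", "n2", "n3"], ["n1"]):
        print(line)
```

Banning the unwanted nodes, rather than preferring the wanted ones, means the resource stays stopped instead of moving if none of its allowed nodes is available.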

Re: [Linux-HA] Master/Slave constraints preventing resources from failing over ?

2009-09-23 Thread Dominik Klein
> ptest[25537]: 2009/09/22_21:37:56 CRIT: dump_node_scores: clone_color: > ms-drbd0.lnodbct = 0 > ptest[25537]: 2009/09/22_21:37:56 CRIT: dump_node_scores: clone_color: > ms-drbd0.lnodbbt = 0 > ptest[25537]: 2009/09/22_21:37:56 CRIT: dump_node_scores: clone_color: > drbd0:0.lnodbct = 0 > ptest[2

Re: [Linux-HA] how to get group members

2009-08-19 Thread Dominik Klein
Ivan Gromov wrote: > Hi, all > > How to get group members? > I use crm_resource -x -t group -r group_Name. Can I get members without > xml part? What about crm configure show ? Regards Dominik

Re: [Linux-HA] Constraints works for one resource but not for another

2009-08-17 Thread Dominik Klein
Tobias Appel wrote: > Hi, > > I have a very weird error with heartbeat version 2.14. > > I have two IPMI resources for my two nodes. The configuration is posted > here: http://pastebin.com/m52c1809c > > node1 is named nagios1 > node2 is named nagios2 > > now I have ipmi_nagios1 (which should r

Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice (contd.)

2009-08-07 Thread Dominik Klein
Alain.Moulle wrote: > Hello Andrew, > Could you explain why this functionality is no longer available > (configuration > lines remain in ha.cf) ? ipfail was replaced by pingd in v2. That was in the very first version of v2 afaik. > And how should we proceed to avoid split-brain cases in a two-nod

Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice (contd.)

2009-08-05 Thread Dominik Klein
Alain.Moulle wrote: > Thanks Andrew, > > 1. So my understanding is that in a "more than 2 nodes cluster" , if > two nodes are failed, the have_quorum is set to 0 by the cluster soft > and the behavior is chosen by the administrator with the no-quorum-policy > parameter. So the question is now : w

Re: [Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Dominik Klein
Tobias Appel wrote: > On 08/05/2009 10:30 AM, Dominik Klein wrote: >> Tobias Appel wrote: >>> So all I need is a command line tool to check whether a resource is >>> currently started or not. I tried to check the resources with the >>> failcount command, but

Re: [Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Dominik Klein
Tobias Appel wrote: > Hi, > > I need a command to see if a resource is started or not. Somehow my IPMI > resource does not always start, especially on one node (for example if I > reboot the node, or have a failover). There is no error and nothing, it > just does nothing at all. > Usually I hav

Re: [Linux-HA] Adding to a group without downtime

2009-07-28 Thread Dominik Klein
Gavin Hamill wrote: > Hi :) > > I'm using the Lenny packages http://people.debian.org/~madkiss/ha/ and > have been enjoying success with pacemaker + heartbeat (I've used a > heartbeat v1 config for years without problems). > > I have a few IPaddr2 primitives in groups, but I'd like to understand

Re: [Linux-HA] updating cib without status attributes

2009-07-27 Thread Dominik Klein
>> You can try to compose the output of cibadmin -Q -o >> crm_config|resources|constraints to something usable for you. >> > > looks like I have to run the command once for each type and then > concatenate the results. That's sort of what I meant to say. Sorry for being unclear. Regards Dominik
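The "run it once per section and concatenate" idea from this exchange could be scripted roughly as follows. This is a sketch, not an official tool: the helper names `query_section` and `dump_config` are hypothetical, and the only cluster command used is the `cibadmin -Q -o <section>` call quoted in the thread.

```python
import subprocess

# Sections named in the thread; the status section is deliberately left out.
SECTIONS = ("crm_config", "resources", "constraints")


def query_section(section):
    """Run `cibadmin -Q -o <section>` and return its output as text."""
    result = subprocess.run(
        ["cibadmin", "-Q", "-o", section],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def dump_config(run=query_section, sections=SECTIONS):
    """Query each section in turn and join the results with newlines."""
    return "\n".join(run(section).strip() for section in sections)
```

Passing the runner as a parameter keeps the concatenation logic testable on a machine without a cluster; on a cluster node, `print(dump_config())` would print the three sections one after another.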

Re: [Linux-HA] updating cib without status attributes

2009-07-27 Thread Dominik Klein
Dave Augustus wrote: > On Mon, 2009-07-27 at 15:09 +0200, Dominik Klein wrote: >>> Is there a query/config dump setting that will dump the running config to >>> the command line without the status attributes? >> cibadmin -Q -o configuration > > What a quick repl

Re: [Linux-HA] updating cib without status attributes

2009-07-27 Thread Dominik Klein
> Is there a query/config dump setting that will dump the running config to the > command line without the status attributes? cibadmin -Q -o configuration

Re: [Linux-HA] Resource set question

2009-07-10 Thread Dominik Klein
Steinhauer Juergen wrote: > Hi guys! > > In my cluster setup, I have 6 IP addresses which should be started in > parallel for speed purpose, and two apps, depending on the six addresses. > > What would be the best way to configure this? > Putting all IPs in a group will start them one after anoth

Re: [Linux-HA] Stonith with APC Smart UPS1000 +Network ManagementCard

2009-07-10 Thread Dominik Klein
on: linux-ha-boun...@lists.linux-ha.org > [mailto:linux-ha-boun...@lists.linux-ha.org] On behalf of Dominik Klein > Sent: Friday, 10 July 2009 08:27 > To: General Linux-HA mailing list > Subject: Re: [Linux-HA] Stonith with APC Smart UPS1000 +Network ManagementCard > > Ehlers, Kolja

Re: [Linux-HA] Stonith with APC Smart UPS 1000 +Network ManagementCard

2009-07-09 Thread Dominik Klein
Ehlers, Kolja wrote: > Yeah it supports SSH but if I log in using SSH there is just a menu to > configure the card. Since I can enter only 2 digits at that > prompt > > 1- Control > 2- Diagnostics > 3- Configuration > 4- Detailed Status > 5- About UPS > > - Back, -

Re: [Linux-HA] Master-slave, stopping a slave.

2009-07-08 Thread Dominik Klein
c smith wrote: > Dominik- > > Thanks for the reply.. I'm aware that the documents advise against it, but > surely there must be a way. I was just looking at the new DRBD 8.3.2. It > includes a fencing handler script that, upon failure of a DRBD master, adds > a -INFINITY location constraint into

Re: [Linux-HA] Master-slave, stopping a slave.

2009-07-08 Thread Dominik Klein
c smith wrote: > Hi- > > I currently implement DRBD with Pacemaker. The DRBD resource is configured > as a multi-state Master-slave resource in which node1 is the default master > and node2 is the default slave. I am putting together a backup system that > will run some automated scheduled task

Re: [Linux-HA] Add resource to a group

2009-06-25 Thread Dominik Klein
ll try this in a bit, thanks for the tip >> >> >> On 6/25/09 1:20 AM, "Dominik Klein" wrote: >> >>> David Hoskinson wrote: >>>> Thanks Got it going again. However my amavisd service fails with a >>>> unknown exec error. Its

Re: [Linux-HA] Failover problem

2009-06-24 Thread Dominik Klein
The default value for stonith-enabled is true. If you however do not have a stonith device, that will give you an endless loop of unsuccessfully trying to shoot the other node before doing anything else to the resources the dead node was running. try crm configure property stonith-enabled=false c

Re: [Linux-HA] Add resource to a group

2009-06-24 Thread Dominik Klein
David Hoskinson wrote: > Thanks Got it going again. However my amavisd service fails with a > unknown exec error. Its the only one that won't work, and isn't related to > the group question. I have it setup the same as postfix, dovecot, etc. > > Primitive amavisd lsb:amavisd op monitor inte

Re: [Linux-HA] Monitoring resources

2009-05-26 Thread Dominik Klein
Koen Verwimp wrote: > Hi! > > I have defined a resource called rg_alfresco_ip. This resource consists of > an OCF script (AlfrescoIP). This script is a copy of IPAddr but with a > customized status/monitoring procedure.

Re: [Linux-HA] Problems With SLES11 + DRBD

2009-05-04 Thread Dominik Klein
Dominik Klein wrote: > darren.mans...@opengi.co.uk wrote: >> Hello everyone. Long post, sorry. >> >> >> >> I've been trying to get SLES11 with Pacemaker 1.0 / OpenAIS working for >> most of this week without success so far. I thought I may as well b

Re: [Linux-HA] Problems With SLES11 + DRBD

2009-05-04 Thread Dominik Klein
darren.mans...@opengi.co.uk wrote: > Hello everyone. Long post, sorry. > > > > I've been trying to get SLES11 with Pacemaker 1.0 / OpenAIS working for > most of this week without success so far. I thought I may as well bundle > my problems into one mail to see if anyone can offer any advice. >

Re: [Linux-HA] Assymetric Clustering

2009-04-22 Thread Dominik Klein
fsalas wrote: > Hi, I'm quite new to clustering and HeartBeat, but as far as I know, a very > nice package. > > Well, here is my problem, I'm willing to set up a cluster for a small > enterprise that will have several services located in virtual machines, to > make it simpler, let's say we have f

Re: [Linux-HA] Restart a service without the dependent services restarting?

2009-04-20 Thread Dominik Klein
Noah Miller wrote: > Hi - > > Is it possible to restart a clustered service (v2 cluster) without its > dependent services also stopping and starting? When the constraint score is advisory (0), dependencies should not be restarted, but then they are not really "dependencies" in the sense of the wo

Re: [Linux-HA] Re: Re: Re: Stopping the Heartbeat daemon does not stop the DRBD Daemon

2009-04-03 Thread Dominik Klein
>>> - The DRBD daemons provide the communication interface >>> for each network volume and are therefore an integral >>> part of the volume management. Without the DRBD daemons, >>> you (manually) and Heartbeat (automagically) could not >>> handle the DRBD volumes. >> Just to avoid confusion: There

Re: [Linux-HA] Re: Stopping the Heartbeat daemon does not stop the DRBD Daemon

2009-04-03 Thread Dominik Klein
Joe Bill wrote: > >> Stopping the Heartbeat daemon (service heartbeat stop) >> does not stop the DRBD daemon even if it is one of >> the resources. > > - Heartbeat and DRBD are 2 different products/packages > > - Like most services, DRBD doesn't need Heartbeat to run. You can set up and > run

Re: [Linux-HA] Stopping the Heartbeat daemon does not stop the DRBD Daemon

2009-04-02 Thread Dominik Klein
Jerome Yanga wrote: > Stopping the Heartbeat daemon (service heartbeat stop) does not stop the DRBD > daemon even if it is one of the resources. > > # service heartbeat stop > Stopping High-Availability services: >[ OK ] > # service dr

Re: [Linux-HA] HA Books

2009-04-02 Thread Dominik Klein
darren.mans...@opengi.co.uk wrote: > Hi. Can anyone recommend any good books about HA with regards to the > latest incarnations such as Pacemaker etc? I understand enough about the > CRM and heartbeat 2 to get by but lots of the stuff on this list still > goes over my head. > > Thanks. > > Darren

Re: [Linux-HA] Heartbeat v2 stickiness, score and more

2009-04-01 Thread Dominik Klein
florian.engelm...@bt.com wrote: > Hello, > I spent the whole afternoon to search for a good heartbeat v2 > documentation, but it looks like this is somehow difficult. Maybe > someone in here can help me? > > Anyway I have a short question about "stickiness". I only know about sun > cluster but I h

Re: [Linux-HA] showscores.sh for pacemaker 1.0.2

2009-04-01 Thread Dominik Klein
So here's an update. Michael Schwartzkopff pointed out a bug regarding groups. That has been fixed now and the appropriate values should be shown. Thanks! There's not been a lot of feedback, is it because nobody uses the script or does it just work for you? Regards Dominik Dominik K

Re: [Linux-HA] pingd/pacemaker

2009-03-31 Thread Dominik Klein
> I know. But this attribute does not exist in my setup. pacemaker version > 1.0.1-1. Is this a feature of 1.0.2? 1.0.1 is 4 months old. The RA was updated with those features 3 months ago. So basically, yes. You could still update the single RA from the mercurial repository though. Regards Domin

Re: [Linux-HA] pingd/pacemaker

2009-03-31 Thread Dominik Klein
Michael Schwartzkopff wrote: > On Tuesday, 31 March 2009 15:27:47, Dominik Klein wrote: >> Michael Schwartzkopff wrote: >>> Hi, >>> >>> I am testing the pingd from the provider pacemaker. As Dominik told me, >>> there is no need to define ping nodes i

Re: [Linux-HA] pingd/pacemaker

2009-03-31 Thread Dominik Klein
Michael Schwartzkopff wrote: > Hi, > > I am testing the pingd from the provider pacemaker. As Dominik told me, there > is no need to define ping nodes in the ha.cf any more. OK so far. > > As I see pingd tries to reach all pingnodes of the hostlist attribute every > 10 > seconds. Is it possibl

Re: [Linux-HA] DRBD / HAr2

2009-03-30 Thread Dominik Klein
Michael Judd wrote: > Hi, > > I'm using the ocf/heartbeat resource (Master/Slave OCF Resource Agent > for DRBD) with an ocf/heartbeat Filesystem resource to mount the > filesystem which is ext3. > > This can be mounted and unmounted on each node without heartbeat. > > However, when I fail over a

Re: [Linux-HA] Beginner questions

2009-03-23 Thread Dominik Klein
Juha Heinanen wrote: > Juha Heinanen writes: > > > the real problem is that start of mysql server by pacemaker stops > > altogether after a few manual stops (/etc/init.d/mysql stop). > > i think i figured this out. when pacemaker needed to start my > mysql-server resource three times on node l

Re: [Linux-HA] Beginner questions

2009-03-23 Thread Dominik Klein
> Is there some documentation available for openais? I can't even find a > good description of what it does or why you would use it. Also, will > this help with my 2nd question: having a few spares for a large number > of servers? While my objective with the squid cache is to proxy > everything

Re: [Linux-HA] Heartbeat degrades drbd resource

2009-03-23 Thread Dominik Klein
Dominik Klein wrote: > You cannot use drbd in heartbeat the way you configured it. > > Please refer to http://wiki.linux-ha.org/DRBD/HowTov2 Sorry, copy/paste error. I meant to say http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0

Re: [Linux-HA] Heartbeat degrades drbd resource

2009-03-23 Thread Dominik Klein
You cannot use drbd in heartbeat the way you configured it. Please refer to http://wiki.linux-ha.org/DRBD/HowTov2 and (if that wasn't made clear enough on the page) make sure the first thing you do is upgrade your cluster software. Read here on how to do that: http://clusterlabs.org/wiki/Install

Re: [Linux-HA] expected-quorum-votes

2009-03-23 Thread Dominik Klein
> crmd metadata tells me that expected-quorum-votes > is used to calculate quorum in openais based clusters. Its default value is > 2. Do I have to change this value if I have 3 or more nodes in an OpenAIS > based > cluster? No. It is automatically adjusted by the cluster. Regards Dominik
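To make the role of an expected-votes figure concrete, here is a minimal sketch of a simple-majority quorum check. The strict-majority rule and the function name `has_quorum` are assumptions for illustration; the thread itself does not spell out how the cluster computes quorum.

```python
def has_quorum(active_votes, expected_votes):
    """Simple-majority rule (an assumption, not taken from the thread):
    quorum requires strictly more than half of the expected votes."""
    return 2 * active_votes > expected_votes


if __name__ == "__main__":
    for active, expected in [(1, 2), (2, 3), (1, 3)]:
        print(active, expected, has_quorum(active, expected))
```

Under this rule a two-node cluster that loses one node cannot keep a majority, which is why the expected total matters and why it helps that the cluster adjusts it automatically as nodes are added.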

Re: [Linux-HA] maintenance-mode of pengine

2009-03-23 Thread Dominik Klein
Michael Schwartzkopff wrote: > Hi, > > In the metadata of the pengine I found the attribute maintenance-mode. I did > not find any documentation about it. The long description also says: "Should > the cluster ...". Does anybody know what this option does? > > Thanks. It disables resource manageme

Re: [Linux-HA] Beginner questions

2009-03-23 Thread Dominik Klein
Juha Heinanen wrote: > Dominik Klein writes: > > > Heartbeat in v1 mode (haresources configuration) cannot do any resource > > level monitoring itself. You'd need to do that externally by any > > means. > > yes, in v2 mode i have managed to make pacemaker

Re: [Linux-HA] Beginner questions

2009-03-23 Thread Dominik Klein
Les Mikesell wrote: > My first HA setup is for a squid proxy where all I need is to move an IP > address to a backup server if the primary fails (and the cache can just > rebuild on its own). This seems to work, but will only fail if the > machine goes down completely or the primary IP is unreacha

Re: [Linux-HA] drbd RA issue in (heartbeat 2.1.4 + drbd-8.3.0)

2009-03-19 Thread Dominik Klein
Dejan Muhamedagic wrote: > Hi, > > On Wed, Mar 18, 2009 at 11:37:27AM -0700, Neil Katin wrote: >> >> Dejan Muhamedagic wrote: >>> Hi, >>> >>> On Tue, Mar 17, 2009 at 11:56:04AM +0530, Arun G wrote: Hi, I observed below error message when I upgraded drbd to drbd-8.3.0 in heartbe

Re: [Linux-HA] pingnodes in openais

2009-03-18 Thread Dominik Klein
Michael Schwartzkopff wrote: > Hi, > > As far as I know pingnodes have to be configured in heartbeat. heartbeat > pings > the nodes and updates the CIB. > > Where can I configure pingnodes, when I use OpenAIS as the cluster stack? Create a pingd clone resource in the CIB. It's the preferred wa

Re: [Linux-HA] Having issues with getting DRBD to work with Pacemaker

2009-03-09 Thread Dominik Klein
Hi Jerome Yanga wrote: > Dominik, > > As usual, you are right on the money. I should have caught that myself. > Thank you for catching that for me. What happened was that I used a > different server to compile DRBD and I had assumed that Nomen and Rubic (my > test nodes) were on the same ke

Re: [Linux-HA] Having issues with getting DRBD to work with Pacemaker

2009-03-04 Thread Dominik Klein
Hi Jerome Yanga wrote: > Hi! I am having issues with getting DRBD to work with Pacemaker. I can get > Pacemaker and DRBD run individually but not DRBD managed by Pacemaker. I > tried following the instruction in the site below but the resources will not > go online. > > http://clusterlabs.o

[Linux-HA] showscores.sh for pacemaker 1.0.2

2009-03-03 Thread Dominik Klein
Hi I made the necessary changes to the showscores script to work with pacemaker 1.0.2. Please test and report problems. Has been reported to work by some people and should go into the repository soon. Still, I'd like more people to test and confirm. Important changes: * correctly fetch stickines

Re: [Linux-HA] showscores for pacamaker-1.0

2009-03-02 Thread Dominik Klein
> showscores gives me: > ~# ./showscores.sh > Resource Score Node Stickiness #Fail Fail-Stickiness > 50 0 > > on 50

Re: [Linux-HA] HA debug message

2009-02-16 Thread Dominik Klein
Tears ! wrote: > Dear members! > I have installed heartbeat for the first time, on Slackware 12.2. I have > enabled debugging in ha.cf > > Here are some debug messages I want to describe. > > Feb 14 23:01:15 haServer1 heartbeat: [15131]: WARN: Core dumps could be lost > if multiple dumps occur. > Feb

Re: [Linux-HA] Is it possible to cleanly take down a resource in a v1 config?

2009-02-12 Thread Dominik Klein
Hi heartbeat in v1 mode does not do resource monitoring by itself. So if you did not set up any custom resource monitoring, you can just stop your application in whatever way you normally do that and re-start it whenever you like. v1 clusters will not notice. They only see node state changes. Re

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-12 Thread Dominik Klein
s improves things, >> >> >> re Stonith I Uninstalled as part of the move from heartbeat v2.1 to 2.9 > but must have missed this bit. >> >> user land and kernel module all report the same version. >> >> I am on my way into the office now and I will apply th

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-11 Thread Dominik Klein
Dominik Klein wrote: > Right, this one looks better. > > I'll refer to nodes as 1001 and 1002. > > 1002 is your DC. > You have stonith enabled, but no stonith devices. Disable stonith or get > and configure a stonith device (_please_ don't use ssh). > > 1002 ha-lo

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-11 Thread Dominik Klein
Right, this one looks better. I'll refer to nodes as 1001 and 1002. 1002 is your DC. You have stonith enabled, but no stonith devices. Disable stonith or get and configure a stonith device (_please_ don't use ssh). 1002 ha-log lines 926:939, node 1002 wants to shoot 1001, but cannot (line 978). Retr

Re: [Linux-HA] Problem understanding resource fencing

2009-02-11 Thread Dominik Klein
J. Friedrich wrote: > Hi Michael, > HI group, > > I expanded my configuration and added the attributes > resource-failure-stickiness and resource-stickiness to one resource of > the "bar group" . I used "-30" for resource-failure-stickiness and > "100" for resource-stickiness. In old versions it

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-11 Thread Dominik Klein
The archive only contains info for one node and the logfile is empty. Did you use appropriate "-f" time and does ssh work between the nodes? So far, nothing obvious to me except for the order between your FS and DRBD lacking the role definition, but that's not what your problem is about (yet *g*).

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-11 Thread Dominik Klein
ion and logs. hb_report should gather everything needed and put it into a nice .bz2 archive :) Regards Dominik > Thanks > > Jason > > 2009/2/11 Dominik Klein > >> Hi Jason >> >> any chance you started drbd at boot or the drbd device was active at the >>

Re: [Linux-HA] Failovercluster considered one node down but state transition did not happen succesfully

2009-02-11 Thread Dominik Klein
Zemke, Kai wrote: > Hi, > > > > I'm running a two node failover cluster. Yesterday the cluster tried to > manage a state transition. In the log files I found the following entries: > > > > heartbeat[6905]: 2009/02/10_21:45:55 WARN: node nagios-drbd2: is dead > > heartbeat[6905]: 2009/02/1

Re: [Linux-HA] failed dependencies while installing heartbeat 2.99.2-6.1

2009-02-11 Thread Dominik Klein
Gerd König wrote: > Hi Dominik, > > thanks for answering quickly, but there were no dependencies found: > > #>zypper search openipmi > * Reading installed packages [100%] > No possible dependencies found. > > Do I need some additional software repositories ? I don't think so. The package

Re: [Linux-HA] failed dependencies while installing heartbeat 2.99.2-6.1

2009-02-10 Thread Dominik Klein
Gerd König wrote: > Hello list, > > I wanted to start with heartbeat using the latest sources for > OpenSuse10.3 64bit. > I've downloaded these rpm's: > > heartbeat-2.99.2-6.1.x86_64.rpm > heartbeat-common-2.99.2-6.1.x86_64.rpm > heartbeat-debuginfo-2.99.2-6.1.x86_64.rpm > heartbeat-resources-2.9

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-10 Thread Dominik Klein
=1, cib-update=380, confirmed=true) complete > unknown e > rror > . > > I have checked the DRBD device Storage1 and it is in secondary mode after > the start, and should I choose I can make it primary on either node > > Thanks > > Jason > > 2009/2/10 Jason

Re: [Linux-HA] DRBD in a 2 node cluster

2009-02-10 Thread Dominik Klein
Jason Fitzpatrick wrote: >> Hi All >> >> I am having a hell of a time trying to get heartbeat to fail over my DRBD >> harddisk and am hoping for some help. >> >> I have a 2 node cluster, heartbeat is working as I am able to fail over IP >> Addresses and services successfully, but when I try to fail

Re: [Linux-HA] STONITH / ssh not working in a 2-node cluster

2009-02-06 Thread Dominik Klein
Tobias Appel wrote: > On Fri, 2009-02-06 at 14:52 +0100, Dominik Klein wrote: >> Just guessing, but your cluster does not know about those "external" >> addresses, does it? >> > Well I have an internal ip added to /etc/hosts and a shortcut for the > hostna

Re: [Linux-HA] STONITH / ssh not working in a 2-node cluster

2009-02-06 Thread Dominik Klein
Just guessing, but your cluster does not know about those "external" addresses, does it? Sounds like you're only using one connection between the nodes for cluster communication, pull that, see split-brain and want the cluster to use a connection it does not know about. But even if you configured

Re: [Linux-HA] OCF_ERROR_GENERIC

2009-02-04 Thread Dominik Klein
It is OCF_ERR_GENERIC, not OCF_ERROR_GENERIC. Read /usr/lib/ocf/resource.d/heartbeat/.ocf-returncodes You can also use ocf-tester to test your ocf script. Regards Dominik lakshmipadmaja maddali wrote: > Hi all, > > I have a strange issue, that ocf_error_generic is being > ignored at tim
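For reference, the point about the constant's name can be shown with a small lookup table of OCF return codes. The enum below lists the commonly documented values; the `OcfReturnCode` class and the `lookup` helper are hypothetical names used only for this sketch, and the `.ocf-returncodes` file mentioned above remains the authoritative source on a given system.

```python
from enum import IntEnum


class OcfReturnCode(IntEnum):
    """Commonly documented OCF resource agent exit codes."""
    OCF_SUCCESS = 0
    OCF_ERR_GENERIC = 1
    OCF_ERR_ARGS = 2
    OCF_ERR_UNIMPLEMENTED = 3
    OCF_ERR_PERM = 4
    OCF_ERR_INSTALLED = 5
    OCF_ERR_CONFIGURED = 6
    OCF_NOT_RUNNING = 7


def lookup(name):
    """Return the numeric code for `name`, or None if no such name exists."""
    member = OcfReturnCode.__members__.get(name)
    return None if member is None else int(member)
```

Here `lookup("OCF_ERR_GENERIC")` yields 1 while `lookup("OCF_ERROR_GENERIC")` yields `None`. In a shell resource agent a misspelled variable usually expands to an empty string rather than raising an error, which may explain why such a mistake is easy to overlook.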

Re: [Linux-HA] Alternative to monitor network instead of using pingd?

2009-02-02 Thread Dominik Klein
Maybe you should grab a cup of coffee, lean back and start over again. pingd is working fine for probably 1000s of people ... so it _may_ be your setup that is not correct yet. Maybe share your configuration? Regards Dominik Tobias Appel wrote: > Well I've given up on pingd, I just can't get it t

Re: [Linux-HA] getting heartbeat2, drbd, and xen to work

2009-02-02 Thread Dominik Klein
First of all: Please upgrade to the latest pacemaker software. You'll have a lot less pain getting things to work and get support here. Read http://clusterlabs.org/wiki/Install on how to install the new software. Michael Grant wrote: > I'm trying to set up heartbeat-2 to manage xen in a drbd devi

Re: [Linux-HA] Simple STONITH question

2009-02-02 Thread Dominik Klein
Terry Hull wrote: > I have been looking at STONITH examples on the web and am somewhat confused. > I can't figure out how STONITH knows how to kill an individual server. I've > been looking at some examples such as this one. How does heartbeat know > that this will shut off node01. I see the nod

Re: [Linux-HA] Failover not working as I expected

2009-02-01 Thread Dominik Klein
> Moreover, even hb_gui shows that the services are bounced/restarted when a > node joins the cluster. The status of the resources changes to "failed" for > a second and changes back to "running on". Sounds like a bug in your RA to me. Regards Dominik

Re: [Linux-HA] Linux-HA configuration on SLES 10.2 problem

2009-01-29 Thread Dominik Klein
peteridah wrote: > Hello, > > I have set up a 2-node heartbeat cluster on Suse Linux 10.2. I am using IBM > RSA SlimLine II adapter cards on IBM x3655 servers connected to a SAN > device. So far I have set up an ext3 filesystem, IP address & > external/ibmrsa-telnet stonith resources which seem to lo

Re: [Linux-HA] Failover not working as I expected

2009-01-28 Thread Dominik Klein
Good morning Jerome we should make this a daily thing, shouldn't we? Jerome Yanga wrote: > Dominik, > > I apologize for leaving resource-stickiness out. I had it there previously > but due to the trial and errors I had performed on the crm shell, I had > forgotten to re-add it. Nevertheless,

Re: [Linux-HA] Absolute values of stickiness parameters

2009-01-28 Thread Dominik Klein
Hi Zakh, Rami wrote: > Hi, > > Sorry if I am being redundant here, but I could not locate a clear-cut answer > to this query. > > As far as I understand, a resource will be failed over and declared unable to > run on a given node as soon as the product of its "failcount" and its > "resource

Re: [Linux-HA] Failover not working as I expected

2009-01-27 Thread Dominik Klein
> > > > timeout="3s"/> > > > id="Emergency_Contact"> > > name="email" value="jya...@esri.com"/> > name=&qu

Re: [Linux-HA] Failover not working as I expected

2009-01-26 Thread Dominik Klein
Jerome Yanga wrote: > Andrew, > > I apologize for my sending my previous email abruptly. > > I have followed your recommendation and installed Pacemaker. > > Here is my config. > > Packages Installed: > heartbeat-2.99.2-6.1 > heartbeat-common-2.99.2-6.1 > heartbeat-debug-2.99.2-6.1 > heartbeat-

Re: [Linux-HA] Documentation on how to use crm(live)

2009-01-26 Thread Dominik Klein
Jerome Yanga wrote: > Please provide the documentation on how to use crm(live). > > I have found this but it is not detailed enough on how to use the rest of the > options. > > http://clusterlabs.org/wiki/Example_configurations > > I have been using crm(live) because I cannot seem to get the ri

Re: [Linux-HA] cibadmin won't parse input file

2009-01-26 Thread Dominik Klein
Tobias Appel wrote: > Hi, > > cibadmin just won't parse my input file, I've rewritten it twice now and > can't spot the error - maybe I haven't had enough coffee yet but this > feels like one of those games where you have two pictures and need to > spot the errors...sigh > > my xml looks like thi

Re: [Linux-HA] resource_stickiness and groups - how it is calculated?

2009-01-26 Thread Dominik Klein
Tobias Appel wrote: > On Mon, 2009-01-26 at 13:04 +0100, Andrew Beekhof wrote: >> On Mon, Jan 26, 2009 at 12:10, Tobias Appel wrote: >>> Well I've got a lot of questions today as you can see :) >>> >>> I have a group of resources which is ordered and colocated (due to drbd >>> master / slave const

Re: [Linux-HA] pingd with multiple networks

2009-01-25 Thread Dominik Klein
Christian Charles wrote: > Dejan Muhamedagic wrote: >> Hi, >> >> On Fri, Jan 23, 2009 at 02:42:35PM +0100, Christian Charles wrote: >> >>> Hello, >>> >>> i used the guide at http://www.linux-ha.org/PingdWithMultipleNetworks >>> to monitor connectivity to 2 different nets and it works. >>> >>> I d

Re: [Linux-HA] Problem in switchover DRBD disk

2009-01-21 Thread Dominik Klein
Hi Stefano > that's my first attempt to mount a HA cluster. > value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/> Before looking any further: If this is your first attempt, then why use such old software? Upgrade to the latest heartbeat (preferably even openais) and pac

Re: [Linux-HA] cache question

2009-01-20 Thread Dominik Klein
> Is there any way to set up a cache? > When user A with IP-A visits my website, > I want heartbeat to cache his requests and always forward them for 300 > seconds to server A and not to server B or C. > > Is there any way to configure that? Are you sure you're on the right mailing list? What y

Re: [Linux-HA] Failover not working as I expected

2009-01-19 Thread Dominik Klein
Jerome Yanga wrote: > Dominik, > > Thank you very much. Adding "resource-stickiness" and getting rid of the > constraint helped a lot. The resources do not go back to Nomen anymore > when its heartbeat is started again (resources stay with Rubric). > However, the resources still get bounce

Re: [Linux-HA] Timeout IP - monitor

2009-01-16 Thread Dominik Klein
Thomas Roth wrote: > Hi all, > > I have the following primitive in my cib.xml: > > type="IPaddr"> > > timeout="15s"/> > > > > > > > > > > Sometimes I get the following e

Re: [Linux-HA] Meta data syntax

2009-01-16 Thread Dominik Klein
Michele Codutti wrote: > Heartbeat 2.1.3 on a debian etch taken from debian backports. http://developerbugs.linux-foundation.org/show_bug.cgi?id=1871 Here's the bug report and fix. I think that was after 2.1.3. Regards, Dominik

Re: [Linux-HA] Failover not working as I expected

2009-01-15 Thread Dominik Klein
Hi Jerome > The names of the servers are as follows: Nomen and Rubric. > > Let us start when Nomen owns all resources and its status states > "running(dc)". When I stop heartbeat on Nomen, Rubric takes over all the > resources and its status turns into "running(dc)". This is good as this is

Re: AW: [Linux-HA] Problem with linux-ha and drbd (ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem)

2009-01-15 Thread Dominik Klein
Sebastian Kösters wrote: > Does none of you have an idea how to fix this problem? > > -Original Message- > From: linux-ha-boun...@lists.linux-ha.org > [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Sebastian Kösters > Sent: Monday, 12 January 2009 23:10 > To: linux-ha@lists

Re: [Linux-HA] Running Two Different Versions of Heartbeat

2009-01-15 Thread Dominik Klein
h...@buglecreek.com wrote: > I have a fairly simple 2 node cluster running drbd, drbdlinks and > heartbeat. It runs the services httpd, mysql and smb. The drbd disk is > on a separate disk from the OS. > > The cluster's OS is Fedora Core 8 and runs heartbeat-2.1.2-2.fc8 and > uses R1-style con

Re: [Linux-HA] Meta data syntax

2009-01-15 Thread Dominik Klein
Which version are you using? That's a known and fixed bug from a rather old version. Unfortunately, the bugzilla is not available at the moment. But searching for bugs with keyword "meta" once it is back should get you to the changeset. Regards Dominik Michele Codutti wrote: > Hello, I'm workin
