Re: [Linux-HA] lsb resource not starting during bootup

2010-12-13 Thread Pavlos Parissis
On 11 December 2010 16:48, Pavlos Parissis pavlos.paris...@gmail.com wrote: On 9 December 2010 09:44, Linux Cook linuxc...@gmail.com wrote: Hi! I have an LSB resource that does not run during bootup but runs successfully after issuing the command: Any application that is under cluster control should

Re: [Linux-HA] Two nodes can't see each other, how can I diagnose?

2010-12-13 Thread Pavlos Parissis
On 14 December 2010 08:03, Bin Chen(sunwen_ling) binary.c...@gmail.com wrote: Hi, I have a two-node setup that worked well before; each node could see the other. But today somehow the nodes can't discover each other; crm status on one node always shows the other node as offline, but I

Re: [Linux-HA] lsb resource not starting during bootup

2010-12-11 Thread Pavlos Parissis
On 9 December 2010 09:44, Linux Cook linuxc...@gmail.com wrote: Hi! I have an LSB resource that does not run during bootup but runs successfully after issuing the command: Any application that is under cluster control should be started only by the cluster and not by the system via the init process crm
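
A minimal sketch of how one might verify that an init script is no longer started at boot, assuming a SysV layout with rcN.d directories containing S??name links. The function name `boot_start_links` and the default root `/etc/rc.d` are illustrative assumptions (Debian-style systems keep these directories directly under `/etc`); neither comes from the thread:

```shell
#!/bin/sh
# List the SysV start links (S??<service>) found for a service.
# $1 = service name
# $2 = directory holding rc0.d..rc6.d (default /etc/rc.d is an assumption;
#      adjust it for your distribution)
boot_start_links() {
    svc=$1
    root=${2:-/etc/rc.d}
    for link in "$root"/rc[0-6].d/S[0-9][0-9]"$svc"; do
        # An unmatched glob stays literal, so check that the entry exists.
        if [ -e "$link" ] || [ -L "$link" ]; then
            echo "$link"
        fi
    done
}
```

If the function prints nothing for a cluster-managed service, init has no start link for it and only the cluster will start it. On chkconfig-based systems, `chkconfig some_service off` (service name hypothetical) is the usual way to drop such start links.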

Re: [Linux-HA] failed authentication between nodes?

2010-12-09 Thread Pavlos Parissis
On 9 December 2010 18:06, sunitha kumar skumar.na...@gmail.com wrote: See failed authentication between nodes : WARN: string2msg_ll: node [xxx] failed authentication and their status becomes OFFLINE Heartbeat is running on both nodes. Any pointers on what authentication checks are made

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-09 Thread Pavlos Parissis
On 9 December 2010 17:09, Igor Chudov ichu...@gmail.com wrote: On Thu, Dec 9, 2010 at 9:31 AM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: See LRM operation WebSite_start_0 unknown error from November, that's where your pdf led me. By the time I hit unknown error starting drbd resource -- set

Re: [Linux-HA] Regarding split brain

2010-12-09 Thread Pavlos Parissis
On 10 December 2010 04:36, Preeti Jain preeti_8...@yahoo.com wrote: Hello list, I am testing the network failure case by removing the NIC cable on one node and getting unwanted outcomes: the whole cluster gets disturbed and the resource appears to move between nodes until it becomes stable on one node

Re: [Linux-HA] pacemaker with dual primary drbd resource

2010-12-07 Thread Pavlos Parissis
On 7 December 2010 21:25, Linux Cook linuxc...@gmail.com wrote: hi! Where can I read good documentation on how to configure pacemaker in order to promote a dual-primary drbd resource? I think setting meta master-max=2 on the ms clone for DRBD should be enough, take a look here

Re: [Linux-HA] How to monitor the nic link status

2010-12-03 Thread Pavlos Parissis
On 2 December 2010 08:27, Nikita Michalko michalko.sys...@a-i-p.com wrote: Hi Pavlos! On Tuesday, 30 November 2010 22:28, Pavlos Parissis wrote: Hi Nikita, On 30 November 2010 08:42, Nikita Michalko michalko.sys...@a-i-p.com wrote: Hi Pavlos, On Tuesday, 30 November 2010 05:59

Re: [Linux-HA] How to monitor the nic link status

2010-12-03 Thread Pavlos Parissis
On 4 December 2010 01:12, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: Pavlos Parissis wrote: I've just done a quick test. Since I don't have physical access to the systems, I shut down the port on the switch and ran ping on the IP assigned to the interface, which has NO-CARRIER flag on and flag

Re: [Linux-HA] How to monitor the nic link status

2010-12-02 Thread Pavlos Parissis
Hi Nikita, On 2 December 2010 08:27, Nikita Michalko michalko.sys...@a-i-p.com wrote: Hi Pavlos! On Tuesday, 30 November 2010 22:28, Pavlos Parissis wrote: Hi Nikita, On 30 November 2010 08:42, Nikita Michalko michalko.sys...@a-i-p.com wrote:  - what about configure monitor operation

Re: [Linux-HA] How to set the node priority in pacemaker

2010-12-01 Thread Pavlos Parissis
2010/12/1 Mia Lueng xiaozun...@gmail.com: Sometimes we need to set a resource to prefer running on a particular node. When the preferred node or some dependent resources (ip, filesystem) fail, the resource will fail over to another node; and if the node or its dependent resources recover, the resource

Re: [Linux-HA] How to monitor the nic link status

2010-11-30 Thread Pavlos Parissis
Hi Nikita, On 30 November 2010 08:42, Nikita Michalko michalko.sys...@a-i-p.com wrote: Hi Pavlos, On Tuesday, 30 November 2010 05:59, Pavlos Parissis wrote: On 29 November 2010 23:43, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Mon, Nov 29, 2010 at 10:24:17PM +0800, Mia Lueng

Re: [Linux-HA] How to monitor the nic link status

2010-11-30 Thread Pavlos Parissis
On 30 November 2010 10:31, Max nospam-105177-...@slb.com wrote: Nikita, ...  - what about configure monitor operation of IP in cib.xml - sth. like this:    resources       primitive id=IPaddr_194_37_40_42 class=ocf provider=heartbeat type=IPaddr          meta_attributes

Re: [Linux-HA] How to monitor the nic link status

2010-11-30 Thread Pavlos Parissis
On 30 November 2010 22:55, Pavlos Parissis pavlos.paris...@gmail.com wrote: By mistake I hit send without finishing my previous mail. What I was going to say is that what I use for now is the following primitive ping ocf:pacemaker:ping \        params host_list=192.168.78.4 name=ping \        op monitor interval

Re: [Linux-HA] confused in two node heartbeat cluster

2010-11-30 Thread Pavlos Parissis
On 30 November 2010 19:22, Mia Lueng xiaozun...@gmail.com wrote: I've setup a two-node cluster in sles11 sp1.   I use sbd as the stonith device. Here is my configuration: #crm configure show node hbtest01 \        attributes standby=off node hbtest02 \        attributes standby=off

Re: [Linux-HA] Why Did the Resource Fail Back when Resource Stickiness was Set?

2010-11-30 Thread Pavlos Parissis
On 25 November 2010 16:08, Robinson, Eric eric.robin...@psmnv.com wrote: FYI -- additional info. 1. I did an 'unmove' for resources g_clust04 and g_clust05, which removed the 'location cli-prefer' statements from the crm config. The resources stayed where they were. 2. I changed

Re: [Linux-HA] How to monitor the nic link status

2010-11-29 Thread Pavlos Parissis
On 29 November 2010 15:24, Mia Lueng xiaozun...@gmail.com wrote: Hi: I have configured a cluster with two nodes.    Lan setting is A eth0:  192.168.10.110 eth1: 172.16.0.1 B eth0:192.168.10.111 eth1: 172.16.0.2 I have configured a resource ip_0  192.168.10.100  on eth0. But when I

Re: [Linux-HA] How to monitor the nic link status

2010-11-29 Thread Pavlos Parissis
On 29 November 2010 23:43, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Mon, Nov 29, 2010 at 10:24:17PM +0800, Mia Lueng wrote: Hi: I have configured a cluster with two nodes.    Lan setting is A eth0:  192.168.10.110 eth1: 172.16.0.1 B eth0:192.168.10.111 eth1: 172.16.0.2 I have

Re: [Linux-HA] migration-threshold global vs resource specific

2010-11-26 Thread Pavlos Parissis
On 26 November 2010 14:52, Kris Buytaert m...@inuits.be wrote: Imagine that I have 3 resources.  d_A, d_B and d_C.   They all have to run together on the same node however I want the behaviour upon failure of a service to be different. Define failure? Failure to stop or start or monitor  If

Re: [Linux-HA] Fail over cluster

2010-11-19 Thread Pavlos Parissis
On 19 November 2010 12:13, benjamin fernandis benjo11...@gmail.com wrote: Hi, As per your information, I installed heartbeat + pacemaker on both nodes. Then I configured the ha.cf and authkeys files on both nodes. So now please guide me on how to set haresources, because as per my understanding if we

Re: [Linux-HA] 3-Node DRBD Cluster without Stacking

2010-11-16 Thread Pavlos Parissis
On 16 November 2010 17:15, Robinson, Eric eric.robin...@psmnv.com wrote: I'm not sure if this list or the DRBD list is the right one to ask this. Is it possible to deploy a 3-node CRM-based cluster where: -- nodes A and C share resource R1 on /dev/drbd0 -- nodes B and C share

Re: [Linux-HA] resource is active and should not be ERROR!

2010-11-12 Thread Pavlos Parissis
On 12 November 2010 05:23, Linux Cook linuxc...@gmail.com wrote: I am having a problem when starting postgresql from heartbeat (/etc/ha.d/haresources). HA says CRITICAL: Resource postgresql is active, and should not be! any thoughts? Have you checked before starting up the resource if

Re: [Linux-HA] heartbeat multicast status

2010-11-10 Thread Pavlos Parissis
I can confirm that on the same release, but I have no idea if it is normal or not. It could be expected behavior due to the use of multicast [r...@node-01 tmp]# ./checkcl_status.pl /usr/bin/cl_status hblinkstatus node-01 eth0 node-01 eth0: dead /usr/bin/cl_status hblinkstatus node-01 eth1 node-01

Re: [Linux-HA] 802.1Q VLAN problem: Removing ethN.vlan:first also removes ethN.vlan:rest (on CentOS 5.5)

2010-11-05 Thread Pavlos Parissis
How about using vlan configuration like below where vlan interface is an interface by itself? I don't have cluster software on this system, so I don't know if it will make any difference, but you can try out ifconfig -a bond0 Link encap:Ethernet HWaddr 18:A9:05:46:B0:38 inet

Re: [Linux-HA] Help me adjust my ha.cf settings

2010-11-05 Thread Pavlos Parissis
On 5 November 2010 20:32, mike mgbut...@nbnet.nb.ca wrote: Hi all, I'm running a simple MySQL cluster on a very heavily loaded LPAR and experiencing some outages due to late heartbeat packets, Gmain timeouts and so on. Before we look at the settings, do you know if keepalives are lost due to

Re: [Linux-HA] Renaming hostnames and changing IPs

2010-11-04 Thread Pavlos Parissis
On 4 November 2010 09:17, Stephane Neveu stephane.ne...@thalesgroup.com wrote: Hi all, I'm experiencing a problem trying to rename 2 nodes with different hostnames (and others ip...). I've changed both /etc/hosts (the DNS is ok by the way) ha.cf is ok with new settings and new IPs. Both

Re: [Linux-HA] Question about time between actions on resources

2010-11-03 Thread Pavlos Parissis
On 3 November 2010 07:31, Alain.Moulle alain.mou...@bull.net wrote: Hi, If you execute a crm action (whatever action) on a resource, let's say for example a stop ... on res1, and if just after, you execute whatever other crm action on another resource, for example a stop on res2, there is

Re: [Linux-HA] Question about time between actions on resources

2010-11-03 Thread Pavlos Parissis
On 3 November 2010 11:31, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Wed, Nov 03, 2010 at 11:16:14AM +0100, Pavlos Parissis wrote: On 3 November 2010 07:31, Alain.Moulle alain.mou...@bull.net wrote: Hi, If you execute a crm action (whatever action) on a resource, let's say

Re: [Linux-HA] Recommended settings for keepalive

2010-11-02 Thread Pavlos Parissis
On 2 November 2010 16:14, mike mgbut...@nbnet.nb.ca wrote: On 10-11-02 11:52 AM, Dejan Muhamedagic wrote: Hi, On Tue, Nov 02, 2010 at 11:13:49AM -0300, mike wrote: Hi guys, Can you tell me what you would recommend for the following settings in the ha.cf file: Here are my settings. #

Re: [Linux-HA] search heartbeat+drbd postgresql tutorial

2010-10-28 Thread Pavlos Parissis
On 28 October 2010 11:19, ramarovelo art...@gulfsat.mg wrote: Hi list! can someone tell me where i can find how-tos for heartbeat + drbd with postgresql? I am afraid you have to use pacemaker as resource manager, so here is all the relevant doc http://www.clusterlabs.org/wiki/Documentation

Re: [Linux-HA] Redundant Rings Still Not There?

2010-10-27 Thread Pavlos Parissis
On 24 October 2010 22:09, Robinson, Eric eric.robin...@psmnv.com wrote: But now could someone please elaborate on Dejan Muhamedagic's original comment that started the thread? What does redundant rings are still not there mean? Is a three-node cluster an unreliable setup because

Re: [Linux-HA] Redundant Rings Still Not There?

2010-10-27 Thread Pavlos Parissis
On 24 October 2010 23:24, Robinson, Eric eric.robin...@psmnv.com wrote: Heartbeat is not deprecated, it is still supported by Linbit folks. (Many thanks to them). But if you need clvmd, GFS2, you would have to use corosync, for example. It may not be deprecated per se, but there is no

Re: [Linux-HA] My Second Pacemaker Cluster

2010-10-20 Thread Pavlos Parissis
On 20 October 2010 14:02, Robinson, Eric eric.robin...@psmnv.com wrote: Drat, my drawing was 77 characters wide, got wrapped, and now it is practically unreadable in the list. :-( Make a visio/dia drawing, upload it somewhere and post the link. 1 picture is always worth 1000 words :-)

Re: [Linux-HA] Stonith log entries

2010-10-19 Thread Pavlos Parissis
On 14 October 2010 02:29, mike mgbut...@nbnet.nb.ca wrote: Fedora 13 on i686 btw. On 10-10-13 09:26 PM, mike wrote: Hi all, I've started building a simple 2 node http cluster. I've built several clusters so this should be a joke. I got the first node fired up and noticed these

Re: [Linux-HA] /etc/hosts VS node directive in ha.cf

2010-10-07 Thread Pavlos Parissis
On 6 October 2010 14:12, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Fri, Oct 01, 2010 at 08:45:58AM +0200, Pavlos Parissis wrote: Hi, It was mentioned in this thread [1] that Heartbeat does care about the content of /etc/hosts file. That statement triggered me because in my

[Linux-HA] /etc/hosts VS node directive in ha.cf

2010-10-01 Thread Pavlos Parissis
Hi, It was mentioned in this thread [1] that Heartbeat does care about the content of the /etc/hosts file. That statement caught my attention because in my setup /etc/hosts doesn't resolve the nodename configured in ha.cf to an IP address which is part of the 2 cluster LANs. My systems have 3 different

[Linux-HA] exit code for status when program is stopped, 7 or 3

2010-10-01 Thread Pavlos Parissis
Hi, I am checking if my script which starts a program is LSB compliant. So, I followed the steps mentioned here [1] and one of the steps says Status (stopped): /etc/init.d/some_service status ; echo result: $? Did the script accept the command? Did the script indicate the service was not
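
For reference, the LSB specification defines exit code 3 for `status` when the program is not running, while OCF resource agents use 7 (`OCF_NOT_RUNNING`) for a monitor on a stopped resource. A minimal sketch of an LSB-style status check, assuming a pidfile-based daemon; the function name `lsb_status` and the pidfile argument are illustrative and not taken from the thread:

```shell
#!/bin/sh
# LSB-style status check based on a pidfile.
# Returns 0 = running, 1 = dead but pidfile exists, 3 = not running.
lsb_status() {
    pidfile=$1
    if [ ! -f "$pidfile" ]; then
        return 3            # no pidfile: program is not running
    fi
    pid=$(cat "$pidfile")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 0            # process exists and can be signalled
    fi
    return 1                # stale pidfile: program is dead
}
```

An init script would call this from its `status` action and exit with the returned code, so that `/etc/init.d/some_service status ; echo result: $?` prints 3 for a stopped service.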

Re: [Linux-HA] exit code for status when program is stopped, 7 or 3

2010-10-01 Thread Pavlos Parissis
On 1 October 2010 10:38, Andrew Beekhof and...@beekhof.net wrote: On Fri, Oct 1, 2010 at 10:25 AM, Pavlos Parissis pavlos.paris...@gmail.com wrote: Hi, I am checking if my script which starts a program is LSB compliant. So, I followed the steps mentioned here [1] and one of the steps

[Linux-HA] ccm slow start

2010-09-20 Thread Pavlos Parissis
have a misconfiguration system? Regards, Pavlos Parissis Sep 20 14:24:49 node-01 heartbeat: [9240]: info: Enabling logging daemon Sep 20 14:24:49 node-01 heartbeat: [9240]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf) Sep 20 14:24:50 node-01 heartbeat

Re: [Linux-HA] active or passive status of a node

2010-09-20 Thread Pavlos Parissis
On 20 September 2010 16:06, John Adams mailingli...@belfin.ch wrote: Hi running with heartbeat 3.0.2 I set up a active-passive 2-node cluster. If I transfer all resources to the other node is there a command that returns which of both clusters is active? Or is it possible to find out on a

Re: [Linux-HA] active or passive status of a node

2010-09-20 Thread Pavlos Parissis
On 20 September 2010 17:07, Pavlos Parissis pavlos.paris...@gmail.com wrote: On 20 September 2010 16:06, John Adams mailingli...@belfin.ch wrote: Hi running with heartbeat 3.0.2 I set up a active-passive 2-node cluster. If I transfer all resources to the other node is there a command

Re: [Linux-HA] Multicast or unicast

2010-09-17 Thread Pavlos Parissis
On 17 September 2010 12:03, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Thu, Sep 16, 2010 at 11:08:44PM +0200, Pavlos Parissis wrote: Hi, I have 2 dedicated LANs to be used by the Cluster+DRBD and I would like to know what is the recommended method for the communication path. What

Re: [Linux-HA] Multicast or unicast

2010-09-17 Thread Pavlos Parissis
On 17 September 2010 15:42, Dejan Muhamedagic deja...@fastmail.fm wrote: On Fri, Sep 17, 2010 at 12:43:44PM +0200, Pavlos Parissis wrote: [...snip...] I'd go with ucast. and why? Because it's the simplest. There's nothing to gain by using broadcast or multicast. Of course, if you expect

[Linux-HA] Multicast or unicast

2010-09-16 Thread Pavlos Parissis
Hi, I have 2 dedicated LANs to be used by the Cluster+DRBD and I would like to know what is the recommended method for the communication path. What are the pros and cons of each? I know that if someone has to use a communication path over shared network, the recommended method is unicast since