[Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay

2012-05-09 Thread nozawat
Hi Dejan,

 An error currently occurs when the LDAP server (slapd) is slow to start.
 I am sending a patch that fixes it.

 I have opened a pull request on GitHub:
 
https://github.com/nozawat/resource-agents/commit/bbc0934ea7b906f818e3c7e00bfaaa61ae46bc51

Regards,
Tomo

-- 
Tomoya Nozawa

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay

2012-05-09 Thread Dejan Muhamedagic
Hi Tomoya-san,

On Wed, May 09, 2012 at 06:14:34PM +0900, noza...@gmail.com wrote:
> Hi Dejan,
>
> An error currently occurs when the LDAP server (slapd) is slow to start.
> I am sending a patch that fixes it.
>
> I have opened a pull request on GitHub:
>
> https://github.com/nozawat/resource-agents/commit/bbc0934ea7b906f818e3c7e00bfaaa61ae46bc51

OK. Though the description doesn't seem to fit the change.
Shouldn't it be something like this:

Medium: slapd: always set the exit code correctly in monitor

In case ldapsearch would exit with an error code different from
49, the exit code may have been wrong.
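
A minimal sketch of the idea behind this commit message, assuming a monitor action that maps the exit status of ldapsearch onto OCF return codes. The helper name `map_ldapsearch_rc` and the choice of OCF code for each status are illustrative assumptions, not the actual slapd agent code; the point is only that every branch, including the catch-all, sets the return value explicitly.

```shell
#!/bin/sh
# Standard OCF return codes (normally provided by .ocf-shellfuncs; defined
# here only so the sketch is self-contained).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_CONFIGURED=6

# Hypothetical helper: translate an ldapsearch exit status into an OCF code.
# 49 is the LDAP result code for invalid credentials.
map_ldapsearch_rc() {
    case "$1" in
        0)  return $OCF_SUCCESS ;;
        49) return $OCF_ERR_CONFIGURED ;;   # assumed mapping for this sketch
        *)  return $OCF_ERR_GENERIC ;;      # any other status is also handled
    esac
}
```

In a real agent the argument would be `$?` captured immediately after the ldapsearch call.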

Cheers,

Dejan



Re: [Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay

2012-05-09 Thread Dejan Muhamedagic
Hi again,

On Wed, May 09, 2012 at 06:14:34PM +0900, noza...@gmail.com wrote:
> Hi Dejan,

Lately it seems as if all emails to linux-ha-dev are addressed to me.
Happily, I'm not the only one around watching. Nor should I feel like
the only one responsible for the state of affairs in our project. I just
thought I should straighten things out before it gets too late ;-)

Cheers,

Dejan

P.S. Not to be taken personally!


Re: [Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay

2012-05-09 Thread 野沢 智也
Hi Dejan,

 Thank you for your comment.
 I have made the changes and opened the pull request again.

Regards,
Tomo

-- 
Tomoya Nozawa



Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?

2012-05-09 Thread alain . moulle
Hi,

Just an idea, to be verified:
this usually happens because the services are launched at boot time. When
the standby node is rebooted, Pacemaker detects that a service is running
on more than one node, so it stops the resource(s) and starts each again
on only one node.
Again, this needs to be checked (chkconfig etc.).
Alain
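
One way to carry out the chkconfig check is sketched below, assuming a SysV-init system where `chkconfig --list` prints one line per service with `N:on` / `N:off` columns. The helper name `enabled_at_boot` is made up for this example.

```shell
#!/bin/sh
# Read `chkconfig --list` style lines on stdin and print the name of every
# service that is switched on in at least one runlevel.
enabled_at_boot() {
    awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^[0-6]:on$/) { print $1; break } }'
}

# Possible use on a cluster node (the grep pattern is only an example):
#   chkconfig --list | enabled_at_boot | grep '^mysql_'
```

Any cluster-managed service that shows up in the output is also started by init at boot and is a candidate for the behaviour described above.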



From: Robinson, Eric eric.robin...@psmnv.com
To: linux-ha@lists.linux-ha.org
Date: 07/05/2012 20:27
Subject: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?
Sent by: linux-ha-boun...@lists.linux-ha.org



Hi guys, we rebooted a standby node of a healthy cluster and suddenly all
the resources on the primary node restarted. What's up with that?
Before rebooting the standby node, we did the normal stuff to verify that
all was well.

crm_mon showed all nodes online, in their expected roles, with correct quorum votes
cat /proc/drbd showed correct DRBD status
corosync-cfgtool -s showed all rings active without faults

When we rebooted the standby node (ha08c), crm_mon on the primary node
(ha08a) showed that all the resources stopped and then restarted,
resulting in a brief loss of availability to customers.

Following is what crm_mon showed before server ha08c was rebooted and
after it came back up. Following that is our crm configuration.


Last updated: Mon May  7 11:13:32 2012
Stack: openais
Current DC: ha08a.mycharts.md - partition with quorum
Version: 1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe
3 Nodes configured, 3 expected votes
4 Resources configured.


Online: [ ha08a.mycharts.md ha08b.mycharts.md ha08c.mycharts.md ]

 Master/Slave Set: ms_drbd0
    Masters: [ ha08a.mycharts.md ]
    Slaves: [ ha08c.mycharts.md ]
 Master/Slave Set: ms_drbd1
    Masters: [ ha08b.mycharts.md ]
    Slaves: [ ha08c.mycharts.md ]
 Resource Group: g_clust06
    p_fs_clust06       (ocf::heartbeat:Filesystem):    Started ha08a.mycharts.md
    p_vip_clust06      (ocf::heartbeat:IPaddr2):       Started ha08a.mycharts.md
    p_mysql_371        (lsb:mysql_371):        Started ha08a.mycharts.md
    p_mysql_372        (lsb:mysql_372):        Started ha08a.mycharts.md
    p_mysql_373        (lsb:mysql_373):        Started ha08a.mycharts.md
    p_mysql_374        (lsb:mysql_374):        Started ha08a.mycharts.md
    p_mysql_375        (lsb:mysql_375):        Started ha08a.mycharts.md
    p_mysql_376        (lsb:mysql_376):        Started ha08a.mycharts.md
    p_mysql_047        (lsb:mysql_047):        Started ha08a.mycharts.md
    p_mysql_100        (lsb:mysql_100):        Started ha08a.mycharts.md
    p_mysql_379        (lsb:mysql_379):        Started ha08a.mycharts.md
    p_mysql_377        (lsb:mysql_377):        Started ha08a.mycharts.md
    p_mysql_378        (lsb:mysql_378):        Started ha08a.mycharts.md
    p_mysql_380        (lsb:mysql_380):        Started ha08a.mycharts.md
    p_mysql_381        (lsb:mysql_381):        Started ha08a.mycharts.md
    p_mysql_382        (lsb:mysql_382):        Started ha08a.mycharts.md
    p_mysql_383        (lsb:mysql_383):        Started ha08a.mycharts.md
    p_mysql_384        (lsb:mysql_384):        Started ha08a.mycharts.md
    p_mysql_385        (lsb:mysql_385):        Started ha08a.mycharts.md
    p_mysql_386        (lsb:mysql_386):        Started ha08a.mycharts.md
    p_mysql_387        (lsb:mysql_387):        Started ha08a.mycharts.md
    p_mysql_002        (lsb:mysql_002):        Started ha08a.mycharts.md
    p_mysql_035        (lsb:mysql_035):        Started ha08a.mycharts.md
    p_mysql_049        (lsb:mysql_049):        Started ha08a.mycharts.md
    p_mysql_097        (lsb:mysql_097):        Started ha08a.mycharts.md
    p_mysql_024        (lsb:mysql_024):        Started ha08a.mycharts.md
    p_mysql_077        (lsb:mysql_077):        Started ha08a.mycharts.md
    p_mysql_084        (lsb:mysql_084):        Started ha08a.mycharts.md
    p_mysql_113        (lsb:mysql_113):        Started ha08a.mycharts.md
    p_mysql_116        (lsb:mysql_116):        Started ha08a.mycharts.md
    p_mysql_388        (lsb:mysql_388):        Started ha08a.mycharts.md
    p_mysql_389        (lsb:mysql_389):        Started ha08a.mycharts.md
    p_mysql_390        (lsb:mysql_390):        Started ha08a.mycharts.md
    p_mysql_391        (lsb:mysql_391):        Started ha08a.mycharts.md
    p_mysql_392        (lsb:mysql_392):        Started ha08a.mycharts.md
    p_mysql_393        (lsb:mysql_393):        Started ha08a.mycharts.md
    p_mysql_394        (lsb:mysql_394):        Started ha08a.mycharts.md
    p_mysql_395        (lsb:mysql_395):        Started ha08a.mycharts.md
    p_mysql_396        (lsb:mysql_396):        Started ha08a.mycharts.md
    p_mysql_397        (lsb:mysql_397):        Started ha08a.mycharts.md
    p_mysql_398        (lsb:mysql_398):        Started ha08a.mycharts.md
    p_mysql_399        (lsb:mysql_399):        Started

[Linux-HA] Heartbeat question about multiple services

2012-05-09 Thread sgm
Hi,
I have a question about heartbeat. If I have three services (apache, mysql and
sendmail) and apache goes down, heartbeat will switch all the services to the
standby server, right?
If so, how do I configure heartbeat to avoid this?
Much appreciated.
gm
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Pacemaker monitor

2012-05-09 Thread dong he
Hi,
  I have recently been clustering OpenSIPS on two Ubuntu machines.
I followed this tutorial step by step:
http://anders.com/cms/259/Linux.Tutorial/OpenSer/Heartbeat.v2.0
Unfortunately, I still ran into many problems.

My configuration files follow.
cib.xml:

<cib validate-with="transitional-0.6" crm_feature_set="3.0.1"
     have-quorum="1" admin_epoch="0" epoch="13" num_updates="0"
     cib-last-written="Wed Apr 25 19:06:24 2012"
     dc-uuid="1ca0c19b-5955-42f9-9131-b200c7c0d8ca">
  <configuration>
    <crm_config>
      <cluster_property_set id="cluster-property-set">
        <attributes>
          <nvpair id="short_resource_names" name="short_resource_names" value="true"/>
          <nvpair id="pe-input-series-max" name="pe-input-series-max" value="-1"/>
          <nvpair id="default-resource-stickiness" name="default-resource-stickiness" value="10"/>
          <nvpair id="default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="-10"/>
          <nvpair id="start-failure-is-fatal" name="start-failure-is-fatal" value="false"/>
        </attributes>
      </cluster_property_set>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1194982799"/>
          <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3"/>
          <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1ca0c19b-5955-42f9-9131-b200c7c0d8ca" uname="opensips1" type="normal"/>
      <node id="e1044f5c-c3a6-44b6-84de-8513b2f7df90" uname="opensips2" type="normal"/>
    </nodes>
    <resources>
      <group id="IPaddr2_OpenSIPS_group">
        <primitive id="IPaddr2-10.120.89.222" class="ocf" type="IPaddr2" provider="heartbeat">
          <operations>
            <op id="ipaddr2-10.120.89.222-monitor" name="monitor" interval="5s" timeout="3s"/>
          </operations>
          <instance_attributes id="IPaddr2-10.120.89.222-attributes">
            <attributes>
              <nvpair id="ipaddr2-10.120.89.222-ip" name="ip" value="10.120.89.222"/>
              <nvpair id="ipaddr2-10.120.89.222-broadcast" name="broadcast" value="10.120.89.255"/>
              <nvpair id="ipaddr2-10.120.89.222-nic" name="nic" value="eth0"/>
              <nvpair id="ipaddr2-10.120.89.222-cidr_netmask" name="cidr_netmask" value="24"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="IPaddr2-192.168.56.199" class="ocf" type="IPaddr2" provider="heartbeat">
          <operations>
            <op id="ipaddr2-1192.168.56.199-monitor" name="monitor" interval="5s" timeout="3s"/>
          </operations>
          <instance_attributes id="IPaddr2-192.168.56.199-attributes">
            <attributes>
              <nvpair id="ipaddr2-192.168.56.199-ip" name="ip" value="192.168.56.199"/>
              <nvpair id="ipaddr2-192.168.56.199-broadcast" name="broadcast" value="192.168.56.255"/>
              <nvpair id="ipaddr2-192.168.56.199-nic" name="nic" value="eth1"/>
              <nvpair id="ipaddr2-192.168.56.199-cidr_netmask" name="cidr_netmask" value="24"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="OpenSIPS" class="ocf" type="OpenSIPS" provider="anders.com">
          <operations>
            <op id="opensips-start" name="start" timeout="20s"/>
            <op id="opensips-stop" name="stop" timeout="3s"/>
            <op id="opensips-monitor" name="monitor" interval="10s" timeout="6s"/>
          </operations>
        </primitive>
      </group>
    </resources>
    <constraints>
      <rsc_location id="OpenSIPS_resource_location" rsc="OpenSIPS">
        <rule id="rule_opensips1" score="100">
          <expression id="expression_uname_eq_opensips1" attribute="#uname" operation="eq" value="opensips1"/>
        </rule>
        <rule id="rule_opensips2" score="10">
          <expression id="expression_uname_eq_opensips2" attribute="#uname" operation="eq" value="opensips2"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>


OpenSIPS script:

#!/bin/sh
# Initialization:
. /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs
usage() {
  cat <<END
usage: $0 {start|stop|status|monitor|meta-data|validate-all}
END
}
meta_data() {
  cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="OpenSIPS">
 <version>1.0</version>
 <longdesc lang="en">
  Resource Agent for the OpenSIPS SIP Proxy.
 </longdesc>
 <shortdesc lang="en">OpenSIPS resource agent</shortdesc>

 <actions>
  <action name="start" timeout="30" />
  <action name="stop" timeout="30" />
  <action name="status" depth="0" timeout="30" interval="10" start-delay="30" />
  <action name="monitor" depth="0" timeout="30" interval="10" start-delay="30" />
  <action name="meta-data" timeout="5" />
  <action name="validate-all" timeout="5" />
  <action name="notify" timeout="5" />
  <action name="promote" timeout="5" />
  <action name="demote" timeout="5" />
 </actions>
</resource-agent>
END
}
OpenSIPS_Status() {


  /root/sipp.svn/sipp 127.0.0.1 -sf /root/sipp.svn/files/REGISTER_client.xml -inf

[Linux-HA] questions about Heartbeat

2012-05-09 Thread Zhimin Wu
Dear Sir/Madam:

I have some questions about Heartbeat 3.0. Could you please tell me how many
nodes Heartbeat 3.0 can support? If I run a multi-process program on
several nodes, can this program be defined as a service monitored by
Heartbeat? And the last question: can I define my own function to handle
node errors and plug it into Heartbeat? If so, how do I do it?

Thank you very much for your time. I am looking forward to your reply.

Zhimin Wu
2012.4.27


[Linux-HA] Linux-HA help needed

2012-05-09 Thread Christopher Davis
Does anyone know how to configure Linux-ha as a load balancer?  I seem
to remember it's possible but can't find any documentation on it.  What
we want is for a user coming into a given VIP to get load balanced to
one of four active ports on the box in a round robin situation.  Thanks
in advance for any help.

 

Regards,

Christopher A. Davis
Global Support Desk
Patsystems (NA) LLC
200 W. Madison Avenue
Suite 1400
Chicago, IL 60606
http://www.patsystems.com/

 



Re: [Linux-HA] Linux-HA help needed

2012-05-09 Thread Michael Schwartzkopff
> Does anyone know how to configure Linux-ha as a load balancer?  I seem
> to remember it's possible but can't find any documentation on it.  What
> we want is for a user coming into a given VIP to get load balanced to
> one of four active ports on the box in a round robin situation.  Thanks
> in advance for any help.
>
> Regards,
>
> Christopher A. Davis

Make the load balancer run as a clone on both nodes under cluster control and
make the VIP dependent on the LVS clone.

I use ldirectord to configure LVS and to monitor the real servers, because
it integrates nicely into the cluster with its own resource agent.

With this setup you can even use the state table sync feature of LVS.
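
The arrangement described above might look roughly like this in crm shell terms. It is a sketch under stated assumptions, not a tested configuration: the resource names (`p_ldirectord`, `cl_ldirectord`, `p_vip`), the address 192.0.2.10, the monitor intervals and the ldirectord.cf path are all placeholders. Setting `CRM=echo` prints the commands instead of running them.

```shell
#!/bin/sh
# Set CRM=echo for a dry run that only prints the crm commands.
CRM=${CRM:-crm}

# Hypothetical names and values throughout; adjust to the real cluster.
configure_lvs() {
    # ldirectord configures LVS and health-checks the real servers.
    $CRM configure primitive p_ldirectord ocf:heartbeat:ldirectord \
        params configfile="/etc/ha.d/ldirectord.cf" \
        op monitor interval="20s" timeout="10s"
    # Run the load balancer on every node.
    $CRM configure clone cl_ldirectord p_ldirectord
    # The virtual IP that clients connect to.
    $CRM configure primitive p_vip ocf:heartbeat:IPaddr2 \
        params ip="192.0.2.10" cidr_netmask="24" \
        op monitor interval="10s"
    # The VIP may only run where a clone instance is active, and after it.
    $CRM configure colocation col_vip_with_lvs inf: p_vip cl_ldirectord
    $CRM configure order ord_lvs_before_vip inf: cl_ldirectord p_vip
}
```

The colocation and order constraints are what make the VIP "dependent on the LVS clone"; the round-robin scheduling itself would be set in the ldirectord configuration file, which is not shown here.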

Greetings,

-- 
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98



Re: [Linux-HA] Heartbeat question about multiple services

2012-05-09 Thread Nikita Michalko
On Friday, 20 April 2012 12:42:16, sgm wrote:
> Hi,
> I have a question about heartbeat. If I have three services (apache, mysql
> and sendmail) and apache goes down, heartbeat will switch all the services
> to the standby server, right?

That depends on the configuration; it is one possible outcome.

> If so, how do I configure heartbeat to avoid this?

You can tie your two services (mysql and sendmail, for example) together
with colocation constraints, or put them in a group; there are many possibilities.
Did you already RTFM (read the f... manuals)?
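
As an illustration of the group option, here is a sketch that assumes a Pacemaker-managed cluster with the crm shell available (it does not apply to a plain haresources setup). The resource names `p_apache`, `p_mysql`, `p_sendmail` and `g_mail`, and the monitor intervals, are invented for this example. Setting `CRM=echo` prints the commands instead of running them.

```shell
#!/bin/sh
# Set CRM=echo for a dry run that only prints the crm commands.
CRM=${CRM:-crm}

# Hypothetical layout: apache stands alone, mysql and sendmail share a group.
configure_services() {
    $CRM configure primitive p_apache ocf:heartbeat:apache \
        op monitor interval="30s"
    $CRM configure primitive p_mysql ocf:heartbeat:mysql \
        op monitor interval="30s"
    $CRM configure primitive p_sendmail lsb:sendmail \
        op monitor interval="30s"
    # Members of a group start in order and always run on the same node.
    $CRM configure group g_mail p_mysql p_sendmail
}
```

With apache left outside the group, no constraint ties it to the other two services, so a failure of apache alone does not by itself force mysql and sendmail to move.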


> Much appreciated.
> gm
 

HTH

Nikita Michalko 


Re: [Linux-HA] weird problem w/ R1

2012-05-09 Thread Dejan Muhamedagic
Hi,

On Wed, May 02, 2012 at 06:25:55PM -0500, Dimitri Maziuk wrote:
> Hi everyone,
>
> I must be overlooking something obvious... I have a simple haresources
> setup with
>
> node drbddisk::sessdata Filesystem::/dev/drbd0::/raid::ext3::rw \
> ip.addr httpd xinetd pure_ftpd pure_uploadscript bacula-client mon
>
> bacula-client is in /etc/ha.d/resource.d, it's a copy of stock
> /etc/init.d/bacula-fd with config, lock, and pid file changed to make it
> listen on a non-standard port: this is for backing up drbd filesystem
> (there's the standard bacula client running also).
>
> bacula-client doesn't start. I added a couple of 'logger' lines and if I
> manually run /etc/ha.d/resource.d/bacula-client start ; echo $? I get
> 0 and the log:
> node logger: starting bacula-fd -c /etc/bacula/deposit-fd.conf
> node logger: bacula-fd -c /etc/bacula/deposit-fd.conf running
>
> Yet on failover I get this:
>
> node ResourceManager[3734]: info: Running /etc/init.d/httpd  start
> node ResourceManager[3734]: info: Running /etc/init.d/xinetd  start
> node ResourceManager[3734]: info: Running /etc/ha.d/resource.d/pure_ftpd  start
> node xinetd[4204]: xinetd Version 2.3.14 started with libwrap loadavg labeled-networking options compiled in.
> node xinetd[4204]: Started working: 1 available service
> node ResourceManager[3734]: info: Running /etc/ha.d/resource.d/pure_uploadscript  start
> node ResourceManager[3734]: info: Running /etc/init.d/mon  start
>
> It doesn't seem to run that particular script: it starts
> pure_uploadscript from resource.d and mon from init.d, but not the one
> in between. What's weird is I now have it happening on 2 clusters:
> centos 5 w/ heartbeat 2.1.4, and centos 6 w/ heartbeat 3.0.4. The only
> common thing is bacula version: 5.
>
> Any ideas?

No, but you can add set -x in some places in ResourceManager and
see what gives.
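
For readers unfamiliar with the technique, a small self-contained sketch of shell tracing follows. The function `start_all` is a made-up stand-in for a loop that starts resources in turn, not code from ResourceManager; only the `set -x` / `set +x` pair is the point.

```shell
#!/bin/sh
# Hypothetical stand-in for a loop that starts each resource in turn.
start_all() {
    set -x                      # print every command to stderr before it runs
    for rsc in "$@"; do
        echo "start $rsc"
    done
    set +x                      # stop tracing
}

# Example: start_all httpd bacula-client mon 2>trace.log
```

A resource that never appears in the trace was skipped before its script was invoked, which narrows the search to the code that builds the resource list.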

Thanks,

Dejan

> TIA
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
 





Re: [Linux-HA] Heartbeat question about multiple services

2012-05-09 Thread RaSca
On Fri 20 Apr 2012 12:42:16 CEST, sgm wrote:
> Hi,
> I have a question about heartbeat. If I have three services (apache, mysql
> and sendmail) and apache goes down, heartbeat will switch all the services
> to the standby server, right?
> If so, how do I configure heartbeat to avoid this?
> Much appreciated.
> gm

You may want to start from here:
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/

-- 
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
ra...@miamammausalinux.org
http://www.miamammausalinux.org


Re: [Linux-HA] Heartbeat question about multiple services

2012-05-09 Thread David Gersic
On 4/20/2012 at 05:42 AM, sgm sgm...@yahoo.com.cn wrote:
> Hi,
> I have a question about heartbeat. If I have three services (apache, mysql
> and sendmail) and apache goes down, heartbeat will switch all the services
> to the standby server, right?

Maybe. It depends on how you have built and configured your cluster.





Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?

2012-05-09 Thread Robinson, Eric
> Hi,
>
> Just an idea, to be verified:
> this usually happens because the services are launched at boot time.
> When the standby node is rebooted, Pacemaker detects that a service is
> running on more than one node, so it stops the resource(s) and starts
> each again on only one node.
> Again, this needs to be checked (chkconfig etc.).
> Alain
 
 


Verified. Services are only started by Pacemaker, nothing starts on boot 
(except Pacemaker and Corosync).


 

Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?

2012-05-09 Thread Andrew Beekhof
On Tue, May 8, 2012 at 4:27 AM, Robinson, Eric eric.robin...@psmnv.com wrote:
> Hi guys, we rebooted a standby node of a healthy cluster and suddenly all the
> resources on the primary node restarted. What's up with that? Before
> rebooting the standby node, we did the normal stuff to verify that all was
> well.
>
> crm_mon showed all nodes online, in their expected roles, with correct quorum votes
> cat /proc/drbd showed correct DRBD status
> corosync-cfgtool -s showed all rings active without faults
>
> When we rebooted the standby node (ha08c), crm_mon on the primary node
> (ha08a) showed that all the resources stopped and then restarted, resulting
> in a brief loss of availability to customers.

Did they actually restart, though?
When a new DC gathers the current state of the cluster, the resources
may appear to restart in crm_mon, but it's often just a display issue.
Granted, it's quite annoying.


> Following is what crm_mon showed before server ha08c was rebooted and after
> it came back up. Following that is our crm configuration.

 
 Last updated: Mon May  7 11:13:32 2012
 Stack: openais
 Current DC: ha08a.mycharts.md - partition with quorum
 Version: 1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe
 3 Nodes configured, 3 expected votes
 4 Resources configured.
 

 Online: [ ha08a.mycharts.md ha08b.mycharts.md ha08c.mycharts.md ]

  Master/Slave Set: ms_drbd0
     Masters: [ ha08a.mycharts.md ]
     Slaves: [ ha08c.mycharts.md ]
  Master/Slave Set: ms_drbd1
     Masters: [ ha08b.mycharts.md ]
     Slaves: [ ha08c.mycharts.md ]
  Resource Group: g_clust06
     p_fs_clust06       (ocf::heartbeat:Filesystem):    Started ha08a.mycharts.md
     p_vip_clust06      (ocf::heartbeat:IPaddr2):       Started ha08a.mycharts.md
     p_mysql_371        (lsb:mysql_371):        Started ha08a.mycharts.md
     p_mysql_372        (lsb:mysql_372):        Started ha08a.mycharts.md
     p_mysql_373        (lsb:mysql_373):        Started ha08a.mycharts.md
     p_mysql_374        (lsb:mysql_374):        Started ha08a.mycharts.md
     p_mysql_375        (lsb:mysql_375):        Started ha08a.mycharts.md
     p_mysql_376        (lsb:mysql_376):        Started ha08a.mycharts.md
     p_mysql_047        (lsb:mysql_047):        Started ha08a.mycharts.md
     p_mysql_100        (lsb:mysql_100):        Started ha08a.mycharts.md
     p_mysql_379        (lsb:mysql_379):        Started ha08a.mycharts.md
     p_mysql_377        (lsb:mysql_377):        Started ha08a.mycharts.md
     p_mysql_378        (lsb:mysql_378):        Started ha08a.mycharts.md
     p_mysql_380        (lsb:mysql_380):        Started ha08a.mycharts.md
     p_mysql_381        (lsb:mysql_381):        Started ha08a.mycharts.md
     p_mysql_382        (lsb:mysql_382):        Started ha08a.mycharts.md
     p_mysql_383        (lsb:mysql_383):        Started ha08a.mycharts.md
     p_mysql_384        (lsb:mysql_384):        Started ha08a.mycharts.md
     p_mysql_385        (lsb:mysql_385):        Started ha08a.mycharts.md
     p_mysql_386        (lsb:mysql_386):        Started ha08a.mycharts.md
     p_mysql_387        (lsb:mysql_387):        Started ha08a.mycharts.md
     p_mysql_002        (lsb:mysql_002):        Started ha08a.mycharts.md
     p_mysql_035        (lsb:mysql_035):        Started ha08a.mycharts.md
     p_mysql_049        (lsb:mysql_049):        Started ha08a.mycharts.md
     p_mysql_097        (lsb:mysql_097):        Started ha08a.mycharts.md
     p_mysql_024        (lsb:mysql_024):        Started ha08a.mycharts.md
     p_mysql_077        (lsb:mysql_077):        Started ha08a.mycharts.md
     p_mysql_084        (lsb:mysql_084):        Started ha08a.mycharts.md
     p_mysql_113        (lsb:mysql_113):        Started ha08a.mycharts.md
     p_mysql_116        (lsb:mysql_116):        Started ha08a.mycharts.md
     p_mysql_388        (lsb:mysql_388):        Started ha08a.mycharts.md
     p_mysql_389        (lsb:mysql_389):        Started ha08a.mycharts.md
     p_mysql_390        (lsb:mysql_390):        Started ha08a.mycharts.md
     p_mysql_391        (lsb:mysql_391):        Started ha08a.mycharts.md
     p_mysql_392        (lsb:mysql_392):        Started ha08a.mycharts.md
     p_mysql_393        (lsb:mysql_393):        Started ha08a.mycharts.md
     p_mysql_394        (lsb:mysql_394):        Started ha08a.mycharts.md
     p_mysql_395        (lsb:mysql_395):        Started ha08a.mycharts.md
     p_mysql_396        (lsb:mysql_396):        Started ha08a.mycharts.md
     p_mysql_397        (lsb:mysql_397):        Started ha08a.mycharts.md
     p_mysql_398        (lsb:mysql_398):        Started ha08a.mycharts.md
     p_mysql_399        (lsb:mysql_399):        Started ha08a.mycharts.md
     p_mysql_400        (lsb:mysql_400):        Started ha08a.mycharts.md
     p_mysql_401        (lsb:mysql_401):        Started ha08a.mycharts.md
     p_mysql_402        (lsb:mysql_402):        Started ha08a.mycharts.md
     p_mysql_403        (lsb:mysql_403):       

Re: [Linux-HA] Pacemaker monitor

2012-05-09 Thread Andrew Beekhof
On Thu, Apr 26, 2012 at 1:17 PM, dong he smiledon...@gmail.com wrote:
 Hi,
      recently I'm clustering the OpenSIPS with two Ubuntu computers.
 I did it step by step and used the tutorial :
 http://anders.com/cms/259/Linux.Tutorial/OpenSer/Heartbeat.v2.0
 But unfortunately I still met so many problems.
