[Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay
Hi Dejan,

A start delay of the LDAP server currently results in an error, so I am sending a patch that fixes it. I opened a pull request on GitHub: https://github.com/nozawat/resource-agents/commit/bbc0934ea7b906f818e3c7e00bfaaa61ae46bc51

Regards, Tomo -- Tomoya Nozawa ___ Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page: http://linux-ha.org/
Re: [Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay
Hi Tomoya-san,

On Wed, May 09, 2012 at 06:14:34PM +0900, noza...@gmail.com wrote:
> Hi Dejan, a start delay of the LDAP server currently results in an error, so I am sending a patch that fixes it. I opened a pull request on GitHub: https://github.com/nozawat/resource-agents/commit/bbc0934ea7b906f818e3c7e00bfaaa61ae46bc51

OK, though the description doesn't seem to fit the change. Shouldn't it be something like this:

Medium: slapd: always set the exit code correctly in monitor

In case ldapsearch exits with an error code other than 49, the exit code may have been wrong.

Cheers, Dejan

> Regards, Tomo -- Tomoya Nozawa
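Dejan's proposed commit message can be illustrated with a minimal sketch of the monitor's exit-code handling. This is a hedged illustration, not the actual slapd agent code (see the linked commit for that); the function name and constants are made up. The idea: keep ldapsearch's own exit code, and treat only LDAP result 49 (invalid credentials) as "the server answered, so it is running":

```shell
# Hedged sketch, not the real slapd RA: map ldapsearch's exit code to
# an OCF-style monitor result, preserving the code for the non-49 case.
ocf_success=0
ocf_err_generic=1

map_monitor_rc() {
    rc=$1                          # exit code returned by ldapsearch
    case "$rc" in
        0)  return $ocf_success ;;     # search succeeded: slapd is up
        49) return $ocf_success ;;     # bind refused, but slapd answered
        *)  return $ocf_err_generic ;; # anything else is a real failure
    esac
}
```

The bug class Dejan describes is overwriting $? (for example with an intervening test) before it is examined, so any non-49 failure could report the wrong code.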
Re: [Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay
Hi again,

On Wed, May 09, 2012 at 06:14:34PM +0900, noza...@gmail.com wrote:
> Hi Dejan,

Since recently it seems as if all emails to linux-ha-dev are addressed to me. Happily, I'm not the only one around watching, nor should I feel like the only one responsible for the state of affairs in our project. I just thought I should straighten things out before it gets too late ;-)

Cheers, Dejan

P.S. Not to be taken personally!
Re: [Linux-ha-dev] [PATCH] Correction at the time of the slapd start delay
Hi Dejan,

Thank you for the comment. I made the modifications and opened the pull request again.

Regards, Tomo

On Wed, 9 May 2012 11:47:50 +0200 Dejan Muhamedagic de...@suse.de wrote:
> Hi Tomoya-san, [quoted message trimmed; see the reply above]

-- Tomoya Nozawa
Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?
Hi,

Just an idea to be verified: usually this is due to services being launched at boot time. When the standby node is rebooted, Pacemaker detects that the service is running on more than one node, and so stops the resource(s) and starts them again on only one node. Again: to be checked (chkconfig etc.)

Alain

From: Robinson, Eric eric.robin...@psmnv.com
To: linux-ha@lists.linux-ha.org
Date: 07/05/2012 20:27
Subject: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?
Sent by: linux-ha-boun...@lists.linux-ha.org

Hi guys, we rebooted a standby node of a healthy cluster and suddenly all the resources on the primary cluster restarted. What's up with that? Before rebooting the standby node, we did the normal stuff to verify that all was well:
- crm_mon showed all nodes online, in their expected roles, with correct quorum votes
- cat /proc/drbd showed correct drbd status
- corosync-cfgtool -s showed all rings active without faults
When we rebooted the standby node (ha08c), crm_mon on the primary node (ha08a) showed that all the resources stopped and then restarted, resulting in a brief loss of availability to customers. Following is what crm_mon showed before server ha08c was rebooted and after it came back up. Following that is our crm configuration.

Last updated: Mon May 7 11:13:32 2012
Stack: openais
Current DC: ha08a.mycharts.md - partition with quorum
Version: 1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe
3 Nodes configured, 3 expected votes
4 Resources configured.
Online: [ ha08a.mycharts.md ha08b.mycharts.md ha08c.mycharts.md ]

Master/Slave Set: ms_drbd0
    Masters: [ ha08a.mycharts.md ]
    Slaves: [ ha08c.mycharts.md ]
Master/Slave Set: ms_drbd1
    Masters: [ ha08b.mycharts.md ]
    Slaves: [ ha08c.mycharts.md ]
Resource Group: g_clust06
    p_fs_clust06   (ocf::heartbeat:Filesystem): Started ha08a.mycharts.md
    p_vip_clust06  (ocf::heartbeat:IPaddr2):    Started ha08a.mycharts.md
    p_mysql_371    (lsb:mysql_371): Started ha08a.mycharts.md
    p_mysql_372    (lsb:mysql_372): Started ha08a.mycharts.md
    p_mysql_373    (lsb:mysql_373): Started ha08a.mycharts.md
    p_mysql_374    (lsb:mysql_374): Started ha08a.mycharts.md
    p_mysql_375    (lsb:mysql_375): Started ha08a.mycharts.md
    p_mysql_376    (lsb:mysql_376): Started ha08a.mycharts.md
    p_mysql_047    (lsb:mysql_047): Started ha08a.mycharts.md
    p_mysql_100    (lsb:mysql_100): Started ha08a.mycharts.md
    p_mysql_379    (lsb:mysql_379): Started ha08a.mycharts.md
    p_mysql_377    (lsb:mysql_377): Started ha08a.mycharts.md
    p_mysql_378    (lsb:mysql_378): Started ha08a.mycharts.md
    p_mysql_380    (lsb:mysql_380): Started ha08a.mycharts.md
    p_mysql_381    (lsb:mysql_381): Started ha08a.mycharts.md
    p_mysql_382    (lsb:mysql_382): Started ha08a.mycharts.md
    p_mysql_383    (lsb:mysql_383): Started ha08a.mycharts.md
    p_mysql_384    (lsb:mysql_384): Started ha08a.mycharts.md
    p_mysql_385    (lsb:mysql_385): Started ha08a.mycharts.md
    p_mysql_386    (lsb:mysql_386): Started ha08a.mycharts.md
    p_mysql_387    (lsb:mysql_387): Started ha08a.mycharts.md
    p_mysql_002    (lsb:mysql_002): Started ha08a.mycharts.md
    p_mysql_035    (lsb:mysql_035): Started ha08a.mycharts.md
    p_mysql_049    (lsb:mysql_049): Started ha08a.mycharts.md
    p_mysql_097    (lsb:mysql_097): Started ha08a.mycharts.md
    p_mysql_024    (lsb:mysql_024): Started ha08a.mycharts.md
    p_mysql_077    (lsb:mysql_077): Started ha08a.mycharts.md
    p_mysql_084    (lsb:mysql_084): Started ha08a.mycharts.md
    p_mysql_113    (lsb:mysql_113): Started ha08a.mycharts.md
    p_mysql_116    (lsb:mysql_116): Started ha08a.mycharts.md
    p_mysql_388    (lsb:mysql_388): Started ha08a.mycharts.md
    p_mysql_389    (lsb:mysql_389): Started ha08a.mycharts.md
    p_mysql_390    (lsb:mysql_390): Started ha08a.mycharts.md
    p_mysql_391    (lsb:mysql_391): Started ha08a.mycharts.md
    p_mysql_392    (lsb:mysql_392): Started ha08a.mycharts.md
    p_mysql_393    (lsb:mysql_393): Started ha08a.mycharts.md
    p_mysql_394    (lsb:mysql_394): Started ha08a.mycharts.md
    p_mysql_395    (lsb:mysql_395): Started ha08a.mycharts.md
    p_mysql_396    (lsb:mysql_396): Started ha08a.mycharts.md
    p_mysql_397    (lsb:mysql_397): Started ha08a.mycharts.md
    p_mysql_398    (lsb:mysql_398): Started ha08a.mycharts.md
    p_mysql_399    (lsb:mysql_399): Started
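Alain's suggestion above (check that nothing cluster-managed starts at boot) can be verified with chkconfig on CentOS-style systems. This is a hedged sketch; mysql_371 is just one service name taken from the status output, and whether these are the right service names on this cluster is an assumption:

```
# list the boot-time runlevel settings for a cluster-managed service
chkconfig --list mysql_371
# cluster resources must not start at boot; let Pacemaker own them
chkconfig mysql_371 off
# the cluster stack itself, on the other hand, does start at boot
chkconfig corosync on
```

On systems where the resources are LSB init scripts, any of them left "on" in the default runlevels would reproduce exactly the two-nodes-running-one-service situation Alain describes.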
[Linux-HA] Heartbeat question about multiple services
Hi, I have a question about heartbeat. If I have three services, apache, mysql and sendmail, and apache is down, heartbeat will switch all the services to the standby server, right? If so, how do I configure heartbeat to prevent this from happening? Much appreciated. gm ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
[Linux-HA] Pacemaker monitor
Hi, recently I have been clustering OpenSIPS on two Ubuntu machines. I followed this tutorial step by step: http://anders.com/cms/259/Linux.Tutorial/OpenSer/Heartbeat.v2.0 But unfortunately I still ran into many problems. These are my configuration files:

cib.xml:

<cib validate-with="transitional-0.6" crm_feature_set="3.0.1" have-quorum="1" admin_epoch="0" epoch="13" num_updates="0" cib-last-written="Wed Apr 25 19:06:24 2012" dc-uuid="1ca0c19b-5955-42f9-9131-b200c7c0d8ca">
  <configuration>
    <crm_config>
      <cluster_property_set id="cluster-property-set">
        <attributes>
          <nvpair id="short_resource_names" name="short_resource_names" value="true"/>
          <nvpair id="pe-input-series-max" name="pe-input-series-max" value="-1"/>
          <nvpair id="default-resource-stickiness" name="default-resource-stickiness" value="10"/>
          <nvpair id="default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="-10"/>
          <nvpair id="start-failure-is-fatal" name="start-failure-is-fatal" value="false"/>
        </attributes>
      </cluster_property_set>
      <cluster_property_set id="cib-bootstrap-options">
        <attributes>
          <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1194982799"/>
          <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3"/>
          <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="Heartbeat"/>
        </attributes>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="1ca0c19b-5955-42f9-9131-b200c7c0d8ca" uname="opensips1" type="normal"/>
      <node id="e1044f5c-c3a6-44b6-84de-8513b2f7df90" uname="opensips2" type="normal"/>
    </nodes>
    <resources>
      <group id="IPaddr2_OpenSIPS_group">
        <primitive id="IPaddr2-10.120.89.222" class="ocf" type="IPaddr2" provider="heartbeat">
          <operations>
            <op id="ipaddr2-10.120.89.222-monitor" name="monitor" interval="5s" timeout="3s"/>
          </operations>
          <instance_attributes id="IPaddr2-10.120.89.222-attributes">
            <attributes>
              <nvpair id="ipaddr2-10.120.89.222-ip" name="ip" value="10.120.89.222"/>
              <nvpair id="ipaddr2-10.120.89.222-broadcast" name="broadcast" value="10.120.89.255"/>
              <nvpair id="ipaddr2-10.120.89.222-nic" name="nic" value="eth0"/>
              <nvpair id="ipaddr2-10.120.89.222-cidr_netmask" name="cidr_netmask" value="24"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="IPaddr2-192.168.56.199" class="ocf" type="IPaddr2" provider="heartbeat">
          <operations>
            <op id="ipaddr2-1192.168.56.199-monitor" name="monitor" interval="5s" timeout="3s"/>
          </operations>
          <instance_attributes id="IPaddr2-192.168.56.199-attributes">
            <attributes>
              <nvpair id="ipaddr2-192.168.56.199-ip" name="ip" value="192.168.56.199"/>
              <nvpair id="ipaddr2-192.168.56.199-broadcast" name="broadcast" value="192.168.56.255"/>
              <nvpair id="ipaddr2-192.168.56.199-nic" name="nic" value="eth1"/>
              <nvpair id="ipaddr2-192.168.56.199-cidr_netmask" name="cidr_netmask" value="24"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="OpenSIPS" class="ocf" type="OpenSIPS" provider="anders.com">
          <operations>
            <op id="opensips-start" name="start" timeout="20s"/>
            <op id="opensips-stop" name="stop" timeout="3s"/>
            <op id="opensips-monitor" name="monitor" interval="10s" timeout="6s"/>
          </operations>
        </primitive>
      </group>
    </resources>
    <constraints>
      <rsc_location id="OpenSIPS_resource_location" rsc="OpenSIPS">
        <rule id="rule_opensips1" score="100">
          <expression id="expression_uname_eq_opensips1" attribute="#uname" operation="eq" value="opensips1"/>
        </rule>
        <rule id="rule_opensips2" score="10">
          <expression id="expression_uname_eq_opensips2" attribute="#uname" operation="eq" value="opensips2"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>

OpenSIPS script:

#!/bin/sh
# Initialization:
. /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs

usage() {
cat <<END
usage: $0 {start|stop|status|monitor|meta-data|validate-all}
END
}

meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="OpenSIPS">
<version>1.0</version>
<longdesc lang="en">
Resource Agent for the OpenSIPS SIP Proxy.
</longdesc>
<shortdesc lang="en">OpenSIPS resource agent</shortdesc>
<actions>
<action name="start" timeout="30" />
<action name="stop" timeout="30" />
<action name="status" depth="0" timeout="30" interval="10" start-delay="30" />
<action name="monitor" depth="0" timeout="30" interval="10" start-delay="30" />
<action name="meta-data" timeout="5" />
<action name="validate-all" timeout="5" />
<action name="notify" timeout="5" />
<action name="promote" timeout="5" />
<action name="demote" timeout="5" />
</actions>
</resource-agent>
END
}

OpenSIPS_Status() {
/root/sipp.svn/sipp 127.0.0.1 -sf /root/sipp.svn/files/REGISTER_client.xml -inf
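The script is cut off in the middle of OpenSIPS_Status, which appears to probe the proxy with SIPp. For comparison, a simpler self-contained monitor sketch is shown below; it is a hedged illustration, and the OCF_RESKEY_pidfile parameter and the default pidfile path are assumptions, not taken from the tutorial's agent:

```shell
# Hedged sketch of a pidfile-based monitor for an OCF-style agent.
# OCF_RESKEY_pidfile and the default path are illustrative assumptions.
OpenSIPS_Monitor() {
    pidfile="${OCF_RESKEY_pidfile:-/var/run/opensips/opensips.pid}"
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return 0    # OCF_SUCCESS: the process exists
    fi
    return 7        # OCF_NOT_RUNNING
}
```

A pidfile check only proves the daemon is alive; the tutorial's SIPp-based probe additionally verifies that the proxy answers SIP traffic, which is why a monitor may combine both.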
[Linux-HA] questions about Heartbeat
Dear Sir/Madam: I have questions about Heartbeat 3.0. Could you please tell me how many nodes Heartbeat 3.0 can support? If I run a multi-process program on several nodes, can this program be declared as a service monitored by Heartbeat? And the last question: can I define my own function to handle node errors and plug it into Heartbeat? If so, how do I do it? Thank you very much for your time. I am looking forward to your reply. Zhimin Wu 2012.4.27
[Linux-HA] Linux-HA help needed
Does anyone know how to configure Linux-HA as a load balancer? I seem to remember it's possible but can't find any documentation on it. What we want is for a user coming into a given VIP to be load-balanced to one of four active ports on the box in a round-robin fashion. Thanks in advance for any help. Regards, Christopher A. Davis Global Support Desk Patsystems (NA) LLC 200 W. Madison Avenue Suite 1400 Chicago, IL 60606 www.patsystems.com
Re: [Linux-HA] Linux-HA help needed
> Does anyone know how to configure Linux-HA as a load balancer? I seem to remember it's possible but can't find any documentation on it. What we want is for a user coming into a given VIP to get load balanced to one of four active ports on the box in a round robin situation. Thanks in advance for any help. Regards, Christopher A. Davis

Make the load balancer run as a clone on both nodes under cluster control and make the VIP dependent on the LVS clone. I use ldirectord for configuring LVS and for monitoring the real servers, because it integrates nicely into the cluster with its own resource agent. With this setup you can even use the state-table sync feature of LVS.

Greetings, -- Dr. Michael Schwartzkopff Guardinistr. 63 81375 München Tel: (0163) 172 50 98
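Michael's suggestion could look roughly like this in crm shell syntax. This is a hedged sketch under assumptions: the resource names, the example IP, and the ldirectord config file path are made up, not from the thread:

```
# ldirectord runs on every node as a clone; the VIP only runs where a
# healthy ldirectord instance is active
primitive p_ldirectord ocf:heartbeat:ldirectord \
    params configfile="/etc/ha.d/ldirectord.cf" \
    op monitor interval="20s"
clone cl_ldirectord p_ldirectord
primitive p_vip ocf:heartbeat:IPaddr2 \
    params ip="192.0.2.10" cidr_netmask="24" \
    op monitor interval="10s"
colocation col_vip_with_lvs inf: p_vip cl_ldirectord
order ord_lvs_before_vip inf: cl_ldirectord p_vip
```

The round-robin distribution itself is configured in ldirectord.cf (scheduler rr), while the cluster only ensures the balancer and VIP are placed together.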
Re: [Linux-HA] Heartbeat question about multiple services
On Friday, 20 April 2012 12:42:16, sgm wrote:
> Hi, I have a question about heartbeat, if I have three services, apache, mysql and sendmail, if apache is down, heartbeat will switch all the services to the standby server, right?

That depends on the configuration - it is certainly possible ...

> If so, how to configure heartbeat to avoid this happen?

You can configure your other services (mysql and sendmail, for example) with colocation constraints, or as a group - there are many possibilities. Did you already RTFM (read the f... manuals)?

HTH Nikita Michalko
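The group-versus-constraints distinction above can be sketched in crm shell syntax. This is a hedged example with made-up resource names; the agents chosen (ocf:heartbeat:apache, ocf:heartbeat:mysql, lsb:sendmail) are illustrative assumptions:

```
# independent primitives: if apache fails, only apache moves
primitive p_apache ocf:heartbeat:apache op monitor interval="30s"
primitive p_mysql ocf:heartbeat:mysql op monitor interval="30s"
primitive p_sendmail lsb:sendmail op monitor interval="30s"

# alternatively, a group keeps them together - then a member failure
# can pull the whole group to the standby node:
# group g_all p_apache p_mysql p_sendmail
```

Whether to group depends on whether the services share state (a filesystem, an IP); unrelated services are usually left as independent resources precisely to avoid the all-services failover the original poster describes.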
Re: [Linux-HA] weird problem w/ R1
Hi,

On Wed, May 02, 2012 at 06:25:55PM -0500, Dimitri Maziuk wrote:
> Hi everyone, I must be overlooking something obvious... I have a simple haresources setup with
>
>     node drbddisk::sessdata Filesystem::/dev/drbd0::/raid::ext3::rw \
>         ip.addr httpd xinetd pure_ftpd pure_uploadscript bacula-client mon
>
> bacula-client is in /etc/ha.d/resource.d; it's a copy of the stock /etc/init.d/bacula-fd with the config, lock, and pid file changed to make it listen on a non-standard port. This is for backing up the drbd filesystem (the standard bacula client is also running). bacula-client doesn't start. I added a couple of 'logger' lines, and if I manually run
>
>     /etc/ha.d/resource.d/bacula-client start ; echo $?
>
> I get 0 and the log:
>
>     node logger: starting bacula-fd -c /etc/bacula/deposit-fd.conf
>     node logger: bacula-fd -c /etc/bacula/deposit-fd.conf running
>
> Yet on failover I get this:
>
>     node ResourceManager[3734]: info: Running /etc/init.d/httpd start
>     node ResourceManager[3734]: info: Running /etc/init.d/xinetd start
>     node ResourceManager[3734]: info: Running /etc/ha.d/resource.d/pure_ftpd start
>     node xinetd[4204]: xinetd Version 2.3.14 started with libwrap loadavg labeled-networking options compiled in.
>     node xinetd[4204]: Started working: 1 available service
>     node ResourceManager[3734]: info: Running /etc/ha.d/resource.d/pure_uploadscript start
>     node ResourceManager[3734]: info: Running /etc/init.d/mon start
>
> It doesn't seem to run that particular script: it starts pure_uploadscript from resource.d and mon from init.d, but not the one in between. What's weird is that I now have it happening on two clusters: CentOS 5 with heartbeat 2.1.4, and CentOS 6 with heartbeat 3.0.4. The only common thing is the bacula version: 5. Any ideas?

No, but you can add set -x in some places in ResourceManager and see what gives.
Thanks,
Dejan

> TIA -- Dimitri Maziuk Programmer/sysadmin BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
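Dejan's set -x suggestion in a nutshell: bracket the suspicious section of ResourceManager (or the resource script) with set -x / set +x, and every command it runs is echoed to stderr with a '+' prefix, which ends up in heartbeat's log. A tiny self-contained illustration (the function name and message are made up for the demo):

```shell
# Demonstrates the xtrace mechanism: commands between `set -x` and
# `set +x` are echoed to stderr, while normal output still goes to stdout.
traced_start() {
    set -x
    msg="starting bacula-fd"   # this assignment shows up in the trace
    set +x
    echo "$msg"
}

out=$(traced_start 2>/dev/null)   # discard the trace here, keep stdout
```

On the failover run, leaving stderr alone (no 2>/dev/null) would show exactly which test or command in ResourceManager skips the bacula-client script.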
Re: [Linux-HA] Heartbeat question about multiple services
On Fri, 20 Apr 2012 12:42:16 CEST, sgm wrote:
> Hi, I have a question about heartbeat, if I have three services, apache, mysql and sendmail, if apache is down, heartbeat will switch all the services to the standby server, right? If so, how to configure heartbeat to avoid this happen?

You may want to start from here: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/

-- RaSca Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well! ra...@miamammausalinux.org http://www.miamammausalinux.org
Re: [Linux-HA] Heartbeat question about multiple services
On 4/20/2012 at 05:42 AM, sgm sgm...@yahoo.com.cn wrote:
> Hi, I have a question about heartbeat, if I have three services, apache, mysql and sendmail, if apache is down, heartbeat will switch all the services to the standby server, right?

Maybe. It depends on how you have built and configured your cluster.
Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?
> Hi, Just an idea to be verified: usually this is due to services being launched at boot time, so when the standby node is rebooted, Pacemaker detects that the service is running on more than one node, and so stops the resource(s) and starts them again on only one node. Again: to be checked (chkconfig etc.) Alain

Verified. Services are only started by Pacemaker; nothing starts at boot (except Pacemaker and Corosync).

[remainder of the quoted original message trimmed; see the original post above]
Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?
On Tue, May 8, 2012 at 4:27 AM, Robinson, Eric eric.robin...@psmnv.com wrote:
> Hi guys, we rebooted a standby node of a healthy cluster and suddenly all the resources on the primary cluster restarted. What's up with that? [...] When we rebooted the standby node (ha08c), crm_mon on the primary node (ha08a) showed that all the resources stopped and then restarted, resulting in brief loss of availability to customers.

Did they actually restart though? When a new DC gathers the current state of the cluster, the resources may appear to restart in crm_mon, but it's often just a display issue. Granted, it's quite annoying.

> Following is what crm_mon showed before server ha08c was rebooted and after it came back up. Following that is our crm configuration. [quoted crm_mon output trimmed; see the original post above]
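One hedged way to tell a real restart from the display glitch Andrew mentions is to look for stop/start operations in the node's logs. This is an assumption about log location and message wording (it varies by distribution and Pacemaker version), not something from the thread:

```
# on ha08a: real restarts leave stop/start operations for the resources
# in the lrmd/crmd log lines; a pure crm_mon display glitch does not
grep -E 'lrmd.*(p_mysql|p_fs_clust06|p_vip_clust06).*(start|stop)' /var/log/messages
```

If the grep comes back empty for the reboot window, the resources never actually stopped and only the crm_mon display flickered while the new DC resynced.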
Re: [Linux-HA] Pacemaker monitor
On Thu, Apr 26, 2012 at 1:17 PM, dong he smiledon...@gmail.com wrote:
> Hi, recently I'm clustering the OpenSIPS with two Ubuntu computers. [quoted configuration and resource agent script trimmed; identical to the original post above]