Re: [Linux-HA] Heartbeat question about multiple services
On 4/20/2012 at 05:42 AM, sgm sgm...@yahoo.com.cn wrote:
> Hi, I have a question about heartbeat, if I have three services, apache, mysql and sendmail, if apache is down, heartbeat will switch all the services to the standby server, right?

Maybe. It depends on how you have built and configured your cluster.

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] Active-passive cluster, best practice question
On 2/8/2012 at 01:46 AM, Jonathan Schaeffer jonathan.schaef...@univ-brest.fr wrote:
> I wanted to know if it is good practice (or common enough) to build a filesystem containing configuration data for the clustered services.

Yes. Pretty much all of my clustered resource groups contain a Filesystem resource so that the service and its data can move between cluster nodes.
Re: [Linux-HA] Active-passive cluster, best practice question
On 2/8/2012 at 03:18 AM, Florian Haas flor...@hastexo.com wrote:
> On Wed, Feb 8, 2012 at 8:46 AM, Jonathan Schaeffer jonathan.schaef...@univ-brest.fr wrote:
>> Hi, I'm designing a cluster with N nodes plugged to a SAN device. There will be no shared storage on the cluster.
>
> That seems like a contradiction. Please explain.

I think the OP means that there won't be any service using shared storage (ie: service runs on node1 and node2 simultaneously sharing one database on disk). The SAN, of course, provides shared storage to the nodes. At least that's how I read it.

>> I wanted to know if it is good practice (or common enough) to build a filesystem containing configuration data for the clustered services.
>
> That's entirely possible, and people typically use NFS mounts for that purpose.

ocf:Filesystem resources mount and unmount SAN based file systems nicely. Works for iSCSI too. I haven't used NFS, so far, though I don't see how it would be any different, really, since it's just mount / umount.
Re: [Linux-ha-dev] change of process check in tomcat-ra
A minor suggestion on the tomcat RA. It uses wget -O ... to verify that tomcat is running. If the URL is an https://... type, and wget can't verify the certificate being used by the server, it errors out. Using wget --no-check-certificate -O ... would be better, given that this is only being used to see if the server is responding. ___ Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page: http://linux-ha.org/
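The suggestion above could be sketched as follows. This is only an illustration, not code from the tomcat RA: the function name tomcat_probe_cmd and the URLs are made up, and the helper just prints the wget command a monitor action might run so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper, not part of the tomcat RA: print the wget command
# a monitor action could use to probe a status URL.
tomcat_probe_cmd() {
    url=$1
    case "$url" in
        https://*)
            # Skip certificate validation: the probe only needs to know
            # that the server answers, not that its certificate is trusted.
            echo "wget --no-check-certificate -O /dev/null $url"
            ;;
        *)
            echo "wget -O /dev/null $url"
            ;;
    esac
}
```

Restricting the flag to https:// URLs keeps plain http checks unchanged; passing it unconditionally, as suggested above, would presumably work as well.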
[Linux-HA] RA manipulation of iptables firewall?
I have an application that must simultaneously run as a non-root user and listen on a port below 1024. I can do this, by hand, by making some iptables rules forwarding the traffic from the low port on a public ip address to a high port on a private ip address. Now I'm trying to find a way to run this app in my cluster. So far, I'm not seeing an RA already set up to do this. Before I go make an RA to add / remove iptables rules, has anybody else needed this before and already solved the problem?
Re: [Linux-HA] RA manipulation of iptables firewall?
On 1/27/2012 at 02:37 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
> On 01/27/2012 02:22 PM, David Gersic wrote:
>> I have an application that must simultaneously run as a non-root user and listen on a port below 1024. I can do this, by hand, by making some iptables rules forwarding the traffic from the low port on a public ip address to a high port on a private ip address. Now I'm trying to find a way to run this app in my cluster. So far, I'm not seeing an RA already set up to do this.
>
> Why not make it static?

Yeah, I could, but I didn't want to. I wanted to make it part of the resource group so it'll even be there if I add a new cluster node and move the group to it.
Re: [Linux-HA] RA manipulation of iptables firewall?
On 1/27/2012 at 03:18 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
> On 01/27/2012 02:48 PM, David Gersic wrote:
>> On 1/27/2012 at 02:37 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
>>> Why not make it static?
>>
>> Yeah, I could, but I didn't want to. I wanted to make it part of the resource group so it'll even be there if I add a new cluster node and move the group to it.
>
> Fair enough. Rewriting iptables rules in a script is not something I'd recommend, though.

No guts, no glory? Anyway, there are only two of them, and they're not all that complicated:

    iptables -t nat -A PREROUTING -i eth3 -p tcp --destination 131.156.21.44 --dport 443 -j DNAT --to 10.0.0.1:8443
    iptables -t nat -A PREROUTING -i eth3 -p tcp --destination 131.156.21.44 --dport 80 -j DNAT --to 10.0.0.1:8080

I can use a couple of IPaddr2 RAs to bind 10.0.0.1 and 131.156.21.44 to eth3, so no problems there. Then I just need to add the rules to iptables. On the stop action, deleting the rules shouldn't be any big deal to do with

    iptables -t nat -D PREROUTING

So I guess I'll be writing an RA for this. I'll think some more about it over the weekend, but I'm thinking that the interface (eth3), external ip (131.156.21.44), external port (80), internal ip (10.0.0.1) and internal port (8080) should be the required parameters. The rest can be hard coded in the RA script.

I'm looking at the 'portblock' RA as a possible starting point, though it may be easier to start from scratch. Actions start and stop should be easy enough. Actions status and monitor don't really make any sense, though, so I'm not sure what I'll do with those.
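A rough sketch of how the core of such an RA might look, using the five parameters listed above. Everything here is an assumption for illustration: the function names are invented, this is not the portblock RA, and the monitor idea relies on an iptables build that supports -C (--check). The IPTABLES variable exists only so the command can be swapped out for a dry run.

```shell
#!/bin/sh
# Sketch only: core of a hypothetical DNAT resource agent.
# IPTABLES can be overridden for a dry run; it defaults to the real binary.
IPTABLES=${IPTABLES:-iptables}

# dnat_rule ACTION IFACE EXT_IP EXT_PORT INT_IP INT_PORT
# ACTION is an iptables rule command: -A (append), -D (delete), -C (check).
dnat_rule() {
    action=$1
    iface=$2
    ext_ip=$3
    ext_port=$4
    int_ip=$5
    int_port=$6
    $IPTABLES -t nat "$action" PREROUTING -i "$iface" -p tcp \
        --destination "$ext_ip" --dport "$ext_port" \
        -j DNAT --to "$int_ip:$int_port"
}

# monitor: report whether the rule is currently present.
dnat_monitor() {
    if dnat_rule -C "$@" >/dev/null 2>&1; then
        return 0    # OCF_SUCCESS
    fi
    return 7        # OCF_NOT_RUNNING
}

# start: add the rule unless it is already there, so a repeated start
# does not create duplicates.
dnat_start() {
    dnat_monitor "$@" || dnat_rule -A "$@"
}

# stop: delete the rule if present; stopping twice is not an error.
dnat_stop() {
    dnat_monitor "$@" || return 0
    dnat_rule -D "$@"
}
```

For example, dnat_start eth3 131.156.21.44 80 10.0.0.1 8080 would add the second rule shown above. Checking for the rule with -C is one possible answer to what monitor could do, since it at least notices if somebody flushed the nat table by hand.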
Re: [Linux-HA] a simple question
What heartbeat version are you running?

With HA, there is no master nor slave, there are peers. You may think of them however you like, and may build your preferences in to the configuration, but they are still peer nodes in the HA cluster. You may find more by reading up on the need to establish a quorum of nodes before the resources are brought on-line.

Erik Wienberg 11/28/11 9:09 AM
> Hello, I have a simple heartbeat setup - one master, one slave. Control passes perfectly between the nodes. However, during a recent installation I did the following:
> 1) stopped heartbeat on the backup node - made configuration changes
> 2) stopped heartbeat on the active node - made configuration changes
> 3) started heartbeat on the formerly active node
> The idea was to keep things running here - while changing the configuration. All things that are designed to start through heartbeat - floating ip address, services etc. - did NOT start until I activated the backup node again.
> Is there any way to tell heartbeat: - sorry but the backup node is currently not available (for whatever reason), but please do your best - you are currently on your own ?
> I hope you can help me on this one. Kind regards, Erik
Re: [Linux-ha-dev] Patch to JBoss RA
On 11/3/2011 at 11:20 AM, Dejan Muhamedagic de...@suse.de wrote:
> Hunks 2 and 3 fail, don't know if it's due to space being mangled or the jboss RA version you worked on is old:

I started with the newest JBoss RA I could find, but that was a while ago. Where can I get the current one?

> Also, was there a reason removing \n in the export lines?

To be honest, I don't remember. I think I had problems getting it working with them in. I'll re-check that though.
[Linux-ha-dev] Patch to JBoss RA
I've added an option to the JBoss RA to allow specifying the JVM options. I needed this to be able to increase the memory and stack size from the JVM's defaults.

--- jboss-original 2011-05-02 14:08:37.0 -0500
+++ jboss 2011-05-09 09:47:08.0 -0500
@@ -33,6 +33,7 @@
 # OCF_RESKEY_user - A user name to start a JBoss. Default is root
 # OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
 # OCF_RESKEY_java_home - Home directory of the Java. Default is ${JAVA_HOME}
+# OCF_RESKEY_java_opts - Options for Java.
 # OCF_RESKEY_jboss_home - Home directory of Jboss. Default is None
 # is it possible to devise this string from options? I'm afraid
 # that allowing users to set this could be error prone.
@@ -100,8 +101,9 @@
 $CONSOLE 21
 else
 su - -s /bin/bash $JBOSS_USER \
--c export JAVA_HOME=${JAVA_HOME};\n
-export JBOSS_HOME=${JBOSS_HOME};\n
+-c export JAVA_HOME=${JAVA_HOME};\
+export JAVA_OPTS='${JAVA_OPTS}';\
+export JBOSS_HOME=${JBOSS_HOME};\
 $JBOSS_HOME/bin/run.sh $RUN_OPTS \
 $CONSOLE 21
 fi
@@ -127,8 +129,8 @@
 $CONSOLE 21
 else
 su - -s /bin/bash $JBOSS_USER \
--c export JAVA_HOME=${JAVA_HOME};\n
-export JBOSS_HOME=${JBOSS_HOME};\n
+-c export JAVA_HOME=${JAVA_HOME};\
+export JBOSS_HOME=${JBOSS_HOME};\
 $JBOSS_HOME/bin/shutdown.sh $SHUTDOWN_OPTS -S \
 $CONSOLE 21

@@ -273,6 +275,14 @@
 <content type="string" default=""/>
 </parameter>

+<parameter name="java_opts" unique="0" required="0">
+<longdesc lang="en">
+Java options.
+</longdesc>
+<shortdesc>Java options.</shortdesc>
+<content type="string" default=""/>
+</parameter>
+
 <parameter name="jboss_home" unique="1" required="1">
 <longdesc lang="en">
 Home directory of Jboss.
@@ -336,6 +346,7 @@
 PSTRING=${OCF_RESKEY_pstring-java -Dprogram.name=run.sh}
 RUN_OPTS=${OCF_RESKEY_run_opts--c default -l lpg4j}
 SHUTDOWN_OPTS=${OCF_RESKEY_shutdown_opts--s 127.0.0.1:1099}
+JAVA_OPTS=${OCF_RESKEY_java_opts-}

 # test if these two are set and if directories exist and if the
 # required scripts/binaries exist; use OCF_ERR_INSTALLED
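The last hunk depends on the shell's default-value expansion, ${OCF_RESKEY_java_opts-}, which yields an empty string when the parameter was never configured. A small illustration of that behaviour follows; the function name show_opts and the option values are examples only and are not part of the patch.

```shell
#!/bin/sh
# ${var-default} substitutes the default only when var is unset;
# ${var:-default} would also substitute it when var is set but empty.
show_opts() {
    echo "[${OCF_RESKEY_java_opts-}]"
}

unset OCF_RESKEY_java_opts
show_opts    # parameter not configured: prints []

OCF_RESKEY_java_opts="-Xms512m -Xmx1024m"    # example values only
show_opts    # prints [-Xms512m -Xmx1024m]
```

With an empty default, JAVA_OPTS is simply exported as an empty string when the parameter is not set.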
[Linux-ha-dev] Administrivia
The list info page at (http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev) and the welcome email message both have links to (http://linux-ha.org/HATodo.html). Following this leads to a page that does not contain any actual to do list content. Is there a FAQ for this list? I have a patch for the JBoss RA I'd like to submit, but I'd like to be sure that it's tested and submitted properly.
Re: [Linux-HA] custom jboss init script on pacemaker
Note: I know that I'm following up on an old list message here...

On 11/30/2010 at 04:55 AM, Michael Kromer michael.kro...@millenux.com wrote:
> right, for reference: http://www.linux-ha.org/doc/re-ra-jboss.html

Which is now moved to: http://www.linux-ha.org/doc/man-pages/re-ra-jboss.html

> I just recommend to take a safe look at the timeouts, as 60s could be too short for some larger applications.

Agreed. I've also modified this OCF to include a JAVA_OPTS parameter, to allow passing in parameters for Java itself, not just for JBoss. I'll see if I can merge my changes in to the current version from linux-ha.org. Assuming that I can, who do I then provide them to so that others can benefit?
Re: [Linux-HA] Solution for Auto-Mount/Unmount an Ext3 Filesystem
On 5/29/2010 at 04:54 AM, Mozafar Roshany mzfrosh...@gmail.com wrote:
> I've a mail system with two active/passive nodes using Heartbeat; these two servers use an ext3 partition on SAN storage for mailboxes. I want that partition always be mounted on the active node. I mean when node switch occurs for any reason, the partition be mounted/unmounted on new-active/pre-active server automatically. I saw the Filesystem resource;

Filesystem is the right one. If you're using ext3, you'll also want to look at tune2fs, so that a failover doesn't turn into an extended unplanned outage while ext3 spends time in fsck because a check hasn't been run lately.
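As a rough illustration of the tune2fs point: the mount-count and time-based checks are controlled with the -c and -i options. The helper below is hypothetical and only prints a command for review; the device name is invented. Whether to disable the automatic checks entirely or just raise the limits is a local policy decision, and if they are disabled, fsck should be scheduled by hand during planned downtime.

```shell
#!/bin/sh
# Print (do not run) a tune2fs command that turns off the automatic
# mount-count and time-based fsck for an ext3 device.
fsck_policy_cmd() {
    dev=$1
    # -c 0: disregard the maximum mount count
    # -i 0: disable the time-dependent check
    echo "tune2fs -c 0 -i 0 $dev"
}

fsck_policy_cmd /dev/mapper/mailstore    # hypothetical device name
```

Running tune2fs -l on the device shows the current mount count, maximum mount count and check interval before anything is changed.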
Re: [Linux-HA] Help needed with JBoss (OCF) and Heartbeat2
On 5/11/2010 at 04:36 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:
> It doesn't look like you're missing anything. If the lrmd considers the operation timed out in spite of a different timeout specified for the operation, then there seems to be a bug. Though I think that timeouts did work properly in Heartbeat 2.1.4. Since this is SLES10 and the old version of Heartbeat, you should open a support call with Novell.

Thanks Dejan, I've reconfirmed this problem with heartbeat 2.1.4-0.15.5, and a simplified OCF (Dummy, modified to call sleep 600), just to rule out any problems with the jboss OCF that I was using. So I've opened support request #10620125651 with Novell and will update this thread when I get a resolution.
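For anyone wanting to reproduce this, a slow-starting test agent along these lines should be enough. It is a sketch written for illustration, not the modified Dummy RA actually used: the function names, START_DELAY and STATE_FILE are all assumptions, and a real OCF agent would also need meta-data and validate-all actions.

```shell
#!/bin/sh
# Sketch of a slow-starting test agent. START_DELAY (seconds) and
# STATE_FILE are illustrative knobs, not OCF parameters.
START_DELAY=${START_DELAY:-600}
STATE_FILE=${STATE_FILE:-/tmp/slowdummy.state}

OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# start: pretend to be a service that needs a long time to come up.
slow_start() {
    sleep "$START_DELAY" && touch "$STATE_FILE"
}

slow_stop() {
    rm -f "$STATE_FILE"
}

# monitor: "running" simply means the state file exists.
slow_monitor() {
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

slow_agent() {
    case "$1" in
        start)          slow_start ;;
        stop)           slow_stop ;;
        monitor|status) slow_monitor ;;
        *)              return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
```

If the start operation is given timeout=600s or more and the cluster still calls stop after roughly 20 seconds, the agent itself can be ruled out as the cause.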
Re: [Linux-HA] Help needed with JBoss (OCF) and Heartbeat2
On 5/11/2010 at 04:36 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:
>> Linux sles10-3 2.6.16.60-0.34-default #1 Fri Jan 16 14:59:01 UTC 2009 i686 i686 i386 GNU/Linux
>> heartbeat-2.1.4-0.11
>
> Did you consider upgrading to SLE11?

Sigh. Yes. I'm working on getting there, but first I need to get this particular application off the SLES9/Heartbeat1 cluster that it's running on now.

>> <operations>
>> <op name="monitor" interval="10s" timeout="600s" start_delay="600s" id="317075e9-dabb-4923-9129-be16882f94a4"/>
>> <op name="start" interval="900s" timeout="600s" start_delay="10s" id="ab407ca5-78e4-48e9-bee0-f70f64d011e4"/>
>> <op name="stop" interval="10s" timeout="600s" start_delay="10s" id="33d5a72d-105c-4da0-99cb-b25de520a5ae"/>
>> </operations>
>> </primitive>
>
> The start_delay is not needed and may confuse everybody when debugging the timing issues. If it is needed then the resource agent's start action is broken.

I wasn't originally using start_delay, that was just an attempt to get it to do something other than what it's doing now. I'll take it back out again since it isn't doing anything helpful.

>> Note the times shown for (Start and (Stop. They are 15:17:35 and 15:17:57. Only 22 seconds have elapsed since Start was called, and now Stop is being called. My understanding is that the OCF script is working correctly. Its job is to start the resource, and to wait until it can verify that the resource is running before returning to Heartbeat2. So far so good. The jboss OCF script is doing this, but Heartbeat2 isn't waiting for the script command to start to return. I have not, so far, found any way to influence this. As you can see from the operations block in the CIB, I have been cranking up the values to interval, timeout, and start_delay, for the monitor, start and stop operations. None of these changes seems to have any effect on what Heartbeat2 actually does. If Start hasn't returned successful within about 20 seconds, Heartbeat2 considers it to have timed out and kills it. What am I missing here?
>
> It doesn't look like you're missing anything. If the lrmd considers the operation timed out in spite of a different timeout specified for the operation, then there seems to be a bug. Though I think that timeouts did work properly in Heartbeat 2.1.4. Since this is SLES10 and the old version of Heartbeat, you should open a support call with Novell.

Ok, thanks for the help Dejan. I wasn't sure if I was going crazy or if this is a bug. With confirmation that I'm not going crazy, or at least that this isn't evidence that I'm going crazy, I'll get a somewhat simpler test case put together, check to make sure Novell haven't already fixed this, and I'll get an incident open with them to get it fixed.
[Linux-HA] Help needed with JBoss (OCF) and Heartbeat2
I'm not entirely new to Heartbeat2, but I've run in to something here that I have not been able to figure out.

What I'm trying to do is create a JBoss resource, as part of a resource group (disk, ip, mysql, jboss), for an application. I have the disk, ip, and MySQL resources working, it's just the JBoss resource that's proving to be more difficult than expected. This particular JBoss application takes a while to get fully started, which is where I think I'm running in to trouble. More on that below.

The servers are SLES10, and I'm using Heartbeat2:

Linux sles10-3 2.6.16.60-0.34-default #1 Fri Jan 16 14:59:01 UTC 2009 i686 i686 i386 GNU/Linux
heartbeat-2.1.4-0.11

I've defined my JBoss application resource:

<primitive class="ocf" type="jboss" provider="heartbeat" is_managed="true" id="JBoss_4">
  <instance_attributes id="JBoss_4_instance_attrs">
    <attributes>
      <nvpair name="resource_name" value="IDMProv" id="4609063b-c767-4956-a2f9-f44f46b634a9"/>
      <nvpair name="console" value="/shared/uadisk/rbpm37/jboss.log" id="a78e869b-2b00-474c-987e-5919c1ce80e7"/>
      <nvpair name="shutdown_timeout" value="60" id="16812ae8-4f3c-46df-bdbe-24f7e0fcd557"/>
      <nvpair name="user" value="rbpm" id="f8ccf1ed-c701-4dc0-b8e5-d66c779a8b9f"/>
      <nvpair name="statusurl" value="http://131.156.12.4:8080/IDMProv;" id="11d32111-4e2b-4224-99d2-20af4eb43eb8"/>
      <nvpair name="java_home" value="/usr/java/jre1.6.0_18/" id="2881e29c-4b87-4094-bffc-d0d9e9682e16"/>
      <nvpair name="jboss_home" value="/shared/uadisk/rbpm37/jboss" id="d21e26f2-3e2f-4a98-9ecd-8f934b844434"/>
      <nvpair name="run_opts" value="-c IDMProv -b 0.0.0.0" id="681efa45-d7f3-4ef0-b90d-5d652b6480d6"/>
      <nvpair name="shutdown_opts" value="-S" id="f10343d3-28cc-4497-b234-e4678dda5818"/>
    </attributes>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="10s" timeout="600s" start_delay="600s" id="317075e9-dabb-4923-9129-be16882f94a4"/>
    <op name="start" interval="900s" timeout="600s" start_delay="10s" id="ab407ca5-78e4-48e9-bee0-f70f64d011e4"/>
    <op name="stop" interval="10s" timeout="600s" start_delay="10s" id="33d5a72d-105c-4da0-99cb-b25de520a5ae"/>
  </operations>
</primitive>
This has gone through many iterations over the last few days. This is what's currently in the CIB.

This particular Heartbeat2 version didn't include a JBoss OCF script, but I obtained this one from the list archives:

#!/bin/sh
#
# Description: Manages a Jboss Server as an OCF High-Availability
# resource under Heartbeat/LinuxHA control
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301, USA.
#
# Copyright (c) 2009 Bauer Systems KG / Stefan Schluppeck
#
#
# OCF parameters:
# OCF_RESKEY_resource_name - The name of the resource. Default is ${OCF_RESOURCE_INSTANCE}
# why not let the RA log through lrmd?
# 2009/09/09 Nakahira:
# jboss_console is used to record output of the run.sh.
# The log of Run.sh should not be output to ha-log because it is so annoying.
# OCF_RESKEY_console - A destination of the log of jboss run and shutdown script. Default is /var/log/${OCF_RESKEY_resource_name}.log
# OCF_RESKEY_shutdown_timeout - Time-out at the time of the stop. Default is 5
# OCF_RESKEY_kill_timeout - The re-try number of times awaiting a stop. Default is 10
# OCF_RESKEY_user - A user name to start a JBoss. Default is root
# OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
# OCF_RESKEY_java_home - Home directory of the Java. Default is ${JAVA_HOME}
# OCF_RESKEY_jboss_home - Home directory of Jboss. Default is None
# is it possible to devise this string from options? I'm afraid
# that allowing users to set this could be error prone.
# 2009/09/09 Nakahira:
# It is difficult to set it automatically because jboss_pstring
# greatly depends on the environment. At any rate, system architect
# should note that pstring doesn't influence other processes.
# OCF_RESKEY_pstring - String Jboss will found in procceslist. Default is java -Dprogram.name=run.sh
# OCF_RESKEY_run_opts - Options for jboss to run. Default is -c default -l lpg4j
# OCF_RESKEY_shutdown_opts - Options for jboss to shutdown. Default is -s 127.0.0.1:1099