Re: [Linux-HA] Heartbeat question about multiple services

2012-05-09 Thread David Gersic
On 4/20/2012 at 05:42 AM, sgm <sgm...@yahoo.com.cn> wrote:
> Hi,
> I have a question about heartbeat, if I have three services, apache, mysql
> and sendmail, if apache is down, heartbeat will switch all the services to the
> standby server, right?

Maybe. It depends on how you have built and configured your cluster.



___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Active-passive cluster, best practice question

2012-02-08 Thread David Gersic
On 2/8/2012 at 01:46 AM, Jonathan Schaeffer <jonathan.schaef...@univ-brest.fr> wrote:
> I wanted to know if it is good practice (or common enough) to build a
> filesystem containing configuration data for the clustered services.

Yes. Pretty much all of my clustered resource groups contain a Filesystem 
resource so that the service and its data can move between cluster nodes.
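
As a rough sketch of that layout, the script below prints a crm shell configuration for a group whose first member is a Filesystem resource, followed by the service that needs it. It assumes a Pacemaker cluster managed with the crm shell. The resource names, device path, mount point, monitor timings and the lsb:myservice service are placeholders, not anything taken from this thread.

```shell
#!/bin/sh
# Sketch only: emit crm shell configuration for a resource group whose first
# member is a Filesystem resource. All names and paths are hypothetical.
make_group_config() {
    device=$1      # block device on the SAN, e.g. /dev/mapper/san-lun1
    mountpoint=$2  # where the file system is mounted, e.g. /shared/app
    fstype=$3      # file system type, e.g. ext3
    cat <<EOF
primitive fs_app ocf:heartbeat:Filesystem \\
    params device="$device" directory="$mountpoint" fstype="$fstype" \\
    op monitor interval="20s" timeout="40s"
primitive svc_app lsb:myservice
group grp_app fs_app svc_app
EOF
}

make_group_config /dev/mapper/san-lun1 /shared/app ext3
```

Group members start in the order listed and stop in reverse, so the file system is mounted before the service starts and unmounted after it stops.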






Re: [Linux-HA] Active-passive cluster, best practice question

2012-02-08 Thread David Gersic
On 2/8/2012 at 03:18 AM, Florian Haas <flor...@hastexo.com> wrote:
> On Wed, Feb 8, 2012 at 8:46 AM, Jonathan Schaeffer
> <jonathan.schaef...@univ-brest.fr> wrote:
>> Hi,
>>
>> I'm designing a cluster with N nodes plugged to a SAN device.
>>
>> There will be no shared storage on the cluster.
>
> That seems like a contradiction. Please explain.

I think the OP means that no service will use shared storage concurrently
(i.e., a service running on node1 and node2 simultaneously, sharing one
database on disk). The SAN, of course, provides shared storage to the nodes.
At least that's how I read it.


>> I wanted to know if it is good practice (or common enough) to build a
>> filesystem containing configuration data for the clustered services.
>
> That's entirely possible, and people typically use NFS mounts for that
> purpose.

ocf:Filesystem resources mount and unmount SAN based file systems nicely. Works 
for iSCSI too. I haven't used NFS, so far, though I don't see how it would be 
any different, really, since it's just mount / umount.






Re: [Linux-ha-dev] change of process check in tomcat-ra

2012-01-31 Thread David Gersic
A minor suggestion on the tomcat RA. It uses wget -O ... to verify that 
tomcat is running. If the URL is an https://... type, and wget can't verify the 
certificate being used by the server, it errors out. Using wget 
--no-check-certificate -O ... would be better, given that this is only being 
used to see if the server is responding.
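
A minimal sketch of such a check follows. It is not the code from the tomcat RA; the function name, the WGET override variable and the timeout values are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: probe a status URL the way a monitor action might.
# WGET can be overridden, for example to stub it out when testing.
WGET=${WGET:-wget}

check_statusurl() {
    url=$1
    # --no-check-certificate: skip server certificate validation, since the
    #   goal is only to learn whether the server answers at all.
    # -q: quiet; -O /dev/null: discard the response body;
    # -T 10: 10 second timeout; -t 1: a single attempt.
    if $WGET --no-check-certificate -q -T 10 -t 1 -O /dev/null "$url"; then
        echo "running"
        return 0
    fi
    echo "not running"
    return 1
}
```

For example, check_statusurl https://127.0.0.1:8443/ would report "running" even when the server presents a self-signed certificate.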




___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-HA] RA manipulation of iptables firewall?

2012-01-27 Thread David Gersic
I have an application that must simultaneously run as a non-root user and 
listen on a port below 1024. I can do this, by hand, by making some iptables 
rules forwarding the traffic from the low port on a public ip address to a high 
port on a private ip address. Now I'm trying to find a way to run this app in 
my cluster. So far, I'm not seeing an RA already set up to do this.

Before I go make an RA to add / remove iptables rules, has anybody else needed 
this before and already solved the problem?





Re: [Linux-HA] RA manipulation of iptables firewall?

2012-01-27 Thread David Gersic
On 1/27/2012 at 02:37 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
> On 01/27/2012 02:22 PM, David Gersic wrote:
>> I have an application that must simultaneously run as a non-root user
>> and listen on a port below 1024. I can do this, by hand, by making some
>> iptables rules forwarding the traffic from the low port on a public ip
>> address to a high port on a private ip address. Now I'm trying to find a
>> way to run this app in my cluster. So far, I'm not seeing an RA already
>> set up to do this.
>
> Why not make it static?

Yeah, I could, but I didn't want to. I wanted to make it part of the resource 
group so it'll even be there if I add a new cluster node and move the group to 
it.





Re: [Linux-HA] RA manipulation of iptables firewall?

2012-01-27 Thread David Gersic
On 1/27/2012 at 03:18 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
> On 01/27/2012 02:48 PM, David Gersic wrote:
>> On 1/27/2012 at 02:37 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
>>>
>>> Why not make it static?
>>
>> Yeah, I could, but I didn't want to. I wanted to make it part of the
>> resource group so it'll even be there if I add a new cluster node and
>> move the group to it.
>
> Fair enough. Rewriting iptables rules in a script is not something I'd
> recommend, though.

No guts, no glory?

Anyway, there are only two of them, and they're not all that complicated:

iptables -t nat -A PREROUTING -i eth3 -p tcp --destination 131.156.21.44 
--dport 443 -j DNAT --to 10.0.0.1:8443

iptables -t nat -A PREROUTING -i eth3 -p tcp --destination 131.156.21.44 
--dport 80 -j DNAT --to 10.0.0.1:8080

I can use a couple of IPaddr2 RAs to bind 10.0.0.1 and 131.156.21.44 to eth3, 
so no problems there. Then I just need to add the rules to iptables. On the 
stop action, deleting the rules shouldn't be any big deal to do with iptables 
-t nat -D PREROUTING 

So I guess I'll be writing an RA for this. I'll think some more about it over 
the weekend, but I'm thinking that the interface (eth3), external ip 
(131.156.21.44), external port (80), internal ip (10.0.0.1) and internal port 
(8080) should be the required parameters. The rest can be hard coded in the RA 
script. I'm looking at the 'portblock' RA as a possible starting point, though 
it may be easier to start from scratch.

Actions start and stop should be easy enough. Actions status and monitor don't 
really make any sense, though, so I'm not sure what I'll do with those.
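
To make the idea concrete, here is a rough sketch of the core of such an RA for a single DNAT rule. It is not a complete OCF agent (no meta-data or validation), and the OCF_RESKEY_* parameter names are hypothetical. The defaults are the example values above, used only so that the sketch runs standalone. One possible way to give monitor a meaning is to ask iptables whether the rule is present; this assumes an iptables version that supports the -C (check) option.

```shell
#!/bin/sh
# Sketch only: start/stop/monitor for one DNAT rule. Parameter names are
# hypothetical; defaults are the example values from the message above.
IPTABLES=${IPTABLES:-iptables}
IFACE=${OCF_RESKEY_interface:-eth3}
EXT_IP=${OCF_RESKEY_external_ip:-131.156.21.44}
EXT_PORT=${OCF_RESKEY_external_port:-80}
INT_IP=${OCF_RESKEY_internal_ip:-10.0.0.1}
INT_PORT=${OCF_RESKEY_internal_port:-8080}

# Apply one action (-A append, -D delete, -C check) to the rule.
# --to-destination is the documented long form of the DNAT option.
dnat_rule() {
    $IPTABLES -t nat "$1" PREROUTING -i "$IFACE" -p tcp \
        --destination "$EXT_IP" --dport "$EXT_PORT" \
        -j DNAT --to-destination "$INT_IP:$INT_PORT"
}

dnat_monitor() {
    # 0 = OCF_SUCCESS (rule present), 7 = OCF_NOT_RUNNING (rule absent)
    if dnat_rule -C 2>/dev/null; then return 0; else return 7; fi
}

dnat_start() {
    dnat_monitor && return 0   # already present: do not add a duplicate
    dnat_rule -A
}

dnat_stop() {
    dnat_monitor || return 0   # already absent: stop must still succeed
    dnat_rule -D
}

case "${1:-}" in
    start)          dnat_start ;;
    stop)           dnat_stop ;;
    monitor|status) dnat_monitor ;;
esac
```

Checking before adding keeps start idempotent, and checking before deleting lets stop succeed on a node where the rule was never installed.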





Re: [Linux-HA] a simple question

2011-11-28 Thread David Gersic
What heartbeat version are you running?

With HA, there is no master or slave; there are peers. You may think of
them however you like, and may build your preferences into the configuration,
but they are still peer nodes in the HA cluster. You may find out more by
reading up on the need to establish a quorum of nodes before the resources are
brought on-line.


 Erik Wienberg  11/28/11 9:09 AM 
Hello,
I have a simple heartbeat setup - one master, one slave. Control passes 
perfectly between the nodes.

However, during a recent installation I did the following:

1) stopped heartbeat on the backup node - made configuration changes
2) stopped heartbeat on the active node - made configuration changes
3) started heartbeat on the formerly active node
The idea was to keep things running here - while changing the configuration.

All things that are designed to start through heartbeat - floating ip 
address, services etc. - did NOT start until I activated the backup node 
again.

Is there any way to tell heartbeat:

- sorry but the backup node is currently not available (for whatever 
reason), but please do your best - you are currently on your own ?

I hope you can help me on this one.

Kind regards,

Erik


Re: [Linux-ha-dev] Patch to JBoss RA

2011-11-04 Thread David Gersic
On 11/3/2011 at 11:20 AM, Dejan Muhamedagic <de...@suse.de> wrote:

> Hunks 2 and 3 fail, don't know if it's due to space being
> mangled or the jboss RA version you worked on is old:

I started with the newest JBoss RA I could find, but that was a while ago. 
Where can I get the current one?


> Also, was there a reason removing \n in the export lines?

To be honest, I don't remember. I think I had problems getting it working with 
them in. I'll re-check that though.






[Linux-ha-dev] Patch to JBoss RA

2011-11-02 Thread David Gersic
I've added an option to the JBoss RA to allow specifying the JVM options.  I 
needed this to be able to increase the memory and stack size from the JVM's 
defaults.


--- jboss-original  2011-05-02 14:08:37.0 -0500
+++ jboss   2011-05-09 09:47:08.0 -0500
@@ -33,6 +33,7 @@
 #   OCF_RESKEY_user - A user name to start a JBoss. Default is root
 #   OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
 #   OCF_RESKEY_java_home - Home directory of the Java. Default is ${JAVA_HOME}
+#   OCF_RESKEY_java_opts - Options for Java.
 #   OCF_RESKEY_jboss_home - Home directory of Jboss. Default is None
 # is it possible to devise this string from options? I'm afraid
 # that allowing users to set this could be error prone.
@@ -100,8 +101,9 @@
  $CONSOLE 21 
 else
 su - -s /bin/bash $JBOSS_USER \
--c export JAVA_HOME=${JAVA_HOME};\n
-export JBOSS_HOME=${JBOSS_HOME};\n
+-c export JAVA_HOME=${JAVA_HOME};\
+export JAVA_OPTS='${JAVA_OPTS}';\
+export JBOSS_HOME=${JBOSS_HOME};\
 $JBOSS_HOME/bin/run.sh $RUN_OPTS \
  $CONSOLE 21 
 fi
@@ -127,8 +129,8 @@
  $CONSOLE 21 
 else
 su - -s /bin/bash $JBOSS_USER \
--c export JAVA_HOME=${JAVA_HOME};\n
-export JBOSS_HOME=${JBOSS_HOME};\n
+-c export JAVA_HOME=${JAVA_HOME};\
+export JBOSS_HOME=${JBOSS_HOME};\
 $JBOSS_HOME/bin/shutdown.sh $SHUTDOWN_OPTS -S \
  $CONSOLE 21 
 
@@ -273,6 +275,14 @@
 content type=string default=/
 /parameter
 
+parameter name=java_opts unique=0 required=0
+longdesc lang=en
+Java options.
+/longdesc
+shortdescJava options./shortdesc
+content type=string default=/
+/parameter
+
 parameter name=jboss_home unique=1 required=1
 longdesc lang=en
 Home directory of Jboss.
@@ -336,6 +346,7 @@
 PSTRING=${OCF_RESKEY_pstring-java -Dprogram.name=run.sh}
 RUN_OPTS=${OCF_RESKEY_run_opts--c default -l lpg4j}
 SHUTDOWN_OPTS=${OCF_RESKEY_shutdown_opts--s 127.0.0.1:1099}
+JAVA_OPTS=${OCF_RESKEY_java_opts-}
 
 # test if these two are set and if directories exist and if the
 # required scripts/binaries exist; use OCF_ERR_INSTALLED






[Linux-ha-dev] Administrivia

2011-10-31 Thread David Gersic
The list info page at (http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev) 
and the welcome email message both have links to 
(http://linux-ha.org/HATodo.html). Following this leads to a page that does not 
contain any actual to do list content.

Is there a FAQ for this list? I have a patch for the JBoss RA I'd like to 
submit, but I'd like to be sure that it's tested and submitted properly.





Re: [Linux-HA] custom jboss init script on pacemaker

2011-08-31 Thread David Gersic
Note: I know that I'm following up on an old list message here...


On 11/30/2010 at 04:55 AM, Michael Kromer <michael.kro...@millenux.com> wrote:

> right, for reference:
>
> http://www.linux-ha.org/doc/re-ra-jboss.html

Which is now moved to:

http://www.linux-ha.org/doc/man-pages/re-ra-jboss.html


> I just recommend to take a safe look at the timeouts, as 60s could be
> too short for some larger applications.

Agreed. I've also modified this OCF to include a JAVA_OPTS parameter, to allow 
passing in parameters for Java itself, not just for JBoss. I'll see if I can 
merge my changes in to the current version from linux-ha.org. Assuming that I 
can, who do I then provide them to so that others can benefit?






Re: [Linux-HA] Solution for Auto-Mount/Unmount an Ext3 Filesystem

2010-05-29 Thread David Gersic
On 5/29/2010 at 04:54 AM, Mozafar Roshany <mzfrosh...@gmail.com> wrote:
> I've a mail system with two active/passive nodes using Heartbeat; these two
> servers use an ext3 partition on SAN storage for mailboxes. I want that
> partition always be mounted on the active node. I mean when node switch
> occurs for any reason, the partition be mounted/unmounted on
> new-active/pre-active server automatically. I saw the Filesystem resource;

Filesystem is the right one. If you're using ext3, you'll also want to look at
tune2fs, so that you don't have an extended unplanned outage while ext3 runs a
lengthy fsck at mount time because the file system hasn't been checked lately.
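
One way to do that, sketched below, is to switch off the mount-count and time-based checks altogether. The function name, the TUNE2FS override variable and the device path are placeholders. If the automatic checks are disabled, it would be sensible to run fsck by hand during planned maintenance instead.

```shell
#!/bin/sh
# Sketch only: turn off the periodic (mount-count and time-based) fsck on an
# ext2/ext3 file system so that a failover mount is not delayed by a check.
# TUNE2FS can be overridden, for example to stub it out when testing.
TUNE2FS=${TUNE2FS:-tune2fs}

disable_periodic_fsck() {
    dev=$1   # block device holding the file system, e.g. /dev/sdb1
    # -c 0: never force a check based on the number of mounts
    # -i 0: never force a check based on elapsed time
    $TUNE2FS -c 0 -i 0 "$dev"
}
```

Running tune2fs -l on the device before and after shows the "Maximum mount count" and "Check interval" fields that these options change.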





Re: [Linux-HA] Help needed with JBoss (OCF) and Heartbeat2

2010-05-14 Thread David Gersic
On 5/11/2010 at 04:36 AM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> It doesn't look like you're missing anything. If the lrmd
> considers the operation timed out in spite of a different timeout
> specified for the operation, then there seems to be a bug. Though
> I think that timeouts did work properly in Heartbeat 2.1.4. Since
> this is SLES10 and the old version of Heartbeat, you should open
> a support call with Novell.

Thanks Dejan,

I've reconfirmed this problem with heartbeat 2.1.4-0.15.5, and a simplified OCF 
(Dummy, modified to call sleep 600), just to rule out any problems with the 
jboss OCF that I was using. So I've opened support request #10620125651 with
Novell and will update this thread when I get a resolution.
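
For anyone who wants to reproduce this, a stripped-down stand-in for a slow-starting resource might look like the sketch below. It is not the modified Dummy agent used for the test above, and it leaves out the meta-data action and the OCF shell function includes a real agent needs. The state file path and the START_DELAY variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: a minimal slow-starting resource for testing start timeouts.
STATE_FILE=${STATE_FILE:-/tmp/slowdummy.state}
START_DELAY=${START_DELAY:-600}

slow_start() {
    sleep "$START_DELAY"      # simulate a long application start-up
    touch "$STATE_FILE"
}

slow_stop() {
    rm -f "$STATE_FILE"
}

slow_monitor() {
    # 0 = OCF_SUCCESS (running), 7 = OCF_NOT_RUNNING
    [ -f "$STATE_FILE" ] && return 0
    return 7
}

case "${1:-}" in
    start)   slow_start ;;
    stop)    slow_stop ;;
    monitor) slow_monitor ;;
esac
```

If the cluster honours a 600s start timeout, the start action should be allowed to run to completion rather than being killed after about 20 seconds.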






Re: [Linux-HA] Help needed with JBoss (OCF) and Heartbeat2

2010-05-11 Thread David Gersic
On 5/11/2010 at 04:36 AM, Dejan Muhamedagic <deja...@fastmail.fm> wrote:

>> Linux sles10-3 2.6.16.60-0.34-default #1 Fri Jan 16 14:59:01 UTC 2009 i686 i686 i386 GNU/Linux
>>
>> heartbeat-2.1.4-0.11
>
> Did you consider upgrading to SLE11?

Sigh. Yes. I'm working on getting there, but first I need to get this 
particular application off the SLES9/Heartbeat1 cluster that it's running on 
now.


>>  <operations>
>>   <op name="monitor" interval="10s" timeout="600s" start_delay="600s" id="317075e9-dabb-4923-9129-be16882f94a4"/>
>>   <op name="start" interval="900s" timeout="600s" start_delay="10s" id="ab407ca5-78e4-48e9-bee0-f70f64d011e4"/>
>>   <op name="stop" interval="10s" timeout="600s" start_delay="10s" id="33d5a72d-105c-4da0-99cb-b25de520a5ae"/>
>>  </operations>
>> </primitive>
>
> The start_delay is not needed and may confuse everybody when
> debugging the timing issues. If it is needed then the resource
> agent's start action is broken.

I wasn't originally using start_delay, that was just an attempt to get it to do 
something other than what it's doing now. I'll take it back out again since it 
isn't doing anything helpful.


>> Note the times shown for (Start and (Stop. They are
>> 15:17:35 and 15:17:57. Only 22 seconds have elapsed since
>> Start was called, and now Stop is being called.
>>
>> My understanding is that the OCF script is working correctly.
>> Its job is to start the resource, and to wait until it can
>> verify that the resource is running before returning to
>> Heartbeat2.
>
> So far so good.
>
>> The jboss OCF script is doing this, but Heartbeat2
>> isn't waiting for the script command to start to return. I
>> have not, so far, found any way to influence this. As you can
>> see from the operations block in the CIB, I have been
>> cranking up the values to interval, timeout, and start_delay,
>> for the monitor, start and stop operations. None of these
>> changes seems to have any effect on what Heartbeat2 actually
>> does. If Start hasn't returned successful within about 20
>> seconds, Heartbeat2 considers it to have timed out and kills
>> it.
>>
>> What am I missing here?
>
> It doesn't look like you're missing anything. If the lrmd
> considers the operation timed out in spite of a different timeout
> specified for the operation, then there seems to be a bug. Though
> I think that timeouts did work properly in Heartbeat 2.1.4. Since
> this is SLES10 and the old version of Heartbeat, you should open
> a support call with Novell.


Ok, thanks for the help Dejan. I wasn't sure if I was going crazy or if this is 
a bug. With confirmation that I'm not going crazy, or at least that this isn't 
evidence that I'm going crazy, I'll get a somewhat simpler test case put 
together, check to make sure Novell haven't already fixed this, and I'll get an 
incident open with them to get it fixed.






[Linux-HA] Help needed with JBoss (OCF) and Heartbeat2

2010-05-10 Thread David Gersic
I'm not entirely new to Heartbeat2, but I've run into something here that I
have not been able to figure out. What I'm trying to do is create a JBoss
resource, as part of a resource group (disk, ip, mysql, jboss), for an
application. I have the disk, ip, and MySQL resources working; it's just the
JBoss resource that's proving to be more difficult than expected. This
particular JBoss application takes a while to get fully started, which is where
I think I'm running into trouble. More on that below.


The servers are SLES10, and I'm using Heartbeat2:

Linux sles10-3 2.6.16.60-0.34-default #1 Fri Jan 16 14:59:01 UTC 2009 i686 i686 
i386 GNU/Linux

heartbeat-2.1.4-0.11



I've defined my JBoss application resource:

<primitive class="ocf" type="jboss" provider="heartbeat" is_managed="true" id="JBoss_4">
 <instance_attributes id="JBoss_4_instance_attrs">
  <attributes>
   <nvpair name="resource_name" value="IDMProv" id="4609063b-c767-4956-a2f9-f44f46b634a9"/>
   <nvpair name="console" value="/shared/uadisk/rbpm37/jboss.log" id="a78e869b-2b00-474c-987e-5919c1ce80e7"/>
   <nvpair name="shutdown_timeout" value="60" id="16812ae8-4f3c-46df-bdbe-24f7e0fcd557"/>
   <nvpair name="user" value="rbpm" id="f8ccf1ed-c701-4dc0-b8e5-d66c779a8b9f"/>
   <nvpair name="statusurl" value="http://131.156.12.4:8080/IDMProv" id="11d32111-4e2b-4224-99d2-20af4eb43eb8"/>
   <nvpair name="java_home" value="/usr/java/jre1.6.0_18/" id="2881e29c-4b87-4094-bffc-d0d9e9682e16"/>
   <nvpair name="jboss_home" value="/shared/uadisk/rbpm37/jboss" id="d21e26f2-3e2f-4a98-9ecd-8f934b844434"/>
   <nvpair name="run_opts" value="-c IDMProv -b 0.0.0.0" id="681efa45-d7f3-4ef0-b90d-5d652b6480d6"/>
   <nvpair name="shutdown_opts" value="-S" id="f10343d3-28cc-4497-b234-e4678dda5818"/>
  </attributes>
 </instance_attributes>
 <operations>
  <op name="monitor" interval="10s" timeout="600s" start_delay="600s" id="317075e9-dabb-4923-9129-be16882f94a4"/>
  <op name="start" interval="900s" timeout="600s" start_delay="10s" id="ab407ca5-78e4-48e9-bee0-f70f64d011e4"/>
  <op name="stop" interval="10s" timeout="600s" start_delay="10s" id="33d5a72d-105c-4da0-99cb-b25de520a5ae"/>
 </operations>
</primitive>

This has gone through many iterations over the last few days. This is what's 
currently in the CIB.


This particular Heartbeat2 version didn't include a JBoss OCF script, but I 
obtained this one from the list archives:

#!/bin/sh
#
# Description:  Manages a Jboss Server as an OCF High-Availability
#   resource under Heartbeat/LinuxHA control
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  
# 02110-1301, USA.
#
# Copyright (c) 2009 Bauer Systems KG / Stefan Schluppeck
#
#
# OCF parameters:
#   OCF_RESKEY_resource_name - The name of the resource. Default is ${OCF_RESOURCE_INSTANCE}
# why not let the RA log through lrmd?
# 2009/09/09 Nakahira:
# jboss_console is used to record output of the run.sh.
# The log of Run.sh should not be output to ha-log because it is so annoying.
#   OCF_RESKEY_console - A destination of the log of jboss run and shutdown script. Default is /var/log/${OCF_RESKEY_resource_name}.log
#   OCF_RESKEY_shutdown_timeout - Time-out at the time of the stop. Default is 5
#   OCF_RESKEY_kill_timeout - The re-try number of times awaiting a stop. Default is 10
#   OCF_RESKEY_user - A user name to start a JBoss. Default is root
#   OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
#   OCF_RESKEY_java_home - Home directory of the Java. Default is ${JAVA_HOME}
#   OCF_RESKEY_jboss_home - Home directory of Jboss. Default is None
# is it possible to devise this string from options? I'm afraid
# that allowing users to set this could be error prone.
# 2009/09/09 Nakahira:
# It is difficult to set it automatically because jboss_pstring
# greatly depends on the environment. At any rate, system architect
# should note that pstring doesn't influence other processes.
#   OCF_RESKEY_pstring - String Jboss will found in procceslist. Default is java -Dprogram.name=run.sh
#   OCF_RESKEY_run_opts - Options for jboss to run. Default is -c default -l lpg4j
#   OCF_RESKEY_shutdown_opts - Options for jboss to shutdown. Default is -s 127.0.0.1:1099