Hi,
On Wed, Jun 18, 2008 at 08:55:23AM +0200, Dominik Klein wrote:
> Good morning Dejan
>
> thanks for your reply.
>
>>> I wrote an RA, which can start virtually anything in a very basic
>>> sense.
>> An interesting idea :)
>
> I wrote it because I have a LOT of custom programs to make HA and I needed
> something that I can throw a binfile and some command line parameters at.
> Once it worked the way I needed, I just thought it might
> be useful for others.
>
>>> The RA starts the command configured with $binfile and $cmdline_options
>>> as $user and redirects stdout and stderr to appropriate files.
>> This may not be necessary. Whatever comes out on stdout/stderr
>> will be logged by lrmd.
>
> It is necessary in my case.
>
> How about not redirecting anything if neither logfile nor errlogfile is
> set?
OK.
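Roughly like this, I suppose. Just a sketch, not tested in a cluster.
anything_redirect is a name I made up, and handling errlogfile alone is
my assumption, since your script ignores that case:

```shell
# Sketch only: print the redirection part of the start command.
# The two arguments stand for the optional logfile/errlogfile parameters.
anything_redirect() {
    logfile=$1
    errlogfile=$2
    if [ -n "$logfile" ] && [ -n "$errlogfile" ]; then
        # separate files for stdout and stderr
        echo ">> $logfile 2>> $errlogfile"
    elif [ -n "$logfile" ]; then
        # one file for both streams
        echo ">> $logfile 2>&1"
    elif [ -n "$errlogfile" ]; then
        # assumption: only stderr goes to a file, stdout is left alone
        echo "2>> $errlogfile"
    fi
    # neither set: print nothing and let lrmd log the output
}

# usage, with the variables from the agent:
#   redir=$(anything_redirect "$logfile" "$errlogfile")
#   cmd="su - $user -c \"nohup $binfile $cmdline_options $redir &\""
```

File names with spaces are not handled here.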
>>> It stops the command with kill. If kill does not work, it uses kill -9.
>> I guess that you mean kill without options which translates to
>> kill -TERM.
>
> Right.
>
>>> Monitors are done with ps. No deep check here but a pointer where to
>>> implement that if needed.
>> Perhaps implement a monitor script hook, such as the one in Xen.
>> That way one keeps the RA intact.
>
> I will look at that and see how that works.
>
>> Why not use the pid file to check if the process is running?
>> Did you check start-stop-daemon? I'm not sure if we can use it,
>> since it's Linux specific, but there are certainly a few good
>> tips :)
>
> AFAIK, not all programs write PID files. I.e., none of my custom programs
> does. If there's a way to generate one after starting $binfile - let me
> know.
Right. start-stop-daemon can create the PID file itself. When you
start a program in the background (with nohup ... &), $! contains
the PID of the process.
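For example (a sketch only; start_with_pidfile and its arguments are
names I made up for illustration):

```shell
# Sketch only: start a program in the background and record its PID.
# $1 is the PID file, the remaining arguments are the command to run.
start_with_pidfile() {
    pidfile=$1
    shift
    # nohup execs the command, so the PID does not change afterwards
    nohup "$@" >/dev/null 2>&1 &
    # $! is the PID of the most recent background job
    echo $! > "$pidfile"
}
```

If the command is wrapped in su -c "...", the echo $! has to happen
inside that quoted command, because only the shell started by su knows
the PID of the background job.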
>>> #!/bin/bash
>> Better to use #!/bin/sh. As of the next release, Debian will
>> distribute dash as the default shell. I think that Ubuntu already
>> does that.
>
> K.
>
>>> #
>>> # OCF Resource Agent compliant resource script.
>>> #
>>> # Copyright (c) 2008 IN-telegence GmbH & Co. KG, Dominik Klein
>>> # All Rights Reserved.
>>> #
>>> # This program is free software; you can redistribute it and/or modify
>>> # it under the terms of version 2 of the GNU General Public License as
>>> # published by the Free Software Foundation.
>>> #
>>> # This program is distributed in the hope that it would be useful, but
>>> # WITHOUT ANY WARRANTY; without even the implied warranty of
>>> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>> #
>>> # Further, this software is distributed without any warranty that it is
>>> # free of the rightful claim of any third person regarding infringement
>>> # or the like. Any license provided herein, whether implied or
>>> # otherwise, applies only to this software file. Patent licenses, if
>>> # any, provided herein do not apply to combinations of this program with
>>> # other software, or any other product whatsoever.
>>> #
>>> # You should have received a copy of the GNU General Public License
>>> # along with this program; if not, write the Free Software Foundation,
>>> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
>>>
>>> # OCF instance parameters
>>> # OCF_RESKEY_binfile
>>> # OCF_RESKEY_cmdline_options
>>> # OCF_RESKEY_logfile
>>> # OCF_RESKEY_errlogfile
>>> # OCF_RESKEY_user
>>>
>>> # Initialization:
>>> . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>>
>>> anything_status() {
>>> [ -n "$cmdline_options" ] && cmd="$binfile $cmdline_options" ||
>>> cmd="$binfile"
>>> if pgrep -u $user -f "$cmd" > /dev/null 2>&1
>> function number { # make sure that the file contains a number
>> grep '^[0-9][0-9]*$' $1
>> }
>> if test -f $PIDFILE && pid=`number $PIDFILE` && kill -0 $pid
>
> where/how do you get $PIDFILE?
>
> And it should be quoted I guess.
PIDFILE could be supplied in the CIB. Or use some scheme with the
resource id. There's a variable HA_RSCTMP containing the directory
where such files may be stored.
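Something along these lines, as an untested sketch. OCF_RESKEY_pidfile
is a hypothetical parameter name, and the fallback values are there only
so that the snippet runs outside the cluster:

```shell
# Sketch only. Fallbacks for running outside of heartbeat:
: "${HA_RSCTMP:=/tmp}"
: "${OCF_RESOURCE_INSTANCE:=anything_test}"
: "${OCF_SUCCESS:=0}"
: "${OCF_NOT_RUNNING:=7}"

# OCF_RESKEY_pidfile is a hypothetical parameter; without it, derive
# the file name from the resource id.
PIDFILE=${OCF_RESKEY_pidfile:-$HA_RSCTMP/anything_$OCF_RESOURCE_INSTANCE.pid}

anything_status() {
    # the file must exist and contain nothing but a number
    [ -f "$PIDFILE" ] || return "$OCF_NOT_RUNNING"
    pid=$(grep '^[0-9][0-9]*$' "$PIDFILE") || return "$OCF_NOT_RUNNING"
    # kill -0 sends no signal, it only tests whether the process exists
    if kill -0 "$pid" 2>/dev/null; then
        return "$OCF_SUCCESS"
    fi
    return "$OCF_NOT_RUNNING"
}
```

Note that kill -0 also fails if the process belongs to another user and
the agent does not run as root.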
>>> then
>>> return $OCF_RUNNING
>>> else
>>> return $OCF_NOT_RUNNING
>>> fi
>>> }
>>>
>>> anything_start() {
>>> if ! anything_status
>>> then
>>> if [ -n "$logfile" -a -n "$errlogfile" ]
>>> then
>>> # We have logfile and errlogfile, so redirect STDOUT and STDERR to
>>> # different files
>>> cmd="su - $user -c \"nohup $binfile $cmdline_options >> $logfile 2>> $errlogfile &\""
>>> else if [ -n "$logfile" ]
>>> then
>>> # We only have logfile so redirect STDOUT and STDERR to the same file
>>> cmd="su - $user -c \"nohup $binfile $cmdline_options >> $logfile 2>&1 &\""
>>> else
>>> # We have neither logfile nor errlogfile, so redirect STDOUT and
>>> # STDERR to a generic logfile
>>> cmd="su - $user -c \"nohup $binfile $cmdline_options >> /var/log/$(basename $binfile)\_$(date +%Y.%m.%d) 2>&1 &\""
>> As I said above, I'd leave at least this part out. Perhaps also
>> all the logfile business.
>
> See my suggestion above. I think that's the most flexible way.
>
>>> fi
>>> fi
>>> ocf_log debug "Starting $process: $cmd"
>>> # Execute the command as created above
>>> eval $cmd
>>> if anything_status
>>> then
>>> ocf_log debug "$process: $cmd started successfully"
>>> return $OCF_SUCCESS
>>> else
>>> ocf_log err "$process: $cmd could not be started"
>>> return $OCF_ERR_GENERIC
>>> fi
>>> else
>>> # If already running, consider start successful
>>> ocf_log debug "$process: $cmd is already running"
>>> return $OCF_SUCCESS
>>> fi
>>> }
>>>
>>> anything_stop() {
>>> if anything_status
>>> then
>>> tries=5
>>> i=0
>>> while [ $i -lt $tries ]
>>> do
>>> # there may be programs without command line options
>>> [ -n "$cmdline_options" ] && cmd="$binfile $cmdline_options" ||
>>> cmd="$binfile"
>>> pkill -u $user -f "$cmd"
>> It should be enough to send the signal once. So, pkill should be
>> moved out of the loop.
>
> Right.
>
>>> sleep 1
>>> if ! anything_status
>>> then
>>> return $OCF_SUCCESS
>>> fi
>>> let "i++"
>>> done
>> It is arguably wrong to limit the time to stop. One should let
>> the user do that by specifying the operation timeout. OTOH, I
>> believe that most resource agents are doing this in a similar
>> way, which still doesn't make it right. Then again, the agent
>> should try 'kill -9' eventually. To make this perfect, one would
>> need to define two timeouts: one for regular stop and one for
>> oh-yes-you'll-be-stopped-no-matter-what.
>
> So how about adding "tries", with a more intuitive name, as a configuration
> option? It could default to 10000 or something and be overridden if it is
> set. While unset, kill -9 would never be used. If set, people need to set
> it lower than the stop timeout to make it work. I guess from inside the RA,
> the timeout is not visible, so it should be made clear in the header of the
> script and in the meta-data.
Yes, that could work.
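A rough, untested sketch of how that might look. stop_pid and
force_after are names I made up; force_after stands for the new
optional parameter (seconds to wait before SIGKILL):

```shell
# Sketch only: send SIGTERM once, then wait for the process to go away.
# $1 is the PID, $2 the number of seconds after which SIGKILL is sent.
# With an empty $2, SIGKILL is never used and the loop runs until the
# process exits or the stop operation timeout kills the agent.
stop_pid() {
    pid=$1
    force_after=$2
    waited=0
    kill "$pid" 2>/dev/null              # SIGTERM, sent only once
    while kill -0 "$pid" 2>/dev/null; do
        if [ -n "$force_after" ] && [ "$waited" -ge "$force_after" ]; then
            kill -9 "$pid" 2>/dev/null
            force_after=                 # send SIGKILL only once, too
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

That way the regular stop is limited only by the operation timeout, and
the forced stop happens only if somebody asks for it.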
>>> # one last attempt with sigkill
>>> ocf_log warn "Stop $process: Looks like $process could not be stopped by SIGTERM, now sending SIGKILL"
>>> ocf_log warn "$(pgrep -u $user -f \"$cmd\" -l)"
>>> pkill -u $user -9 -f "$cmd"
>>> if ! anything_status
>>> then
>>> ocf_log debug "Stop $process: Seems like SIGKILL did
>>> the job"
>>> ocf_log debug "$(pgrep -u $user -f \"$cmd\" -l)"
>>> return $OCF_SUCCESS
>>> else
>>> ocf_log err "Stop $process: failed"
>>> return $OCF_ERR_GENERIC
>>> fi
>>> else
>>> # was not running, so stop can be considered successful
>>> return $OCF_SUCCESS
>>> fi
>>> return $OCF_ERR_GENERIC
>>> }
>>>
>>> anything_monitor() {
>>> anything_status
>>> ret=$?
>>> if [ $ret -eq $OCF_SUCCESS ]
>>> then
>>> # implement your deeper monitor operation here
>> if [ -n "$OCF_RESKEY_monitor_hook" ]; then
>> eval "$OCF_RESKEY_monitor_hook"
>> else
>> true
>> fi
>
> Oh, you already did what I wanted to look up. Thanks :)
>
>>> return $OCF_SUCCESS
>> return # $? is implied
>
> See, learned something - again :)
:) I edited it because it should return the exit code of the
monitor_hook script. Otherwise, return $OCF_SUCCESS would be OK.
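Put together, the monitor function could look roughly like this. Again
only a sketch; the anything_status below is a stand-in so that the
snippet runs on its own:

```shell
# Sketch only. Fallback so the snippet runs outside of heartbeat:
: "${OCF_SUCCESS:=0}"

# stand-in for the real status function of the agent
anything_status() { return "$OCF_SUCCESS"; }

anything_monitor() {
    # not running: pass the exit code of anything_status on
    anything_status || return
    if [ -n "${OCF_RESKEY_monitor_hook:-}" ]; then
        eval "$OCF_RESKEY_monitor_hook"
        return       # $? of the hook is implied
    fi
    return "$OCF_SUCCESS"
}
```

The hook should then exit with one of the OCF return codes.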
Cheers,
Dejan
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems