Hi Dejan,
Dejan Muhamedagic wrote:
Hi Kazutomo-san,
On Fri, Nov 13, 2009 at 07:14:18PM +0900, NAKAHIRA Kazutomo wrote:
Hi, Dejan and Raoul
I hope you will forgive me for being so slow to answer.
# I have some other work and it takes time.
That's fine.
[...]
When sourcing ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
you should use '.' not '..'.
I wrote '.' as the source command in the original file,
but it was changed to '..' automatically by my mailer
or our company's mail server. (It is very strange.)
Most funny. Attachments really shouldn't be messed with.
Please change '..' back to '.' if it gets altered again.
OK.
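For readers unfamiliar with the '.' command, here is a minimal sketch of what it does. The temporary file and the demo_log function are hypothetical stand-ins for .ocf-shellfuncs and its helpers; they are not part of the RA.

```shell
#!/bin/bash
# Hypothetical stand-in for .ocf-shellfuncs: a temp file defining one function.
funcs=$(mktemp)
cat > "$funcs" <<'EOF'
demo_log() { echo "demo: $*"; }
EOF

# '.' reads the file into the current shell, so demo_log is defined afterwards.
# '..' is not a shell command; it would be looked up as a program and fail.
. "$funcs"
rm -f "$funcs"

sourced_output=$(demo_log "sourced with dot")
echo "$sourced_output"
```

Because the file is read into the current shell, the function stays available after the temporary file is removed.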
In the stop procedure, you use the QUIT signal. That's going to
produce a coredump of the process. Is that actually intended? Why
not use KILL after TERM?
Using "kill -QUIT" is actually intended in this RA.
The QUIT signal does not stop a JVM process, but
What does the JVM have to do with syslog?
for an ordinary Linux process the QUIT signal stops the target process.
From signal(7):
SIGQUIT 3 Core Quit from keyboard
i.e. there could be core dumps and I'm not sure if that's
what you intend.
What I intend is that if the syslog-ng process does not stop
after "kill -TERM", the RA then tries to dump core and stop
the process with "kill -QUIT".
The syslog-ng RA's stop sequence is as follows.
1. Execute "kill -TERM" and wait up to KILL_TERM_TIMEOUT seconds
for the syslog-ng process to stop.
2. If the syslog-ng process has not stopped, execute
"kill -QUIT" up to KILL_QUIT_TIMEOUT times at intervals of
1 second until the syslog-ng process stops.
3. If the syslog-ng process is still alive, execute
"kill -KILL" at intervals of 1 second
until the syslog-ng process stops.
One KILL should be enough as that is actually not delivered to
the process at all, but the process gets removed from the system,
unless it's in the D state, i.e. waiting for some device. But
that's not really important.
That's for sure. Retrying "kill -QUIT" is redundant,
and so is KILL_QUIT_TIMEOUT.
I revised these parts; the syslog-ng RA's stop sequence is now as follows.
1. Execute "kill -TERM" and wait up to KILL_TERM_TIMEOUT seconds
for the syslog-ng process to stop.
2. If the syslog-ng process has not stopped, execute
"kill -QUIT" once to dump core and stop the process.
3. If the syslog-ng process is still alive, execute
"kill -KILL" at intervals of 1 second
until the syslog-ng process stops.
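The revised three-step sequence can be sketched in isolation as follows. This is only an illustration under stated assumptions: send_signal, is_running and pause are hypothetical hooks that simulate a process, where a real agent would call pkill, pgrep and "sleep 1" against its process pattern.

```shell
#!/bin/bash
# Hypothetical hooks that simulate one process; a real agent would use
# pkill/pgrep -f "$PROCESS_PATTERN" and "sleep 1" instead.
send_signal() { SENT="$SENT $1"; [ "$1" = "$DIES_ON" ] && ALIVE=0; return 0; }
is_running()  { [ "$ALIVE" -eq 1 ]; }
pause()       { :; }

stop_sequence() {
	local term_timeout=$1 lapse=0

	# Step 1: TERM, then wait up to term_timeout rounds.
	send_signal TERM
	while is_running; do
		pause
		lapse=$((lapse + 1))
		if [ "$lapse" -ge "$term_timeout" ]; then
			# Step 2: a single QUIT (may dump core).
			send_signal QUIT
			break
		fi
	done

	# Step 3: KILL until the process is gone.
	while is_running; do
		send_signal KILL
		pause
	done
	return 0
}

# Simulate a process that ignores TERM and QUIT and only goes away on KILL.
ALIVE=1 DIES_ON=KILL SENT=""
stop_sequence 3
echo "signals sent:$SENT"
```

Setting DIES_ON=TERM instead simulates the normal case, in which only TERM is sent and neither loop body runs.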
On formatting: sometimes spaces are used and sometimes tabs for
indentation. Can you please use either one or the other
(preferably the latter).
I agree. I replaced all indentation spaces with tabs.
A re-revised syslog-ng RA is attached.
Best Regards,
NAKAHIRA Kazutomo
Dejan Muhamedagic wrote:
Hi,
On Tue, Nov 10, 2009 at 01:02:06PM +0100, Raoul Bhatia [IPAX] wrote:
On 09/21/2009 01:59 PM, Dejan Muhamedagic wrote:
Hi Kazutomo-san,
On Fri, Sep 18, 2009 at 05:19:28PM +0900, NAKAHIRA Kazutomo wrote:
Hi, Dejan
I'm sorry I didn't get back to you sooner as a JBoss RA.
I took over mori-san and takenaka-san's work.
I revised the syslog-ng RA according to your comments.
The modifications and my comments are written in the attached RA.
When sourcing ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
you should use '.' not '..'.
In the stop procedure, you use the QUIT signal. That's going to
produce a coredump of the process. Is that actually intended? Why
not use KILL after TERM?
On formatting: sometimes spaces are used and sometimes tabs for
indentation. Can you please use either one or the other
(preferably the latter).
hi,
what is the current status on this one?
Apparently waiting for some response from Kazutomo-san.
Thanks,
Dejan
cheers,
raoul
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bha...@ipax.at
Technischer Leiter
IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. off...@ipax.at
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
--
----------------------------------------
NAKAHIRA Kazutomo
NTT DATA INTELLILINK CORPORATION
Open Source Business Unit
Software Services Integration Business Division
#!/bin/bash
#
# Description: Manages a syslog-ng instance, provided by NTT OSSC as an
# OCF High-Availability resource under Heartbeat/LinuxHA control
#
# Copyright (c) 2009 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
#
##############################################################################
# OCF parameters:
# OCF_RESKEY_syslog_ng_binary : Path to syslog-ng binary.
# Default is "/sbin/syslog-ng"
# OCF_RESKEY_configfile : Configuration file
# OCF_RESKEY_start_opts : Startup options
# OCF_RESKEY_kill_term_timeout: Number of seconds to await to confirm a
# normal stop method
# OCF_RESKEY_kill_quit_timeout: Number of times to try forcible
# stop methods
#
# Only OCF_RESKEY_configfile must be specified. Each of the others
# has a default value or derives its value from OCF_RESKEY_configfile
# when no explicit value is given.
#
# Further information for setup:
# There are sample configurations at the end of this file.
#
###############################################################################
. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
usage()
{
cat <<-!
usage: $0 action
action:
start : start a new syslog-ng instance
stop : stop the running syslog-ng instance
status : return the status of syslog-ng, run or down
monitor : return TRUE if the syslog-ng appears to be working.
meta-data : show meta data message
validate-all: validate the instance parameters
!
return $OCF_ERR_ARGS
}
metadata_syslog_ng()
{
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="syslog_ng">
<version>1.0</version>
<longdesc lang="en">
This script manages a syslog-ng instance as an HA resource.
</longdesc>
<shortdesc lang="en">Syslog-ng resource agent</shortdesc>
<parameters>
<parameter name="syslog_ng_binary" unique="0">
<longdesc lang="en">
This parameter specifies syslog-ng's executable file.
</longdesc>
<shortdesc>Executable file</shortdesc>
<content type="string" default=""/>
</parameter>
<parameter name="configfile" unique="0" required="1">
<longdesc lang="en">
This parameter specifies a configuration file
for a syslog-ng instance managed by this RA.
</longdesc>
<shortdesc>Configuration file</shortdesc>
<content type="string" default=""/>
</parameter>
<parameter name="start_opts" unique="0">
<longdesc lang="en">
This parameter specifies startup options for a
syslog-ng instance managed by this RA. When no value is given, no startup
options are used. Don't use the '-F' option; it causes the start action to hang.
</longdesc>
<shortdesc>Start options</shortdesc>
<content type="string" default=""/>
</parameter>
<parameter name="kill_term_timeout" unique="0">
<longdesc lang="en">
On a stop action, a normal stop method (pkill -TERM) is used first.
The RA then waits for the process to stop for the number of seconds
specified by this parameter.
The default value is 10.
</longdesc>
<shortdesc>Number of seconds to wait for a normal stop to complete</shortdesc>
<content type="integer" default="10"/>
</parameter>
<parameter name="kill_quit_timeout" unique="0">
<longdesc lang="en">
On a stop action, if the normal stop method fails,
more forcible methods are used. These methods are repeated the
number of times specified by this parameter.
The default value is 10.
If every normal and forcible stop method fails,
the KILL signal is used as the final stop method.
</longdesc>
<shortdesc>Number of times to try forcible stop methods</shortdesc>
<content type="integer" default="10"/>
</parameter>
</parameters>
<actions>
<action name="start" timeout="60s" />
<action name="stop" timeout="120s" />
<action name="status" timeout="60" />
<action name="monitor" depth="0" timeout="30s" interval="10s" start-delay="0" />
Perhaps default interval to something like 60s? Don't know what's
the case in other RAs. Should probably be reviewed.
The default monitor interval in most RAs is "10s", and
the syslog-ng RA follows this value.
Of course, this value should be changed according to the system's
monitoring requirements.
<action name="meta-data" timeout="5s" />
<action name="validate-all" timeout="5"/>
</actions>
</resource-agent>
END
return $OCF_SUCCESS
}
monitor_syslog_ng()
{
set -- $(pgrep -f "$PROCESS_PATTERN" 2>/dev/null)
case $# in
0) ocf_log debug "No syslog-ng process for $CONFIGFILE"
return $OCF_NOT_RUNNING;;
1) return $OCF_SUCCESS;;
esac
ocf_log err "multiple syslog-ng processes for $CONFIGFILE"
BTW, does syslog-ng fork to process requests? Perhaps it's not
necessary to treat this as an error condition. Note that on start
it almost certainly does fork (most daemons do), so under some
unfavourable conditions this code may fail.
I agree. I revised this part as follows.
If multiple syslog-ng processes are found in monitor, the RA
logs a warning and returns OCF_SUCCESS.
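The revised monitor decision can be sketched on its own like this. classify_pids is a hypothetical helper that takes the matching PIDs as arguments; the two return codes are set by hand here, whereas the RA gets them from .ocf-shellfuncs.

```shell
#!/bin/bash
# Standard OCF return codes, defined here only to keep the sketch standalone.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# classify_pids PID...: hypothetical helper mapping the number of matching
# processes to a monitor result.
classify_pids() {
	case $# in
		0) return $OCF_NOT_RUNNING ;;
		1) return $OCF_SUCCESS ;;
	esac
	# More than one match is only worth a warning, e.g. right after a fork.
	echo "warning: $# matching syslog-ng processes" >&2
	return $OCF_SUCCESS
}

# In an agent the arguments would come from something like:
#   classify_pids $(pgrep -f "$PROCESS_PATTERN" 2>/dev/null)
```

Treating several matches as success avoids a false failure when the daemon has just forked during start.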
Cheers,
Dejan
Best Regards,
NAKAHIRA Kazutomo
--
----------------------------------------
NAKAHIRA Kazutomo
NTT DATA INTELLILINK CORPORATION
Open Source Business Unit
Software Services Integration Business Division
#!/bin/bash
#
# Description: Manages a syslog-ng instance, provided by NTT OSSC as an
# OCF High-Availability resource under Heartbeat/LinuxHA control
#
# Copyright (c) 2009 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
#
##############################################################################
# OCF parameters:
# OCF_RESKEY_syslog_ng_binary : Path to syslog-ng binary.
# Default is "/sbin/syslog-ng"
# OCF_RESKEY_configfile : Configuration file
# OCF_RESKEY_start_opts : Startup options
# OCF_RESKEY_kill_term_timeout: Number of seconds to await to confirm a
# normal stop method
#
# Only OCF_RESKEY_configfile must be specified. Each of the others
# has a default value or derives its value from OCF_RESKEY_configfile
# when no explicit value is given.
#
# Further information for setup:
# There are sample configurations at the end of this file.
#
###############################################################################
. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
usage()
{
cat <<-!
usage: $0 action
action:
start : start a new syslog-ng instance
stop : stop the running syslog-ng instance
status : return the status of syslog-ng, run or down
monitor : return TRUE if the syslog-ng appears to be working.
meta-data : show meta data message
validate-all: validate the instance parameters
!
return $OCF_ERR_ARGS
}
metadata_syslog_ng()
{
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="syslog_ng">
<version>1.0</version>
<longdesc lang="en">
This script manages a syslog-ng instance as an HA resource.
</longdesc>
<shortdesc lang="en">Syslog-ng resource agent</shortdesc>
<parameters>
<parameter name="syslog_ng_binary" unique="0">
<longdesc lang="en">
This parameter specifies syslog-ng's executable file.
</longdesc>
<shortdesc>Executable file</shortdesc>
<content type="string" default=""/>
</parameter>
<parameter name="configfile" unique="0" required="1">
<longdesc lang="en">
This parameter specifies a configuration file
for a syslog-ng instance managed by this RA.
</longdesc>
<shortdesc>Configuration file</shortdesc>
<content type="string" default=""/>
</parameter>
<parameter name="start_opts" unique="0">
<longdesc lang="en">
This parameter specifies startup options for a
syslog-ng instance managed by this RA. When no value is given, no startup
options are used. Don't use the '-F' option; it causes the start action to hang.
</longdesc>
<shortdesc>Start options</shortdesc>
<content type="string" default=""/>
</parameter>
<parameter name="kill_term_timeout" unique="0">
<longdesc lang="en">
On a stop action, a normal stop method (pkill -TERM) is used first.
The RA then waits for the process to stop for the number of seconds
specified by this parameter.
The default value is 10.
</longdesc>
<shortdesc>Number of seconds to wait for a normal stop to complete</shortdesc>
<content type="integer" default="10"/>
</parameter>
</parameters>
<actions>
<action name="start" timeout="60s" />
<action name="stop" timeout="120s" />
<action name="status" timeout="60" />
<action name="monitor" depth="0" timeout="30s" interval="10s" start-delay="0" />
<action name="meta-data" timeout="5s" />
<action name="validate-all" timeout="5"/>
</actions>
</resource-agent>
END
return $OCF_SUCCESS
}
monitor_syslog_ng()
{
set -- $(pgrep -f "$PROCESS_PATTERN" 2>/dev/null)
case $# in
0) ocf_log debug "No syslog-ng process for $CONFIGFILE"
return $OCF_NOT_RUNNING;;
1) return $OCF_SUCCESS;;
esac
ocf_log warn "Multiple syslog-ng processes for $CONFIGFILE"
return $OCF_SUCCESS
}
start_syslog_ng()
{
monitor_syslog_ng
if [[ $? = "$OCF_SUCCESS" ]]; then
return $OCF_SUCCESS
fi
# set -- $SYSLOG_NG_OPTS
# ocf_run "$SYSLOG_NG_EXE" -f "$SYSLOG_NG_CONF" "$@"
# reduce to this?
ocf_run "$SYSLOG_NG_EXE" -f "$CONFIGFILE" $START_OPTS
ocf_status=$?
if [[ "$ocf_status" != "$OCF_SUCCESS" ]]; then
return $ocf_status
fi
while true; do
monitor_syslog_ng
if [[ $? = "$OCF_SUCCESS" ]]; then
return $OCF_SUCCESS
fi
sleep 1
done
}
stop_syslog_ng()
{
pkill -TERM -f "$PROCESS_PATTERN"
typeset lapse_sec=0
while pgrep -f "$PROCESS_PATTERN" > /dev/null; do
sleep 1
lapse_sec=$(( lapse_sec + 1 ))
ocf_log debug "stop_syslog_ng[$SYSLOG_NG_NAME]: stop NORM $lapse_sec/$KILL_TERM_TIMEOUT"
if [ $lapse_sec -ge $KILL_TERM_TIMEOUT ]; then
ocf_log debug "stop_syslog_ng[$SYSLOG_NG_NAME]: suspend syslog_ng by SIGQUIT"
pkill -QUIT -f "$PROCESS_PATTERN"
break
fi
done
# if the process can't be removed, then the following part is
# not going to be executed (the RA will be killed by lrmd on
# timeout) and the pidfile will remain; don't know if that
# has any consequences
# 2009/09/18 Nakahira
# If the syslog-ng process hangs, syslog-ng RA waits $KILL_TERM_TIMEOUT
# seconds and tries kill QUIT.
# The stop timeout of RA should be longer than $KILL_TERM_TIMEOUT.
lapse_sec=0
while pgrep -f "$PROCESS_PATTERN" > /dev/null; do
pkill -KILL -f "$PROCESS_PATTERN"
sleep 1
lapse_sec=$(( lapse_sec + 1 ))
ocf_log debug "stop_syslog_ng[$SYSLOG_NG_NAME]: suspend syslog_ng by SIGKILL ($lapse_sec/@@@)"
done
return $OCF_SUCCESS
}
status_syslog_ng()
{
# ???? why not monitor and then print running or stopped
monitor_syslog_ng
rc=$?
if [ $rc = $OCF_SUCCESS ]; then
echo "Syslog-ng service is running."
elif [ $rc = $OCF_NOT_RUNNING ]; then
echo "Syslog-ng service is stopped."
else
echo "Multiple syslog-ng processes for $CONFIGFILE."
fi
return $rc
}
validate_all_syslog_ng()
{
ocf_log info "validate_all_syslog_ng[$SYSLOG_NG_NAME]"
return $OCF_SUCCESS
}
if [[ "$1" = "meta-data" ]]; then
metadata_syslog_ng
exit $?
fi
CONFIGFILE="${OCF_RESKEY_configfile}"
if [[ -z "$CONFIGFILE" ]]; then
ocf_log err "undefined parameter:configfile"
exit $OCF_ERR_CONFIGURED
fi
SYSLOG_NG_NAME=${CONFIGFILE##*/}
SYSLOG_NG_NAME=${SYSLOG_NG_NAME%.*}
SYSLOG_NG_EXE="${OCF_RESKEY_syslog_ng_binary-/sbin/syslog-ng}"
# why not default to /sbin/syslog-ng?
#if [[ -z "$SYSLOG_NG_EXE" ]]; then
# ocf_log err "Undefined parameter:syslog_ng_binary"
# exit $OCF_ERR_CONFIGURED
#fi
if [[ ! -x "$SYSLOG_NG_EXE" ]]; then
ocf_log err "Invalid value:syslog_ng_binary:$SYSLOG_NG_EXE"
exit $OCF_ERR_CONFIGURED
fi
# actually, the pidfile has no function; the status is checked by
# testing for a running process only
KILL_TERM_TIMEOUT="${OCF_RESKEY_kill_term_timeout-10}"
if ! ocf_is_decimal "$KILL_TERM_TIMEOUT"; then
ocf_log err "Invalid value:kill_term_timeout:$KILL_TERM_TIMEOUT"
exit $OCF_ERR_CONFIGURED
fi
START_OPTS=${OCF_RESKEY_start_opts}
PROCESS_PATTERN="$SYSLOG_NG_EXE -f $CONFIGFILE"
COMMAND=$1
case "$COMMAND" in
start)
ocf_log debug "[$SYSLOG_NG_NAME] Enter syslog_ng start"
start_syslog_ng
func_status=$?
ocf_log debug "[$SYSLOG_NG_NAME] Leave syslog_ng start $func_status"
exit $func_status
;;
stop)
ocf_log debug "[$SYSLOG_NG_NAME] Enter syslog_ng stop"
stop_syslog_ng
func_status=$?
ocf_log debug "[$SYSLOG_NG_NAME] Leave syslog_ng stop $func_status"
exit $func_status
;;
status)
status_syslog_ng
exit $?
;;
monitor)
#ocf_log debug "[$SYSLOG_NG_NAME] Enter syslog_ng monitor"
monitor_syslog_ng
func_status=$?
#ocf_log debug "[$SYSLOG_NG_NAME] Leave syslog_ng monitor $func_status"
exit $func_status
;;
validate-all)
validate_all_syslog_ng
exit $?
;;
*)
usage
;;
esac
# vim: set sw=4 ts=4 :
### A sample snippet of cib.xml for a syslog-ng resource
##
# <primitive id="prmApSyslog-ng" class="ocf" type="syslog-ng" provider="heartbeat">
# <instance_attributes id="prmDummyB_instance_attrs">
# <attributes>
# <nvpair id="atr:Syslog-ng:syslog-ng:syslog_ng_binary" name="syslog_ng_binary" value="/sbin/syslog-ng"/>
# <nvpair id="atr:Syslog-ng:syslog-ng:configfile" name="configfile" value="/etc/syslog-ng/syslog-ng-ext.conf"/>
# </attributes>
# </instance_attributes>
# <operations>
# <op id="op:prmSyslog-ng:start" name="start" timeout="60s" on_fail="restart"/>
# <op id="op:prmSyslog-ng:monitor" name="monitor" interval="10s" timeout="60s" on_fail="restart"/>
# <op id="op:prmSyslog-ng:stop" name="stop" timeout="60s" on_fail="block"/>
# </operations>
# </primitive>
### A sample syslog-ng configuration file for a log collecting host
###
### This sample is for a log collecting host by syslog-ng.
### A syslog-ng process configured by this sample accepts all messages
### from a certain network. Any message from the network is saved to
### a file for security information. Restricting messages to "authpriv" from
### the network is done on the log sending hosts. (See the sample below)
### Any internal message of the syslog-ng process is saved to its
### dedicated file, and any "authpriv" internal message of the syslog-ng
### process is also saved to the security information file.
###
### Change "f_incoming" to suit your environment.
### If you use it as a configuration file for the sample cib.xml above,
### save it into "/etc/syslog-ng/syslog-ng-ext.conf".
##
#options {
# sync (0);
# time_reopen (10);
# log_fifo_size (1000);
# long_hostnames (off);
# use_dns (yes);
# use_fqdn (no);
# create_dirs (no);
# keep_hostname (yes); };
#
#source s_internal { internal(); };
#source s_incoming { udp(port(514)); };
#filter f_internal { facility(authpriv); };
#filter f_incoming { netmask("172.20.0.0/255.255.192.0"); };
#
#destination d_internal { file("/var/log/syslog-ng-ext.log" perm(0640));};
#destination d_incoming {
# file("/var/log/secure-ext.log" create_dirs(yes) perm(0640)); };
#
#log { source(s_internal); destination(d_internal); };
#log { source(s_internal); filter(f_internal); destination(d_incoming); };
#log { source(s_incoming); filter(f_incoming); destination(d_incoming); };
### A sample snippet of syslog-ng configuration file for a log sending host
###
### This sample is for a log sending host that uses syslog-ng.
###
### Replace "syslog-ng-ext" with the IP address or the hostname of your
### log collecting host and append it to "syslog-ng.conf" of each log sending
### host. See the default installed syslog-ng.conf to learn what "s_sys" and
### "f_auth" are.
##
#destination d_outgoing { udp("syslog-ng-ext" port(514)); };
#log { source(s_sys); filter(f_auth); destination(d_outgoing); };
### A sample snippet of syslog configuration file for a log sending host
###
### This sample is for a log sending host that uses syslog.
###
### Replace "syslog-ng-ext" with the IP address or the hostname of your
### log collecting host and append it to "syslog.conf" of each log sending
### host.
##
# authpriv.* @syslog-ng-ext