Hi Raoul,

On Wed, Jul 15, 2009 at 03:42:16PM +0200, Raoul Bhatia [IPAX] wrote:
> hi dejan,
> 
> sorry for the late reply, i've been on vacation and am still catching
> up at work.
> 
> thanks for your feedback, please see below.
> 
> Dejan Muhamedagic wrote:
> > Hi Raoul,
> > 
> > Sorry for the delay, somehow I missed the last two messages.
> > 
> > On Tue, Jun 23, 2009 at 02:57:52PM +0200, Raoul Bhatia [IPAX] wrote:
> >> Raoul Bhatia [IPAX] wrote:
> >>> i'm reworking my script right now. commenting inline.
> >> i just finished updating the postfix ocf ra and am summarizing the
> >> changes:
> >>
> >> * isRunning() stays as this is also used in other ras
> >> * i left running() as well (where i check the master.pid file)
> >> but am ready to rewrite it to use "postqueue -p" or "postfix status"
> >> in addition or exclusively - waiting for your feedback
> > 
> > In addition to testing for the pidfile, you could also check if
> > there's a process holding the spool directory, something like:
> > 
> > rondo:~ # postconf -h queue_directory
> > /var/spool/postfix
> > rondo:~ # fuser /var/spool/postfix/
> > /var/spool/postfix:   5332c  5365c  8313c
> > 
> > Perhaps:
> > 
> > rondo:~ # fuser -v /var/spool/postfix/ 2>&1 | grep -w master
> > /var/spool/postfix:  root       5332 ..c.. master
> 
> i'm now checking more indepth for:
> 1. empty queue_directory

For monitor? Why?

> 2. pidfile
> 3. "postfix status"
> 4. postqueue ... | grep 'Mail system is down'
> 5. fuser -v $queue
> 
> is this ok? feel free to remove some checks

It should be enough just to check for the process: first using the
pidfile and, if that doesn't work, then with fuser.
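
Roughly what I have in mind, as an untested sketch. The function names
are made up for illustration, and I'm assuming master.pid may contain
padding blanks (your earlier sed call suggests so), hence the tr:

```shell
#!/bin/sh
# Sketch only: look for the Postfix master process, pidfile first,
# fuser as the fallback. All function names here are illustrative.

# Succeeds if the pidfile exists and names a live process.
pidfile_alive() {
    [ -f "$1" ] || return 1
    # strip blanks and the trailing newline around the pid
    pid=`tr -d '[:space:]' < "$1"`
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    kill -0 "$pid" 2>/dev/null
}

# Succeeds if a process named "master" holds the given directory.
dir_held_by_master() {
    fuser -v "$1" 2>&1 | grep -w master >/dev/null
}

# Combined check: $1 = queue directory, $2 = pidfile
postfix_master_running() {
    pidfile_alive "$2" || dir_held_by_master "$1"
}
```

In the agent this would be called with the values postconf reports for
queue_directory and process_id_directory, e.g.
postfix_master_running "$queue" "$queue/$pid_dir/master.pid".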

> >> * i removed $() bashism
> >> * removed "pid=$(sed 's/ //g' ${queue}/pid/master.pid)"
> >> * as of now, removed the postfix_monitor check on "stop"
> >> * waiting 5 seconds for postfix shutdown, then escalating to "abort"
> >> * removed exits inside the functions and replaced it with return.
> >>
> >> did i miss something from your feedback?
> >> do you have any further comments?
> > 
> > Lars said:
> > 
> >>> if postconf -h queue_directory does not work, this is a broken
> >>> installation and should IMO not provide any other "default"
> >>> value.
> > 
> > and I'd agree with this. It's really important that resources are
> > properly configured.
> 
> i'm catching this now but am not sure if i'm correctly handling this
> case in "isRunning()".

Just check if that returns a valid directory?

> maybe checking this inside validate_all() is
> good enough?

Not sure. Your preference :)
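
Wherever you put it, the test itself could be as small as this
(valid_queue_dir is just a name I picked, and the error code in the
usage comment is only a suggestion):

```shell
#!/bin/sh
# Sketch: accept a queue_directory value only if it is non-empty and
# names an existing directory. valid_queue_dir is an illustrative name.
valid_queue_dir() {
    [ -n "$1" ] && [ -d "$1" ]
}

# Possible use inside the agent (not run here, since it needs postconf):
#   queue=`postconf $OPTION_CONFIG_DIR -h queue_directory 2>/dev/null`
#   valid_queue_dir "$queue" || return $OCF_ERR_GENERIC
```

An empty string covers the case where postconf itself fails, so no
separate default value is needed.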

Thanks,

Dejan

> cheers,
> raoul
> -- 
> ____________________________________________________________________
> DI (FH) Raoul Bhatia M.Sc.          email.          [email protected]
> Technischer Leiter
> 
> IPAX - Aloy Bhatia Hava OEG         web.          http://www.ipax.at
> Barawitzkagasse 10/2/2/11           email.            [email protected]
> 1190 Wien                           tel.               +43 1 3670030
> FN 277995t HG Wien                  fax.            +43 1 3670030 15
> ____________________________________________________________________

> #!/bin/sh
> #
> # Resource script for Postfix
> #
> # Description:  Manages Postfix as an OCF resource in
> #               a high-availability setup.
> #
> #               Tested with postfix 2.5.5 on Debian 5.0.
> #               Based on the mysql-proxy and mysql OCF resource agents.
> #
> # Author:       Raoul Bhatia <[email protected]> : Original Author
> # License:      GNU General Public License (GPL)
> # Note:         if you want to run multiple postfix instances, please see
> #               http://amd.co.at/adminwiki/Postfix#Adding_a_Second_Postfix_Instance_on_one_Server
> #               http://www.postfix.org/postconf.5.html
> #
> #
> #       usage: $0 {start|stop|reload|status|monitor|validate-all|meta-data}
> #
> #       The "start" arg starts a Postfix instance
> #
> #       The "stop" arg stops it.
> #
> #
> # Test via
> # * /usr/sbin/ocf-tester -n post1 /usr/lib/ocf/resource.d/heartbeat/postfix
> # * /usr/sbin/ocf-tester -n post1 -o binary="/usr/sbin/postfix" \
> #       -o config_dir="" /usr/lib/ocf/resource.d/heartbeat/postfix
> # * /usr/sbin/ocf-tester -n post1 -o binary="/usr/sbin/postfix" \
> #       -o config_dir="/root/postfix/" /usr/lib/ocf/resource.d/heartbeat/postfix
> #
> #
> # OCF parameters:
> #  OCF_RESKEY_binary
> #  OCF_RESKEY_config_dir
> #  OCF_RESKEY_parameters
> #
> ##########################################################################
> 
> # Initialization:
> 
> . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> 
> : ${OCF_RESKEY_binary="/usr/sbin/postfix"}
> : ${OCF_RESKEY_config_dir=""}
> : ${OCF_RESKEY_parameters=""}
> USAGE="Usage: $0 {start|stop|reload|status|monitor|validate-all|meta-data}";
> 
> ##########################################################################
> 
> usage() {
>     echo $USAGE >&2
> }
> 
> meta_data() {
>         cat <<END
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="postfix">
> <version>0.1</version>
> <longdesc lang="en">
> This script manages Postfix as an OCF resource in a high-availability setup.
> Tested with Postfix 2.5.5 on Debian 5.0.
> </longdesc>
> <shortdesc lang="en">OCF Resource Agent compliant Postfix script.</shortdesc>
> 
> <parameters>
> 
> <parameter name="binary" unique="0" required="0">
> <longdesc lang="en">
> Full path to the Postfix binary.
> For example, "/usr/sbin/postfix".
> </longdesc>
> <shortdesc lang="en">Full path to Postfix binary</shortdesc>
> <content type="string" default="/usr/sbin/postfix" />
> </parameter>
> 
> <parameter name="config_dir" unique="1" required="0">
> <longdesc lang="en">
> Full path to a Postfix configuration directory.
> For example, "/etc/postfix".
> </longdesc>
> <shortdesc lang="en">Full path to configuration directory</shortdesc>
> <content type="string" default="" />
> </parameter>
> 
> <parameter name="parameters" unique="0" required="0">
> <longdesc lang="en">
> The Postfix daemon may be called with additional parameters.
> Specify any of them here.
> </longdesc>
> <shortdesc lang="en"></shortdesc>
> <content type="string" default="" />
> </parameter>
> 
> </parameters>
> 
> <actions>
> <action name="start"   timeout="90" />
> <action name="stop"    timeout="100" />
> <action name="reload"  timeout="100" />
> <action name="monitor" depth="10"  timeout="20s" interval="60s" start-delay="0" />
> <action name="validate-all"  timeout="30s" />
> <action name="meta-data"  timeout="5s" />
> </actions>
> </resource-agent>
> END
> }
> 
> isRunning()
> {
>     kill -0 "$1" 2>/dev/null
> }
> 
> # running() has been copied from debian's init script. we enhanced it a bit
> # @TODO rb 2009-06-23 maybe try "postqueue -p 2>&1 | head -n1 | grep 'Mail system is down' && false
> # @TODO rb 2009-06-23 maybe try "$binary $OPTIONS status" instead?
> running() {
>     queue=`postconf $OPTION_CONFIG_DIR -h queue_directory 2>/dev/null`
>     # an empty queue_directory means postconf failed; treat as not running
>     # @TODO shall we return false or $OCF_ERR_something
>     [ -z "$queue" ] && return 1
>     pid_dir=`postconf $OPTION_CONFIG_DIR -h process_id_directory 2>/dev/null`
>     pidfile="${queue}/${pid_dir}/master.pid"
> 
>     if [ -f "${pidfile}" ]; then
>         # @TODO Could the master process become zombie?
>         pid=`cat ${pidfile}`
>         if isRunning $pid; then
>             # @TODO why does "true" not work here?
>             #true
>             return $OCF_SUCCESS
>         fi
>     fi
> 
>     # try some different methods to see if we can find a running postfix/master instance
>     # postfix status
>     $binary $OPTION_CONFIG_DIR status && return $OCF_SUCCESS
> 
>     # what does postqueue say?
>     echo postqueue $OPTION_CONFIG_DIR -p 2>&1
>     postqueue $OPTION_CONFIG_DIR -p 2>&1 | head -n1 | grep 'Mail system is down' && return 1
> 
>     # is there a master process holding the spool directory?
>     fuser -v $queue 2>&1 | grep -w master && return $OCF_SUCCESS
> 
> 
>     # Postfix is not running
>     false
> }
> 
> 
> postfix_status()
> {
>     running
> }
> 
> postfix_start()
> {
>     # if Postfix is running return success
>     if postfix_status; then
>         ocf_log info "Postfix already running."
>         return $OCF_SUCCESS
>     fi
> 
>     # start Postfix
>     $binary $OPTIONS start >/dev/null 2>&1
>     ret=$?
> 
>     if [ $ret -ne 0 ]; then
>         ocf_log err "Postfix returned error." $ret
>         return $OCF_ERR_GENERIC
>     fi
> 
>     return $OCF_SUCCESS
> }
> 
> 
> postfix_stop()
> {
>     $binary $OPTIONS stop >/dev/null 2>&1
>     ret=$?
> 
>     if [ $ret -ne 0 ]; then
>         ocf_log err "Postfix returned an error while stopping." $ret
>         return $OCF_ERR_GENERIC
>     fi
> 
>     # grant some time for shutdown and recheck 5 times
>     for i in 1 2 3 4 5; do
>         if postfix_status; then
>             sleep 1
>         fi
>     done
> 
>     # escalate to abort if we did not stop by now
>     # @TODO shall we loop here too?
>     if postfix_status; then
>         ocf_log err "Postfix failed to stop. Escalating to 'abort'"
> 
>         $binary $OPTIONS abort >/dev/null 2>&1; ret=$?
>         sleep 5
>         postfix_status && return $OCF_ERR_GENERIC
>     fi
> 
>     return $OCF_SUCCESS
> }
> 
> postfix_reload()
> {
>     if postfix_status; then
>         ocf_log info "Reloading Postfix."
>         $binary $OPTIONS reload
>     fi
> }
> 
> postfix_monitor()
> {
>     if postfix_status; then
>         return $OCF_SUCCESS
>     fi
> 
>     return $OCF_NOT_RUNNING
> }
> 
> postfix_validate_all()
> {
>     # check that the Postfix binary exists and can be executed
>     if [ ! -x "$binary" ]; then
>         ocf_log err "Postfix binary '$binary' does not exist or cannot be executed."
>         return $OCF_ERR_GENERIC
>     fi
> 
>     # check config_dir and alternate_config_directories parameter
>     if [ "x$config_dir" != "x" ]; then
>         if [ ! -d "$config_dir" ]; then
>             ocf_log err "Postfix configuration directory '$config_dir' does not exist." $ret
>             return $OCF_ERR_GENERIC
>         fi
> 
>         alternate_config_directories=`postconf -h alternate_config_directories 2>/dev/null | grep $config_dir`
>         if [ "x$alternate_config_directories" = "x" ]; then
>             ocf_log err "Postfix main configuration must contain correct 'alternate_config_directories' parameter."
>             return $OCF_ERR_GENERIC
>         fi
>     fi
> 
>     # check spool/queue directory
>     queue=`postconf $OPTION_CONFIG_DIR -h queue_directory 2>/dev/null`
>     if [ ! -d "$queue" ]; then
>         ocf_log err "Postfix spool/queue directory '$queue' does not exist." $ret
>         return $OCF_ERR_GENERIC
>     fi
> 
>     # run postfix internal check
>     $binary $OPTIONS check >/dev/null 2>&1
>     ret=$?
>     if [ $ret -ne 0 ]; then
>         ocf_log err "Postfix 'check' failed." $ret
>         return $OCF_ERR_GENERIC
>     fi
> 
>     return $OCF_SUCCESS
> }
> 
> #
> # Main
> #
> 
> if [ $# -ne 1 ]; then
>     usage
>     exit $OCF_ERR_ARGS
> fi
> 
> binary=$OCF_RESKEY_binary
> config_dir=$OCF_RESKEY_config_dir
> parameters=$OCF_RESKEY_parameters
> 
> # debugging stuff
> #echo OCF_RESKEY_binary=$OCF_RESKEY_binary >> /tmp/prox_conf_$OCF_RESOURCE_INSTANCE
> #echo OCF_RESKEY_config_dir=$OCF_RESKEY_config_dir >> /tmp/prox_conf_$OCF_RESOURCE_INSTANCE
> #echo OCF_RESKEY_parameters=$OCF_RESKEY_parameters >> /tmp/prox_conf_$OCF_RESOURCE_INSTANCE
> 
> 
> # build postfix options string *outside* to access from each method
> OPTIONS=''
> OPTION_CONFIG_DIR=''
> 
> # check if the Postfix config_dir exists
> if [ "x$config_dir" != "x" ]; then
>     # save OPTION_CONFIG_DIR separately
>     OPTION_CONFIG_DIR="-c $config_dir"
>     OPTIONS=$OPTION_CONFIG_DIR
> fi
> 
> if [ "x$parameters" != "x" ]; then
>     OPTIONS="$OPTIONS $parameters"
> fi
> 
> case $1 in
>     meta-data)  meta_data
>                 exit $OCF_SUCCESS
>                 ;;
> 
>     usage|help) usage
>                 exit $OCF_SUCCESS
>                 ;;
> esac
> 
> postfix_validate_all
> ret=$?
> 
> #echo "debug[$1:$ret]"
> LSB_STATUS_STOPPED=3
> if [ $ret -ne $OCF_SUCCESS ]; then
>     case $1 in
>     stop)       exit $OCF_SUCCESS ;;
>     monitor)    exit $OCF_NOT_RUNNING;;
>     status)     exit $LSB_STATUS_STOPPED;;
>     *)          exit $ret;;
>     esac
> fi
> 
> case $1 in
>     monitor)    postfix_monitor
>                 exit $?
>                 ;;
>     start)      postfix_start
>                 exit $?
>                 ;;
> 
>     stop)       postfix_stop
>                 exit $?
>                 ;;
> 
>     reload)     postfix_reload
>                 exit $?
>                 ;;
> 
>     status)     if postfix_status; then
>                     ocf_log info "Postfix is running."
>                     exit $OCF_SUCCESS
>                 else
>                     ocf_log info "Postfix is stopped."
>                     exit $OCF_NOT_RUNNING
>                 fi
>                 ;;
> 
>     validate-all)   exit $OCF_SUCCESS
>                     ;;
> 
>     *)          usage
>                 exit $OCF_ERR_UNIMPLEMENTED
>                 ;;
> esac

> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
