Hi,

Why has this script, from my previous post in January, still not been committed to your development tree at
http://hg.linux-ha.org/agents/file/e13565f0ea8a/heartbeat

Or did I check in the wrong place?

Achim

Achim Stumpf wrote:
> Hi,
>
> Achim Stumpf wrote:
>> fix in validate all. See attachment...
>
> fixed again, validate is now invoked before start as I have seen in the
> mysql script. Hope that's right...
>
> I have moved
>
> # check that the Proftpd config file exists
> if [ ! -f "$OCF_RESKEY_conffile" ]; then
>     ocf_log err "Proftpd config file $OCF_RESKEY_conffile does not exist"
>     exit $OCF_ERR_CONFIGURED
> fi
>
> from start to validate. Hope OCF_ERR_CONFIGURED is right.
>
> Cheers,
>
> Achim
>
>>
>> Achim Stumpf wrote:
>>> Hi Dejan,
>>>
>>> After a long long time... You'll find the new script in the attachment.
>>>
>>> Dejan Muhamedagic wrote:
>>>> Hi Achim,
>>>>
>>>> On Thu, Nov 20, 2008 at 01:32:06PM +0100, Achim Stumpf wrote:
>>>>> Hi,
>>>>>
>>>>> I am still waiting for further advice to finish the commit
>>>>> process to your repository. ;o)
>>>>
>>>> I'd have a few suggestions as well:
>>>>
>>>> 1. You could drop both connect_timeout and max_time parameters.
>>>> It is advisable to have the upper layer decide when an operation
>>>> should time out (by setting the operation's "timeout" attribute).
>>>
>>> done
>>>
>>>>
>>>> 2. kill -0 "$1"
>>>>
>>>> returns success with zombie processes too.
>>>>
>>>
>>> I have checked the other scripts, some of them check /proc/pid. But
>>> with a zombie it's the same as with kill -0. What do you suggest?
>>>
>>>> 3. Poftpd: a typo in a message.
>>>>
>>>
>>> done
>>>
>>>> 4. In proftpd_start()
>>>>
>>>> # check that the proftpd binary exists
>>>> if [ ! -x "$OCF_RESKEY_binary" ]; then
>>>>     ocf_log err "Proftpd binary $OCF_RESKEY_binary does not exist"
>>>>     exit $OCF_ERR_CONFIGURED
>>>>
>>>> This belongs to validate. Should also be OCF_ERR_INSTALLED.
>>>>
>>>
>>> moved to validate
>>>
>>>> 5. In proftpd_stop()
>>>> ...
>>>> TRIES=0
>>>> while isRunning "$PID" && [ $TRIES -lt 5 ]
>>>> do
>>>>     sleep 1
>>>>     kill $PID > /dev/null 2>&1
>>>>     ocf_log info "Killing Proftpd PID $PID"
>>>>     TRIES=`expr $TRIES + 1`
>>>> done
>>>>
>>>> Move the kill command in front of the loop: it's enough to send a
>>>> signal once. The same for the latter 'kill -9'. You can also try
>>>> other signals, such as HUP (see, perhaps, the proftpd docs).
>>>> Also, could it be that 5 seconds is a bit too short to let the
>>>> ftpd exit?
>>>>
>>>> As in 1. above, it's better to leave the timeout handling to
>>>> lrmd. You can just do an infinite loop here:
>>>>
>>>> while isRunning "$PID"; do
>>>>     sleep 1
>>>>     ...
>>>> done
>>>>
>>>
>>> Before it wasn't only 5 seconds. It was five seconds for kill, and
>>> then you got another loop with kill -9. For kill it is 30 seconds
>>> now, followed by an infinite loop for kill -9. The kill and kill -9
>>> are in front of their loops, and only sleep and logging are left
>>> inside the loops.
>>>
>>>> 6. In proftpd_monitor()
>>>>
>>>> ocf_log err "Proftpd monitor on PID $PID failed"
>>>>
>>>> It may or may not be an error if proftpd is not running. Better
>>>> to keep silent.
>>>>
>>>
>>> changed to debug
>>>
>>>> ocf_log err "$OCF_RESKEY_curl_binary does not exist"
>>>> return $OCF_ERR_CONFIGURED
>>>>
>>>> This belongs to the validate action. The error should be
>>>> OCF_ERR_INSTALLED.
>>>>
>>>> See the validate action in other OCF agents. Note that a failure
>>>> to validate the environment and parameters shouldn't influence
>>>> the meta-data operation.
>>>>
>>>
>>> I have checked other scripts. Most of them just return OCF_SUCCESS.
>>> What do I have to check in validate-all?
>>> I have just made a test with a wrong path to the curl binary.
>>> Heartbeat does not run validate-all, does it? In the logs I see only
>>> that monitoring fails. Who is validate-all for???
>>>
>>>> And... very sorry for taking this long.
>>>>
>>>
>>> Never mind...
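To make the validate-all question above concrete, here is a minimal sketch of how the checks discussed in this thread could be arranged. It is not the attached script: the function names proftpd_validate_all and proftpd_main are made up for illustration, the ocf_log stub and the numeric return codes stand in for what the OCF shell function library normally provides, and the meta-data output is only a placeholder. The parameter names (OCF_RESKEY_binary, OCF_RESKEY_curl_binary, OCF_RESKEY_conffile) are the ones quoted above.

```shell
#!/bin/sh
# Sketch only. In a real agent these values and ocf_log come from the
# OCF shell function library; they are defined here so the example is
# self-contained.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_UNIMPLEMENTED:=3}"
: "${OCF_ERR_INSTALLED:=5}"
: "${OCF_ERR_CONFIGURED:=6}"
ocf_log() { echo "$@" >&2; }   # stand-in logger (assumption)

# Hypothetical validate function: missing executables are reported as
# OCF_ERR_INSTALLED, a missing config file as OCF_ERR_CONFIGURED.
proftpd_validate_all() {
    if [ ! -x "$OCF_RESKEY_binary" ]; then
        ocf_log err "Proftpd binary $OCF_RESKEY_binary does not exist"
        return $OCF_ERR_INSTALLED
    fi
    if [ ! -x "$OCF_RESKEY_curl_binary" ]; then
        ocf_log err "$OCF_RESKEY_curl_binary does not exist"
        return $OCF_ERR_INSTALLED
    fi
    if [ ! -f "$OCF_RESKEY_conffile" ]; then
        ocf_log err "Proftpd config file $OCF_RESKEY_conffile does not exist"
        return $OCF_ERR_CONFIGURED
    fi
    return $OCF_SUCCESS
}

# Hypothetical dispatcher: meta-data is answered before validation, so a
# broken environment cannot influence it. Every other action validates first.
proftpd_main() {
    case "$1" in
        meta-data)
            echo "<!-- meta-data XML would be printed here -->"
            return $OCF_SUCCESS
            ;;
    esac

    proftpd_validate_all || return $?

    case "$1" in
        validate-all) return $OCF_SUCCESS ;;
        # start, stop and monitor would be dispatched here.
        *) return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}
```

In a script laid out like this, the last lines would be something like `proftpd_main "$1"` followed by `exit $?`, so start, stop and monitor all go through the same checks.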
>>>
>>> Cheers
>>>
>>> Achim
>>>
>>>> Cheers,
>>>>
>>>> Dejan
>>>>
>>>>>
>>>>> Achim
>>>>>
>>>>> Achim Stumpf wrote:
>>>>>> Hi,
>>>>>>
>>>>>> What about the proftpd script? Maybe we could finish that one...
>>>>>> ;o)
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Achim
>>>>>>
>>>>>> Achim Stumpf wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Lars Marowsky-Bree wrote:
>>>>>>>> "start_delay" should not be necessary, unless your start
>>>>>>>> operation is broken.
>>>>>>>>
>>>>>>> I have removed the start_delay.
>>>>>>>
>>>>>>>> You also need to do a check_binary on the proftpd binary
>>>>>>>> somewhere prior to calling this command.
>>>>>>>>
>>>>>>> All config files and binaries get checked now.
>>>>>>>
>>>>>>>>> proftpd_stop()
>>>>>>>>> {
>>>>>>>>>     if proftpd_status ; then
>>>>>>>>>         PID=`head -n 1 $OCF_RESKEY_pidfile`
>>>>>>>>>         if [ ! -z $PID ] ; then
>>>>>>>>>             kill $PID
>>>>>>>>>             if [ $? -ne 0 ]; then
>>>>>>>>>                 ocf_log err "Proftpd couldn't be stopped"
>>>>>>>>>                 return $OCF_ERR_GENERIC
>>>>>>>>>             fi
>>>>>>>>>         fi
>>>>>>>>>     fi
>>>>>>>> You need to check afterwards that the process is really gone.
>>>>>>>> kill is not sufficient; it might send the signal, but if the
>>>>>>>> process is stuck on IO, it might not stop. I'd also recommend
>>>>>>>> kill -9.
>>>>>>>>
>>>>>>> I have adapted some code from the apache OCF script:
>>>>>>>
>>>>>>> proftpd_stop()
>>>>>>> {
>>>>>>>     if proftpd_status ; then
>>>>>>>         PID=`head -n 1 $OCF_RESKEY_pidfile`
>>>>>>>         if [ ! -z "$PID" ]; then
>>>>>>>             kill $PID > /dev/null 2>&1
>>>>>>>             if [ $? -eq 0 ]; then
>>>>>>>                 TRIES=0
>>>>>>>                 while isRunning "$PID" && [ $TRIES -lt 5 ]
>>>>>>>                 do
>>>>>>>                     sleep 1
>>>>>>>                     kill $PID > /dev/null 2>&1
>>>>>>>                     ocf_log info "Killing Proftpd PID $PID"
>>>>>>>                     TRIES=`expr $TRIES + 1`
>>>>>>>                 done
>>>>>>>                 TRIES=0
>>>>>>>                 while isRunning "$PID" && [ $TRIES -lt 3 ]
>>>>>>>                 do
>>>>>>>                     sleep 1
>>>>>>>                     kill -9 $PID > /dev/null 2>&1
>>>>>>>                     ocf_log info "Killing Proftpd PID $PID with SIGKILL"
>>>>>>>                     TRIES=`expr $TRIES + 1`
>>>>>>>                 done
>>>>>>>                 if isRunning "$PID" ; then
>>>>>>>                     ocf_log err "Killing Proftpd PID $PID FAILED"
>>>>>>>                     exit $OCF_ERR_GENERIC
>>>>>>>                 fi
>>>>>>>             else
>>>>>>>                 ocf_log err "Killing Proftpd PID $PID FAILED"
>>>>>>>                 exit $OCF_ERR_GENERIC
>>>>>>>             fi
>>>>>>>         fi
>>>>>>>     fi
>>>>>>>     exit $OCF_SUCCESS
>>>>>>> }
>>>>>>>
>>>>>>> 1. kill -15
>>>>>>> 2. if it is still running, try max. 5 times to kill again with kill -15
>>>>>>> 3. after that, if it is still running, try 3 times with kill -9
>>>>>>>
>>>>>>> Hope that it meets your needs ;o)
>>>>>>>
>>>>>>> And actually I got the idea to also send, in the while loop with
>>>>>>> kill -9, a kill -9 signal to all the child processes as returned by:
>>>>>>>
>>>>>>> ps --no-headers -o pid --ppid $PID
>>>>>>>
>>>>>>> If you like the above code and the idea with kill -9 to the child
>>>>>>> processes, I will implement that too.
>>>>>>>
>>>>>>>>> proftpd_monitor()
>>>>>>>>> {
>>>>>>>>>     if [ $OCF_CHECK_LEVEL -eq 0 ]; then
>>>>>>>>>         if proftpd_status ; then
>>>>>>>>>             ocf_log debug "Proftpd monitor succeeded"
>>>>>>>>>             return $OCF_SUCCESS
>>>>>>>>>         fi
>>>>>>>>>     else
>>>>>>>>>         ${OCF_RESKEY_curl_binary} -sS \
>>>>>>>>>             --connect-timeout ${OCF_RESKEY_connect_timeout} \
>>>>>>>>>             --max-time ${OCF_RESKEY_max_time} \
>>>>>>>>>             -u "${OCF_RESKEY_test_user}:${OCF_RESKEY_test_pass}" \
>>>>>>>>>             ftp://localhost/ > /dev/null 2>&1
>>>>>>>> The curl_binary also needs to be "check_binary"ed somewhere.
>>>>>>>>
>>>>>>> done
>>>>>>>
>>>>>>>> And you might be managing several proftpd instances, none of
>>>>>>>> which might bind to "localhost"; I think the URL needs to be
>>>>>>>> configurable.
>>>>>>> done
>>>>>>>
>>>>>>> Thanks for your advice. Let's see what is wrong now... ;o)
>>>>>>>
>>>>>>> Achim
>>>>>>>
>>>>> _______________________________________________________
>>>>> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>>>> Home Page: http://linux-ha.org/
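
For reference, the stop sequence discussed in this thread (send the signal once, wait, then SIGKILL once and leave the final timeout to lrmd) could look roughly like the sketch below. It also shows one possible answer to the zombie question: kill -0 succeeds for a zombie, but on Linux the State line in /proc/<pid>/status tells the two apart. The names is_running and proftpd_stop_pid are made up for this example, the ocf_log stub and OCF_SUCCESS default stand in for the OCF shell function library, and the /proc check is a Linux-only assumption that falls back to plain kill -0 elsewhere. The 30 second grace period is the value mentioned above.

```shell
#!/bin/sh
# Sketch only; ocf_log and OCF_SUCCESS normally come from the OCF shell
# function library and are stubbed here to keep the example self-contained.
: "${OCF_SUCCESS:=0}"
ocf_log() { echo "$@" >&2; }   # stand-in logger (assumption)

# Hypothetical helper: true only if the PID exists and is not a zombie.
# kill -0 alone also succeeds for zombies, so on Linux the process state
# is read from /proc/<pid>/status as well.
is_running() {
    kill -0 "$1" 2>/dev/null || return 1
    if [ -r "/proc/$1/status" ]; then
        # A defunct process shows a line like "State:  Z (zombie)".
        if grep -q '^State:[[:space:]]*Z' "/proc/$1/status" 2>/dev/null; then
            return 1
        fi
    fi
    return 0
}

# Hypothetical stop helper: $1 is the PID, $2 the grace period in seconds.
proftpd_stop_pid() {
    pid=$1
    grace=${2:-30}

    is_running "$pid" || return $OCF_SUCCESS

    # Send SIGTERM once, then only sleep and check inside the loop.
    kill "$pid" > /dev/null 2>&1
    tries=0
    while is_running "$pid" && [ "$tries" -lt "$grace" ]; do
        sleep 1
        tries=$((tries + 1))
    done

    if is_running "$pid"; then
        ocf_log info "Proftpd PID $pid still running after ${grace}s, sending SIGKILL"
        # Send SIGKILL once. There is no local limit on the wait: the
        # operation timeout enforced by the upper layer ends a stuck stop.
        kill -9 "$pid" > /dev/null 2>&1
        while is_running "$pid"; do
            sleep 1
        done
    fi
    return $OCF_SUCCESS
}
```

A stop action built on this would read the PID from the pidfile and call `proftpd_stop_pid "$PID"`. Whether the child processes also need a signal, as suggested earlier in the thread, is left open here.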