Hi,
Problem summary: when a process monitored by monit dies unexpectedly,
monit is unable to restart it (when monit is running in daemon mode).
However, it starts the process without problems using "monit start".
Problem description:
I am using a monit 3.2 binary on Solaris. I have not compiled and
installed monit myself due to some system configuration issues, and I am
not aware whether monit needs anything else to run properly. When I run
the monit binary I have, it seems to run fine.
I want to use monit to monitor processes related to my project.
Since none of my programs creates a pid file, I use a wrapper script that
first writes its own pid to a pid file and then starts the program from
the same shell with exec.
Script (monitor.sh):
#!/usr/bin/bash
LD_LIBRARY_PATH=:/usr/lib:/usr/lib/help/lib:/usr/share/lib:/usr/ccs/lib:/usr/local/lib:/lib:/opt/SUNWspro/lib:/opt/sfw/lib:/asn1/v583/./asn1c-v583/cpp/libgpp3/:/opt/ans1v583/asn1c-v583/cpp/libgpp3:/usr/lib/AdobeReader/Reader/sparcsolaris/lib/:/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/client:/usr/jdk/instances/jdk1.5.0/jre/lib/sparc:/local/f880-5disk2/home/globallogic/AMS/Triton_state_SC/PSD_Manager/:/local/f880-5disk2/home/globallogic/AMS/Triton_County_Carthret/PSD_Manager/
export LD_LIBRARY_PATH
PATH=$PATH:.
export PATH
if [ $# -lt 3 ]
then
    echo "usage: $0 [program] [start/stop] [pid file name]"
else
    echo "executing: $1 $2 $3"
    case $2 in
    start)
        echo in start > a.txt
        rm "$3" 2 >& /dev/null
        echo after removing >>a.txt
        echo $$ > "$3"
        echo after pid dump, before exec >>a.txt
        exec $1 > $1.log 2>&1
        echo after exec > a.txt ;;
    stop)
        PID=`cat $3`
        if [ $PID != "" ]
        then
            kill $PID
        else
            echo "PID for $3 not found" >> monitlog
        fi
        rm "$3" 2 >& /dev/null;;
    *)
        echo "usage: $0 [program] [start/stop] [pid file]" ;;
    esac
fi
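For comparison, here is a minimal sketch of a wrapper that does not depend on the current working directory. The script above uses relative names for the program, its log, a.txt and the pid file, so it only works when started from the PSD_Manager directory. The sketch assumes everything lives in one directory. BASE_DIR, abs_path, start_program and stop_program are illustrative names, not part of the script above. It also uses rm -f in place of the `2 >& /dev/null` redirection, which as written passes `2` to rm as a file name.

```shell
#!/usr/bin/bash
# Sketch only: a wrapper variant that resolves every file name to an
# absolute path, so it behaves the same from any working directory.

# Assumption: program, pid file and log all live in BASE_DIR, which
# defaults to the directory containing this script.
BASE_DIR=${BASE_DIR:-$(cd "$(dirname "$0")" && pwd)}

# Hypothetical helper: leave absolute paths alone, anchor others in BASE_DIR.
abs_path() {
    case $1 in
        /*) echo "$1" ;;
        *)  echo "$BASE_DIR/$1" ;;
    esac
}

# Write this shell's pid, then exec the program so it keeps that pid.
start_program() {
    prog=$(abs_path "$1")
    pidfile=$(abs_path "$2")
    rm -f "$pidfile"
    echo $$ > "$pidfile"
    exec "$prog" > "$prog.log" 2>&1
}

# Kill the pid recorded in the pid file, if any, then remove the file.
stop_program() {
    pidfile=$(abs_path "$1")
    if [ -s "$pidfile" ]; then
        kill "$(cat "$pidfile")"
    else
        echo "PID for $pidfile not found" >&2
    fi
    rm -f "$pidfile"
}

# Same argument order as monitor.sh: program, action, pid file.
if [ $# -ge 3 ]; then
    case $2 in
        start) start_program "$1" "$3" ;;
        stop)  stop_program "$3" ;;
        *)     echo "usage: $0 [program] [start/stop] [pid file]" >&2 ;;
    esac
fi
```

With this variant, a relative pid file argument such as WPG.pid resolves to the same file regardless of where the caller was started, provided that file is in BASE_DIR.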
My monitrc file is:
set logfile "/local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/monitlog"
set daemon 120
set mailserver integer.synapse.com
check WPG with pidfile /local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/WPG.pid
    start program = "/local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/monitor.sh Int2_WPG_ams0.5.16 start WPG.pid"
    stop program = "/local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/monitor.sh Int2_WPG_ams0.5.16 stop WPG.pid"
    alert [EMAIL PROTECTED]
When I run monit from the command line like this:
$ ./monit start WPG
it starts the process. Also,
$ ./monit stop WPG
stops the process.
Also,
$ ./monit start
works fine, i.e., it starts the process listed in monitrc.
But when I run monit in daemon mode like this:
$ ./monit
and the process monit is watching is then killed, monit is unable to
restart it.
The error message I can see in monitlog file is:
[GMT Mar 7 05:19:17] start: (WPG) /local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/monitor.sh
[GMT Mar 7 05:19:27] monit: Warning process 'WPG' was not started
And the process is indeed not started.
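One way to narrow this down is to run the start command by hand under conditions closer to a daemon's. The sketch below assumes that the daemon runs from / with a reduced environment, which is common for daemons but not confirmed here for monit 3.2. The helper name run_like_daemon and the PATH value are illustrative.

```shell
# Hypothetical helper: run a command roughly the way a daemon might,
# from / and with an environment reduced to a minimal PATH.
# Assumption: the daemon's working directory and environment differ from
# those of the interactive shell; this is not confirmed for monit 3.2.
run_like_daemon() {
    ( cd / && env -i PATH=/bin:/usr/bin "$@" )
}

# Example (the command is the start program from monitrc):
# run_like_daemon /local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/monitor.sh Int2_WPG_ams0.5.16 start WPG.pid
```

If the wrapper fails when run this way but works from the PSD_Manager directory, the relative file names in the script would be the first thing to check.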
I keep getting emails like this:
Program WPG restarted
Date: Fri Mar 7 05:19:06 2008
Host: unknown
Your faithful employee,
monit
Reason: Process is not running.
But the process is not started.
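The check that the process is not running can be scripted against the pid file. This is a minimal sketch; pid_is_running is an illustrative name, and it assumes the pid file holds a single pid as written by the wrapper script.

```shell
# Hypothetical helper: succeed only if the pid file names a live process.
pid_is_running() {
    pidfile=$1
    # A missing or empty pid file counts as "not running".
    [ -s "$pidfile" ] || return 1
    # Signal 0 sends nothing; it only tests that the pid exists and
    # that we are allowed to signal it.
    kill -0 "$(cat "$pidfile")" 2>/dev/null
}

# Example (pid file path taken from monitrc):
# pid_is_running /local/f880-5disk2/home/globallogic/AMS/Int_TestBed_2/PSD_Manager/WPG.pid && echo running
```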
Kindly help, this is very critical.
Thanks in anticipation,
Divakar.