Rhesa Rozendaal <[EMAIL PROTECTED]> writes:
> Michael Vang wrote:
> I will give this a try. I'm currently checking in every 2 days,
> precisely to be able to see if any processes hang.
> [...]
> Except that the client on linux doesn't cleanly deal with server
> outages, so until that is resolved, I don't expect my frustration to
> end ;)
I use scripts to check for this and more automatically.
Something like this (not tested):
-----------------8<--------------------------------8<---------------
#!/bin/sh
# Change to working directory
cd /tmp/mprime || exit 1
# Get PID of mprime
PID=`awk -F= '/^Pid=/{ print $2 }' local.ini`
# Check if mprime is running.
if [ -n "$PID" ] && [ "$PID" -gt 0 ] 2>/dev/null && kill -0 "$PID" 2>/dev/null; then
    # mprime is running
    # The existence of a nonempty prime.spl means that it is trying to
    # communicate and could be hanging.
    if [ -s prime.spl ]; then
        # Check how much CPU it has used.
        TIME1="`ps -p$PID -otime`"
        # Wait a minute and check again.
        sleep 60
        TIME2="`ps -p$PID -otime`"
        if [ "$TIME1" = "$TIME2" ]; then
            # The process didn't use any CPU for one minute. SIGHUP
            # will normally kill a hanging client. Try SIGKILL if not.
            kill -HUP "$PID"
            sleep 1 && kill -0 "$PID" 2>/dev/null && kill -KILL "$PID"
            # Restart mprime. You need it in $PATH.
            mprime -b
            echo Restarted hanging mprime.
        fi
    fi
else
    # mprime was not running. Starting it.
    mprime -b
    echo Started mprime.
fi
-----------------8<--------------------------------8<---------------
(My real script is 1311 lines long and checks for a lot more, handles
backup/restore and provides me with much better statistics than
Primenet does. Some of this script is better done by checking files
in /proc, if you plan to only run mprime on Linux.)
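A minimal sketch of the /proc idea, assuming Linux (also not tested
against mprime itself). Instead of comparing the output of ps, it reads
the CPU time counters straight from /proc/PID/stat. The function name
cpu_ticks is made up for illustration; the field positions follow the
proc(5) man page (utime is field 14, stime is field 15).

```shell
#!/bin/sh
# cpu_ticks PID: print user+system CPU time of PID in clock ticks.
# Returns nonzero if the process does not exist.
cpu_ticks() {
    stat=`cat "/proc/$1/stat" 2>/dev/null` || return 1
    # Drop "pid (comm) " up to the last ") " so that spaces in the
    # command name cannot shift the field positions.
    rest=${stat##*') '}
    # $rest now starts at field 3 (state), so utime (field 14) is
    # ${12} and stime (field 15) is ${13}.
    set -- $rest
    echo $(( ${12} + ${13} ))
}

# Possible use in place of the two ps calls:
#   T1=`cpu_ticks $PID`; sleep 60; T2=`cpu_ticks $PID`
#   [ "$T1" = "$T2" ] && echo "no CPU used for one minute"
```

Tick counts are finer grained than the one-second resolution of ps, so
an idle minute is less likely to be confused with a very slow one.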
Run it from cron on each machine every hour or so, or start it via ssh
or rsh from a machine where you can log in to the clients with no
password.
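For cron, an hourly entry could look like
"0 * * * * /usr/local/bin/check-mprime" in the crontab of the user that
owns mprime. For the ssh route, here is a rough sketch of a loop run
from the central machine. The path /usr/local/bin/check-mprime, the
host names and the function name check_all are all placeholders, and it
assumes key-based logins are already set up.

```shell
#!/bin/sh
# Hypothetical location of the check script on each client.
CHECK=/usr/local/bin/check-mprime
# Set SSH="echo ssh" to preview the commands without connecting.
SSH=${SSH:-ssh}

# check_all HOST...: run the check script on every host given.
check_all() {
    for host in "$@"; do
        # BatchMode makes ssh fail instead of prompting for a
        # password; ConnectTimeout keeps a dead host from stalling
        # the whole run.
        $SSH -o BatchMode=yes -o ConnectTimeout=10 "$host" "$CHECK" \
            || echo "check failed on $host" >&2
    done
}

# Example call (placeholder host names):
#   check_all client1 client2 client3
```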
--
Sturle We know that dictators are quick to choose aggression,
~~~~~~ while free nations strive to resolve differences in peace
-- George W. Bush
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime