Olivier,
I don't know how you have this set up, but I'm making a little monitor
script that detects when there is no activity for a period of time.
Right now it sends email when this happens, but I will probably add code
soon to make it just kill and restart the server (I actually have
scripts to do each.)
Right now I want to see this script do the right thing.
The secret is the data file - it only changes at the end of a scheduling
round. So if you have the right "sleep" delay and it doesn't change -
then something is probably wrong.
The delay must be longer than a full round. The server also imposes a
delay of about 1 minute between rounds, I think, so I would add 2 or 3
minutes on top. If the 19x19 time control is 30 minutes per player, a
game can run 60 minutes, so you should have a delay of about 65 minutes
to be safe.
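The delay arithmetic above can be sketched as a small helper. This is only
an illustration: the function name monitor_delay and the default 5 minute
margin are assumptions, chosen so that a 30 minute time control gives the
65 minutes mentioned above. It also assumes a round can last up to twice
the per-player time control.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): compute the
# monitor's sleep delay in seconds from the per-player time control.
# Assumption: a round lasts at most twice the per-player time control.
monitor_delay() {
    local time_control_min=$1   # per-player time control, in minutes
    local margin_min=${2:-5}    # assumed slack for the pause between rounds
    echo $(( (2 * time_control_min + margin_min) * 60 ))
}

monitor_delay 30    # 30 minutes per player: prints 3900 (65 minutes)
```

With the same formula, a 5 minute time control would give 900 seconds,
which matches the "sleep 900" used in the script.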
In this way, you won't have to constantly baby-sit the server. We hope
to fix this bug eventually, but this might get us by a little while.
Of course if this script goes down - then it won't work - but this
script should be reliable.
You will of course have to change a couple of things in this script:
the data file path, the sleep delay, and the mail address.
- Don
--------------------[ snip ]-----------------
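#!/bin/bash
# Watch the server's data file and send mail if it stops changing.
web_data_file=/home/cgosboar/9x9/wdata.txt

# Checksum of the data file as of the last check.
x=$(md5sum "$web_data_file" | awk '{ print $1 }')

while true ; do
    # Must be longer than one full scheduling round.
    sleep 900
    y=$(md5sum "$web_data_file" | awk '{ print $1 }')
    dt=$(date)
    if [ "$y" != "$x" ] ; then
        # The file changed, so a round finished since the last check.
        echo "$dt status ok"
        sleep 10
        x=$y
    else
        # No change for a whole delay period: report it by mail.
        echo "$dt Detecting NO activity on the 9x9 server" |
            mail -s "Server glitch" [EMAIL PROTECTED]
    fi
done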
--------------------[ snip ]-----------------
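For the kill-and-restart idea mentioned above, the check could be pulled
into functions so that the action is easy to swap. This is a rough sketch
only: checksum, has_activity and check_once are made-up names, and
restart_server.sh is a placeholder for whatever restart script you
actually use, not a real file from this thread.

```shell
#!/bin/bash
# Sketch under assumptions: all function names here are hypothetical.

# Print the MD5 checksum of a file.
checksum() {
    md5sum "$1" | awk '{ print $1 }'
}

# Succeed (return 0) if the file's checksum differs from the previous one.
has_activity() {
    local file=$1 previous=$2
    [ "$(checksum "$file")" != "$previous" ]
}

# Run one check and print the action that would be taken.
check_once() {
    local file=$1 previous=$2
    if has_activity "$file" "$previous" ; then
        echo "ok"
    else
        echo "restart"
        # ./restart_server.sh   # placeholder; enable once the script exists
    fi
}
```

The monitor loop would then call check_once after each sleep and update
the stored checksum whenever the result is "ok".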
Olivier Teytaud wrote:
>> Unfortunately I do not manage the 19x19 server or I would kill and
>> restart.
>
> I kill and restart in a few minutes.
>
> _______________________________________________
> computer-go mailing list
> [email protected]
> http://www.computer-go.org/mailman/listinfo/computer-go/
>