Hi,

I suspect this may be the wrong list for this question. However, although
strictly it's a Bourne shell scripting query, it only seems to act up under
OpenBSD (for me).

Essentially I have a job which needs to be run periodically. So I have a shell
script to run the necessary commands, and this is scheduled via (root's) crontab.
It is however very important that multiple instances of the job are not run
concurrently (e.g. if a previous invocation hung), and so the script should
detect this upon invocation before proceeding.

I don't want a single long running job (which could e.g. sleep between loops) 
for various reasons. And I also don't like PID files and other fragile locking 
hacks.


So down to business, below is the gist of my script. Most of the time it
appears to run fine. However occasionally (once every couple of days?) it
reports via email that a duplicate process is detected, but the included ps
listing shows no other instance. I don't believe that this is just due to an
old instance exiting in the small time window between the pgrep and ps
invocations.  So basically I guess there is an error in my script or its
logic, or something else I'm not seeing.

Any hit with the clue bat gratefully received.



#!/bin/sh
#
#
SHOUT="/usr/bin/logger -i -t MYPERIODICJOB"
#
#
# Ensure another instance of this is not running
#
MYNAME=`basename $0`
MYPID=$$
#
/usr/bin/pgrep -fu root $MYNAME | /usr/bin/grep -v $MYPID && \
        {
                $SHOUT "HELP - duplicate process detected $?" ; \
                ps -axjwww | mail -s "HELP MYPERIODICJOB $MYPID $MYNAME $PPID" m...@example.com ; \
                exit 1 ;
         }

#
#
# start doing useful stuff here...
#
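
One hypothesis that could be tested here (an assumption, not something the ps
listing confirms): when the shell builds the pgrep | grep pipeline it forks a
child for each stage, and until a child execs its command it still carries the
script's own command line under a new PID. pgrep -f could then match that
child, and grep -v $MYPID would not filter it out. A minimal sketch of a check
that ignores the script itself and all of its descendants follows. The function
name other_instances and the ps invocation shown in the comments are
illustrative assumptions, not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper (name and interface are assumptions for illustration).
# Usage: other_instances SELF_PID NAME
# Reads "PID PPID COMMAND..." lines on stdin and prints the PID of every
# process whose command contains NAME and which is neither SELF_PID nor
# one of its descendants.
other_instances() {
    awk -v self="$1" -v name="$2" '
        $1 !~ /^[0-9]+$/ { next }      # skip a header line, if any
        {
            # Keep only the command part of the line for matching.
            cmd = $0
            sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "", cmd)
            parent[$1 + 0] = $2 + 0
            cmdline[$1 + 0] = cmd
        }
        END {
            for (pid in parent) {
                if (index(cmdline[pid], name) == 0) continue
                # Walk up the process tree; anything that reaches self
                # is this script or one of its own forked children.
                p = pid; mine = 0; hops = 0
                while (hops++ < 1000) {
                    if (p + 0 == self + 0) { mine = 1; break }
                    if (!(p in parent)) break
                    p = parent[p]
                }
                if (!mine) print pid
            }
        }'
}

# Possible use in the script (assumes ps accepts -o pid,ppid,command):
#   if ps -axww -o pid,ppid,command | other_instances "$$" "$MYNAME" | grep -q . ; then
#       ...report and exit 1...
#   fi
```

Filtering on ancestry rather than on the PID alone means a half-started pipeline
stage is treated as part of the current run. If the spurious reports continue
with a check like this, the fork window is probably not the cause.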


Disclaimer: I know my scripting is far from optimal...


/Pete
