For the Assimilation code I use the full pathname of the binary from /proc to tell if a process is "one of mine". That's not perfect for interpreted languages, where the binary is the interpreter rather than the script, but it works quite well for compiled ones.
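A minimal sketch of that kind of check, assuming the pathname is read from the /proc/<pid>/exe symlink (one way to get it on Linux; other /proc entries would also do). The function name is_my_process and the path in the usage comment are made up for illustration.

```shell
# is_my_process PID EXPECTED_EXE   (hypothetical helper, Linux only)
# Succeeds only if PID exists and its executable is EXPECTED_EXE.
is_my_process() {
    pid=$1
    expected=$2
    # /proc/PID/exe is a symlink to the running binary; readlink fails
    # if the process is gone or we lack permission to inspect it.
    exe=$(readlink "/proc/$pid/exe" 2>/dev/null) || return 1
    [ "$exe" = "$expected" ]
}

# Usage (illustrative path):
#   is_my_process "$pid" /usr/sbin/mydaemon && kill -TERM "$pid"
```

If the binary has been replaced on disk since the process started, the link target carries a " (deleted)" suffix and the comparison fails, which may or may not be what you want.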
On 10/20/2014 01:17 PM, Lars Ellenberg wrote:
> Recent discussions with Dejan made me again more prominently aware of a
> few issues we probably all know about, but usually dismiss as having not
> much relevance in the real-world.
>
> The facts:
>
>  * a pidfile typically only stores a pid
>  * a pidfile may be "stale", not properly cleaned up
>    when the pid it references died.
>  * pids are recycled
>
>    This is more an issue if kernel.pid_max is small
>    wrt the number of processes created per unit time,
>    for example on some embedded systems,
>    or on some very busy systems.
>
>    But it may be an issue on any system,
>    even a mostly idle one, given "bad luck^W timing",
>    see below.
>
> A common idiom in resource agents is to
>
> kill_that_pid_and_wait_until_dead()
> {
>     local pid=$1
>     is_alive $pid || return 0
>     kill -TERM $pid
>     while is_alive $pid ; do sleep 1; done
>     return 0
> }
>
> The naïve implementation of is_alive() is
> is_alive() { kill -0 $1 ; }
>
> This is the main issue:
> -----------------------
>
> If the last-used-pid is just a bit smaller than $pid,
> during the sleep 1, $pid may die,
> and the OS may already have created a new process with that exact pid.
>
> Using above "is_alive", kill_that_pid() will not notice that the
> to-be-killed pid has actually terminated while that new process runs.
> Which may be a very long time if that is some other long running daemon.
>
> This may result in stop failure and resulting node level fencing.
>
> The question is, which better way do we have to detect if some pid died
> after we killed it. Or, related, and even better: how to detect if the
> process currently running with some pid is in fact still the process
> referenced by the pidfile.
>
> I have two suggestions.
>
> (I am trying to avoid bashisms in here.
>  But maybe I overlook some.
>  Also, the code is typed, not sourced from some working script,
>  so there may be logic bugs and typos.
>  My intent should be obvious enough, though.)
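To put a rough number on "just a bit smaller": a small sketch, assuming Linux, that the fifth field of /proc/loadavg is the most recently created pid, and that pids are handed out in increasing order up to kernel.pid_max before wrapping. pid_headroom is a made-up helper name. The result is only an upper estimate, since it ignores pids that are still in use and the low range the kernel skips after wrapping.

```shell
# pid_headroom PID   (hypothetical helper, Linux only)
# Prints roughly how many process creations remain before the kernel
# could hand out PID again.
pid_headroom() {
    target=$1
    # fifth field of /proc/loadavg: pid of the most recently created process
    read -r _ _ _ _ last_pid < /proc/loadavg || return 1
    read -r pid_max < /proc/sys/kernel/pid_max || return 1
    if [ "$target" -gt "$last_pid" ]; then
        echo $(( target - last_pid ))
    else
        # the allocator has to wrap around pid_max first
        echo $(( pid_max - last_pid + target ))
    fi
}

# Usage: pid_headroom "$pid"
```

A small value here means a single fork during the "sleep 1" may be enough to recycle the pid.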
>
> using "cd /proc/$pid; stat ."
> -----------------------------
>
> # this is most likely linux specific
> kill_that_pid_and_wait_until_dead()
> {
>     local pid=$1
>     (
>         # exit (not return): we only want to leave the subshell
>         cd /proc/$pid || exit 0
>         kill -TERM $pid
>         while stat . ; do sleep 1; done
>     )
>     return 0
> }
>
> Once pid dies, /proc/$pid will become stale (but not completely go away,
> because it is our cwd), and stat . will return "No such process".
>
> Variants:
>
> using test -ef
> --------------
>
>     exec 7</proc/$pid || return 0
>     kill -TERM $pid
>     while :; do
>         exec 8</proc/$pid || break
>         test /proc/self/fd/7 -ef /proc/self/fd/8 || break
>         sleep 1
>     done
>     exec 7<&- 8<&-
>
> using stat -c %Y /proc/$pid
> ---------------------------
>
>     ctime0=$(stat -c %Y /proc/$pid)
>     kill -TERM $pid
>     while ctime=$(stat -c %Y /proc/$pid) && [ $ctime = $ctime0 ] ; do sleep 1; done
>
>
> Why not use the inode number I hear you say.
> Because it is not stable. Sorry.
> Don't believe me? Don't want to read kernel source?
> Try it yourself:
>
>     sleep 120 & k=$!
>     stat /proc/$k
>     echo 3 > /proc/sys/vm/drop_caches
>     stat /proc/$k
>
> But that leads me to another proposal:
> store the starttime together with the pid in a pidfile.
>
> For linux that would be:
>
> (see proc(5) for /proc/pid/stat field meanings.
>  note that (comm) may contain both whitespace and ")",
>  which is the reason for my sed | cut below)
>
> spawn_create_exclusive_pid_starttime()
> {
>     local pidfile=$1
>     shift
>     local reset
>     case $- in *C*) reset=":";; *) set -C; reset="set +C";; esac
>     if ! exec 3>$pidfile ; then
>         $reset
>         return 1
>     fi
>
>     $reset
>     # double quotes inside the script, because the script itself
>     # is passed to sh -c in single quotes
>     setsid sh -c '
>         read pid _ < /proc/self/stat
>         starttime=$(sed -e "s/^.*) //" /proc/$pid/stat | cut -d" " -f 20)
>         >&3 echo $pid $starttime
>         3>&- exec "$@"
>     ' -- "$@" &
>     return 0
> }
>
> It does not seem possible to cycle through all available pids
> within fractions of time smaller than the granularity of starttime,
> so "pid starttime" should be a unique tuple (until the next reboot --
> at least on linux, starttime is measured as strictly monotonic "uptime").
>
>
> If we have "pid starttime" in the pidfile,
> we can:
>
> get_proc_pid_starttime()
> {
>     proc_pid_starttime=$(sed -e 's/^.*) //' /proc/$pid/stat) || return 1
>     proc_pid_starttime=$(echo "$proc_pid_starttime" | cut -d' ' -f 20)
> }
>
> kill_using_pidfile()
> {
>     local pidfile=$1
>     local pid starttime proc_pid_starttime
>
>     test -e $pidfile                || return # already dead
>     read pid starttime <$pidfile    || return # unreadable
>
>     # check pid and starttime are both present, numeric only, ...
>     # I have a version that distinguishes 16 distinct error
>     # conditions; this is the short version only...
>
>     local i=0
>     while
>         get_proc_pid_starttime &&
>         [ "$starttime" = "$proc_pid_starttime" ]
>     do
>         : $(( i+=1 ))
>         [ $i = 1 ] && kill -TERM $pid
>         # MAYBE # [ $i = 30 ] && kill -KILL $pid
>         sleep 1
>     done
>
>     # it's not (anymore) the process we were looking for
>     # remove that pidfile.
>
>     rm -f "$pidfile"
> }
>
> In other OSes, ps may be able to give a good enough equivalent?
>
> Any comments?
>
> Thanks,
>     Lars
>
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
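
On the ps question: a sketch of what a rough equivalent might look like where /proc/<pid>/stat is not available. It assumes the local ps accepts the non-POSIX lstart output keyword (procps on Linux and the BSD ps do; others may not). proc_start_stamp and same_process are made-up names, and I have not checked this across platforms.

```shell
# proc_start_stamp PID   (hypothetical helper)
# Prints the start timestamp of PID as reported by ps.
proc_start_stamp() {
    # LC_ALL=C keeps the date format independent of the caller's locale
    stamp=$(LC_ALL=C ps -o lstart= -p "$1" 2>/dev/null) || return 1
    [ -n "$stamp" ] || return 1
    printf '%s\n' "$stamp"
}

# same_process PID STAMP   (hypothetical helper)
# Succeeds only if PID exists and still reports the start stamp STAMP.
same_process() {
    current=$(proc_start_stamp "$1") || return 1
    [ "$current" = "$2" ]
}

# Usage: store "$pid" together with "$(proc_start_stamp "$pid")" at spawn
# time, then call same_process "$pid" "$stored_stamp" before each kill.
```

The stamp has one-second granularity and is wall-clock based, so it is a weaker identity than the /proc starttime in the quoted proposal.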