Hi Alan,
On Mon, Oct 20, 2014 at 02:52:13PM -0600, Alan Robertson wrote:
> For the Assimilation code I use the full pathname of the binary from
> /proc to tell if it's "one of mine". That's not perfect if you're using
> an interpreted language. It works quite well for compiled languages.
Yes, though not perfect, that may be good enough. I suppose that
the probability that the very same program gets the same recycled
pid is rather low. (Or is it?)
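
For reference, a minimal sketch of that kind of check on Linux. This is
not the Assimilation code: the function name pid_runs_binary and the
path in the usage comment are made up for illustration, and it assumes
the caller is allowed to read /proc/PID/exe. As Alan says, for an
interpreted program the link points at the interpreter, so the check is
much weaker there.

```shell
# Hypothetical helper: succeed only if pid $1 exists and its
# executable is exactly the path given in $2 (Linux specific).
pid_runs_binary()
{
    local pid=$1 expected=$2 exe
    # /proc/PID/exe is a symlink to the binary the process is running
    exe=$(readlink "/proc/$pid/exe" 2>/dev/null) || return 1
    # if the binary was unlinked or replaced on disk, the kernel
    # appends " (deleted)" to the link target; strip that suffix
    exe=${exe% (deleted)}
    [ "$exe" = "$expected" ]
}

# usage (hypothetical path):
#   pid_runs_binary "$pid" /usr/sbin/mydaemon || echo "not one of mine"
```
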
Cheers,
Dejan
>
> On 10/20/2014 01:17 PM, Lars Ellenberg wrote:
> > Recent discussions with Dejan made me again more prominently aware of a
> > few issues we probably all know about, but usually dismiss as not having
> > much relevance in the real world.
> >
> > The facts:
> >
> > * a pidfile typically only stores a pid
> > * a pidfile may go "stale", i.e. not be properly cleaned up
> > when the process it references died.
> > * pids are recycled
> >
> > This is more an issue if kernel.pid_max is small
> > wrt the number of processes created per unit time,
> > for example on some embedded systems,
> > or on some very busy systems.
> >
> > But it may be an issue on any system,
> > even a mostly idle one, given "bad luck^W timing",
> > see below.
> >
> > A common idiom in resource agents is to
> >
> > kill_that_pid_and_wait_until_dead()
> > {
> >     local pid=$1
> >     is_alive $pid || return 0
> >     kill -TERM $pid
> >     while is_alive $pid ; do sleep 1; done
> >     return 0
> > }
> >
> > The naïve implementation of is_alive() is
> > is_alive() { kill -0 $1 ; }
> >
> > This is the main issue:
> > -----------------------
> >
> > If the last-used-pid is just a bit smaller than $pid,
> > during the sleep 1, $pid may die,
> > and the OS may already have created a new process with that exact pid.
> >
> > Using above "is_alive", kill_that_pid() will not notice that the
> > to-be-killed pid has actually terminated while that new process runs.
> > Which may be a very long time if that is some other long running daemon.
> >
> > This may result in stop failure and resulting node level fencing.
> >
> > The question is: what better way do we have to detect whether some pid died
> > after we killed it? Or, related, and even better: how do we detect whether the
> > process currently running with some pid is in fact still the process
> > referenced by the pidfile?
> >
> > I have two suggestions.
> >
> > (I am trying to avoid bashisms in here.
> > But maybe I overlook some.
> > Also, the code is typed, not sourced from some working script,
> > so there may be logic bugs and typos.
> > My intent should be obvious enough, though.)
> >
> > using "cd /proc/$pid; stat ."
> > -----------------------------
> >
> > # this is most likely linux specific
> > kill_that_pid_and_wait_until_dead()
> > {
> >     local pid=$1
> >     (
> >         cd /proc/$pid || return 0
> >         kill -TERM $pid
> >         # stat output is not interesting, only its exit status
> >         while stat . >/dev/null 2>&1 ; do sleep 1; done
> >     )
> >     return 0
> > }
> >
> > Once pid dies, /proc/$pid will become stale (but not completely go away,
> > because it is our cwd), and stat . will return "No such process".
> >
> > Variants:
> >
> > using test -ef
> > --------------
> >
> > exec 7</proc/$pid || return 0
> > kill -TERM $pid
> > while :; do
> >     exec 8</proc/$pid || break
> >     test /proc/self/fd/7 -ef /proc/self/fd/8 || break
> >     sleep 1
> > done
> > exec 7<&- 8<&-
> >
> > using stat -c %Y /proc/$pid
> > ---------------------------
> >
> > ctime0=$(stat -c %Y /proc/$pid)
> > kill -TERM $pid
> > while ctime=$(stat -c %Y /proc/$pid) && [ $ctime = $ctime0 ] ; do sleep 1; done
> >
> >
> > Why not use the inode number, I hear you say?
> > Because it is not stable. Sorry.
> > Don't believe me? Don't want to read kernel source?
> > Try it yourself:
> >
> > sleep 120 & k=$!
> > stat /proc/$k
> > echo 3 > /proc/sys/vm/drop_caches
> > stat /proc/$k
> >
> > But that leads me to another proposal:
> > store the starttime together with the pid in a pidfile.
> >
> > For linux that would be:
> >
> > (see proc(5) for /proc/pid/stat field meanings.
> > note that (comm) may contain both whitespace and ")",
> > which is the reason for my sed | cut below)
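
To illustrate why the (comm) field needs that treatment, here is a small
sketch that runs the same sed | cut over a made-up stat line instead of
a live /proc entry. The helper name stat_line_starttime and all values
in the sample line are invented for the example; only the field layout
follows proc(5).

```shell
# Hypothetical helper: print the starttime field of one line in
# /proc/PID/stat format, read from stdin.
stat_line_starttime()
{
    # drop everything up to the LAST ") ", i.e. pid and (comm);
    # starttime (field 22 in proc(5)) is then field 20
    sed -e 's/^.*) //' | cut -d' ' -f 20
}

# Made-up sample whose comm is "a) b (c". Counting fields from the
# start of the line with cut -d' ' -f 22 would return 1 here,
# which is num_threads, not starttime.
line='4242 (a) b (c) S 1 4242 4242 0 -1 4194304 100 0 0 0 1 2 0 0 20 0 1 0 987654 1000000 50'
echo "$line" | stat_line_starttime    # prints 987654
```
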
> >
> > spawn_create_exclusive_pid_starttime()
> > {
> >     local pidfile=$1
> >     shift
> >     local reset
> >     # with noclobber (set -C), ">" fails if the pidfile already exists
> >     case $- in *C*) reset=":";; *) set -C; reset="set +C";; esac
> >     if ! exec 3>"$pidfile" ; then
> >         $reset
> >         return 1
> >     fi
> >
> >     $reset
> >     # double quotes inside, so the single-quoted script is not cut short
> >     setsid sh -c '
> >         read pid _ < /proc/self/stat
> >         starttime=$(sed -e "s/^.*) //" /proc/$pid/stat | cut -d" " -f 20)
> >         >&3 echo $pid $starttime
> >         3>&- exec "$@"
> >     ' -- "$@" &
> >     return 0
> > }
> >
> > It does not seem possible to cycle through all available pids
> > within fractions of time smaller than the granularity of starttime,
> > so "pid starttime" should be a unique tuple (until the next reboot --
> > at least on linux, starttime is measured as strictly monotonic "uptime").
> >
> >
> > If we have "pid starttime" in the pidfile,
> > we can:
> >
> > get_proc_pid_starttime()
> > {
> >     proc_pid_starttime=$(sed -e 's/^.*) //' /proc/$pid/stat) || return 1
> >     proc_pid_starttime=$(echo "$proc_pid_starttime" | cut -d' ' -f 20)
> > }
> >
> > kill_using_pidfile()
> > {
> >     local pidfile=$1
> >     local pid starttime proc_pid_starttime
> >
> >     test -e "$pidfile" || return # already dead
> >     read pid starttime <"$pidfile" || return # unreadable
> >
> >     # check pid and starttime are both present, numeric only, ...
> >     # I have a version that distinguishes 16 distinct error
> >     # conditions; this is the short version only...
> >
> >     local i=0
> >     while
> >         get_proc_pid_starttime &&
> >         [ "$starttime" = "$proc_pid_starttime" ]
> >     do
> >         : $(( i+=1 ))
> >         [ $i = 1 ] && kill -TERM $pid
> >         # MAYBE # [ $i = 30 ] && kill -KILL $pid
> >         sleep 1
> >     done
> >
> >     # it's not (anymore) the process we were looking for
> >     # remove that pidfile.
> >
> >     rm -f "$pidfile"
> > }
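
The "present, numeric only" check that is only mentioned in the comment
above could look roughly like this. It is a sketch, not Lars's longer
version with its 16 error conditions: the helper names is_number and
read_pid_starttime are invented here, and it assumes the pidfile holds
exactly one line of the form "pid starttime".

```shell
# Hypothetical helper: succeed only if $1 is a non-empty string
# consisting of digits only.
is_number()
{
    case $1 in
    ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}

# Hypothetical helper: read "pid starttime" from the pidfile $1 into
# the variables pid and starttime. Fails if the file is unreadable,
# a field is missing or non-numeric, or extra fields are present.
read_pid_starttime()
{
    local rest
    pid= starttime=
    read pid starttime rest < "$1" || return 1
    [ -z "$rest" ] && is_number "$pid" && is_number "$starttime"
}
```

In kill_using_pidfile() this would replace the bare "read pid starttime"
line, so that a truncated or corrupted pidfile never leads to a kill of
some unrelated pid.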
> >
> > In other OSes, ps may be able to give a good enough equivalent?
> >
> > Any comments?
> >
> > Thanks,
> > Lars
> >
> > _______________________________________________________
> > Linux-HA-Dev: [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/