Currently, we terminate a daemon by trying "ovs-appctl exit", "SIGTERM" and finally "SIGKILL". But the logic fails if during "ovs-appctl exit", the daemon crashes (segfaults). The monitor will automatically restart the daemon with a new pid. The current logic of checking the non-existance of old pid succeeds and we proceed with the assumption that the daemon is dead.
This is a problem during OVS upgrades as we will continue to run the older version of OVS. With this commit, we take care of this situation. If there is a segfault, the pidfile is not deleted. So, we wait a little to give time for the monitor to restart the daemon (which is usually instantaneous) and then re-read the pidfile. VMware-BZ: #2633995 Signed-off-by: Gurucharan Shetty <g...@ovn.org> --- utilities/ovs-lib.in | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in index d646b44..f7e9756 100644 --- a/utilities/ovs-lib.in +++ b/utilities/ovs-lib.in @@ -255,20 +255,36 @@ stop_daemon () { if version_geq "$version" "2.5.90"; then actions="$graceful $actions" fi + actiontype="" for action in $actions; do if pid_exists "$pid" >/dev/null 2>&1; then :; else - return 0 + # pid does not exist. + if [ -n "$actiontype" ]; then + return 0 + fi + # But, does the file exist? We may have had a daemon + # segfault with `ovs-appctl exit`. Check one more time + # before deciding that the daemon is dead. + [ -e "$rundir/$1.pid" ] && sleep 2 && pid=`cat "$rundir/$1.pid"` 2>/dev/null + if pid_exists "$pid" >/dev/null 2>&1; then :; else + return 0 + fi fi case $action in EXIT) action "Exiting $1 ($pid)" \ ${bindir}/ovs-appctl -T 1 -t $rundir/$1.$pid.ctl exit $2 + # The above command could have resulted in delayed + # daemon segfault. And if a monitor is running, it + # would restart the daemon giving it a new pid. ;; TERM) action "Killing $1 ($pid)" kill $pid + actiontype="force" ;; KILL) action "Killing $1 ($pid) with SIGKILL" kill -9 $pid + actiontype="force" ;; FAIL) log_failure_msg "Killing $1 ($pid) failed" -- 1.9.1 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev