Please read the ovs-vswitchd manpage.  It says:

       --monitor
              Creates an additional process to monitor the ovs-vswitchd
              daemon.  If the daemon dies due to a signal that indicates a
              programming error (SIGABRT, SIGALRM, SIGBUS, SIGFPE, SIGILL,
              SIGPIPE, SIGSEGV, SIGXCPU, or SIGXFSZ) then the monitor process
              starts a new copy of it.  If the daemon dies or exits for
              another reason, the monitor process exits.

              This option is normally used with --detach, but it also
              functions without it.

SIGKILL (signal 9) does not indicate a bug, so the monitor process does
not restart OVS.  If you want to test the monitoring feature, use one of
the signals listed above that indicates a bug.
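
To make the rule concrete, here is a minimal Python sketch of the decision
the manpage describes.  It is not the OVS code; the function name and the
use of subprocess-style return codes (-N means "killed by signal N") are
assumptions made for illustration.

```python
import signal

# Signals the manpage excerpt above lists as indicating a programming error.
RESTART_SIGNALS = frozenset(int(s) for s in (
    signal.SIGABRT, signal.SIGALRM, signal.SIGBUS, signal.SIGFPE,
    signal.SIGILL, signal.SIGPIPE, signal.SIGSEGV, signal.SIGXCPU,
    signal.SIGXFSZ,
))


def monitor_would_restart(returncode: int) -> bool:
    """Hypothetical helper: would a --monitor style parent restart the child?

    `returncode` follows the subprocess convention: a negative value -N
    means the child was killed by signal N; zero or positive is a normal exit.
    """
    return returncode < 0 and -returncode in RESTART_SIGNALS
```

With this rule a child killed by SIGSEGV is restarted, while one killed by
SIGKILL, or one that exits normally, is not.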

OVS solves the PID file management problem by holding a lock on the
pidfile.  The pidfile is only valid if it is locked.
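
As an illustration of the idea, the following Python sketch treats a
pidfile as valid only while some process holds a lock on it.  It uses
flock(2) because that is easy to demonstrate; which locking primitive OVS
actually uses is not stated here, and both function names are hypothetical.

```python
import fcntl
import os


def write_locked_pidfile(path):
    """Create the pidfile, lock it, and write our PID into it.

    The caller must keep the returned descriptor open: the lock is
    released when it is closed or when the process dies.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd


def pidfile_is_valid(path):
    """Return True only if some process currently holds the lock."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except FileNotFoundError:
        return False
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True    # lock is held, so the owner is still alive
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False   # nobody holds the lock, so the pidfile is stale
    finally:
        os.close(fd)
```

Because the kernel drops the lock when the owning process dies, even from
kill -9, a leftover pidfile is detected as stale rather than trusted.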

I don't think you're solving real problems.

On Sat, Apr 29, 2017 at 12:10:58PM -0700, Aliasgar Mikail Ginwala wrote:
> When you say that ovn-controller crashed, what do you mean?
> I mean if someone kills the pid or it crashes, it never comes back up until
> and unless I do service ovn-host restart.
>  Do you mean that you killed it? Yes
>  Which process, and how did you kill it?  Restating the example I posted above:
> ps aux | grep controller
> root     3639845  0.0  0.0  26792   952 ?        S<s  17:24   0:00
> ovn-controller: monitoring pid 3639846 (healthy)
> root     3639846  0.0  0.0  27060  2484 ?        S<   17:24   0:00
> ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer
> -vsyslog:err -vfile:info --no-chdir
> --log-file=/var/log/openvswitch/ovn-controller.log
> --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor
> 
> kill -9 3639845 and issuing kill -9 3639846 of course kill the whole service.
> 
> Also we have a known issue for pid file management as it goes stale which I
> already highlighted in the example and reference at
> http://stackoverflow.com/questions/696839/how-do-i-write-a-bash-script-to-restart-a-process-if-it-dies
> 
> My sample service with respawn is as follows; as soon as you kill the pid,
> it just respawns:
> ps aux | grep fakeservice
> root      924307  2.7  0.0 782872 23844 ?        Sl   12:01   0:00
> /fake/fakeservice --v=10 --fakeservice-resource-point=http://fakeurl
> kill -9 924307
> ps aux | grep fakeservice
> root      924653 12.0  0.0 774420 23728 ?        Sl   12:01   0:00
> /fake/fakeservice --v=10 --fakeservice-resource-point=http://fakeurl
> 
> So why can't we get rid of it and just add ovn-host in /etc/init/ and add
> below lines which immediately respawns?
> respawn
> respawn limit x x
> 
> On Sat, Apr 29, 2017 at 10:04 AM, Ben Pfaff <b...@ovn.org> wrote:
> 
> > When you say that ovn-controller crashed, what do you mean?  Do you mean
> > that you killed it?  Which process, and how did you kill it?
> >
> > On Fri, Apr 28, 2017 at 10:51:04PM -0700, Aliasgar Mikail Ginwala wrote:
> > > Yes:
> > >
> > > ps aux | grep controller
> > > root     3639845  0.0  0.0  26792   952 ?        S<s  17:24   0:00
> > > ovn-controller: monitoring pid 3639846 (healthy)
> > > root     3639846  0.0  0.0  27060  2484 ?        S<   17:24   0:00
> > > ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer
> > > -vsyslog:err -vfile:info --no-chdir
> > > --log-file=/var/log/openvswitch/ovn-controller.log
> > > --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor
> > > root     4067233  0.0  0.0  11744   936 pts/9    S+   22:46   0:00 grep
> > > --color=auto controller
> > >
> > >
> > > /etc/init.d/ovn-host, installed via a Debian package compiled from
> > > source code, only adds --monitor
> > >
> > > On Fri, Apr 28, 2017 at 9:08 PM, Ben Pfaff <b...@ovn.org> wrote:
> > >
> > > > Is it running with the --monitor option?  If not, either --monitor
> > > > should be added or the upstart features should be used.
> > > >
> > > > On Fri, Apr 28, 2017 at 05:16:09PM -0700, Aliasgar Mikail Ginwala wrote:
> > > > > I did double verify:
> > > > >
> > > > > This is what is happening after crashing the ovn pid:
> > > > >
> > > > > service ovn-host status
> > > > > > Pidfile for ovn-controller (/var/run/openvswitch/ovn-controller.pid) is
> > > > > > stale
> > > > >
> > > > > > It works only after a manual restart; it did not respawn:
> > > > > service ovn-host restart
> > > > > > 2017-04-29T00:14:37Z|00001|unixctl|WARN|failed to connect to /var/run/openvswitch/ovn-controller.3623709.ctl
> > > > > > ovs-appctl: cannot connect to "/var/run/openvswitch/ovn-controller.3623709.ctl" (Connection refused)
> > > > >  * Starting ovn-controller
> > > > >
> > > > >
> > > > >
> > > > > Regards,
> > > > > Aliasgar
> > > > >
> > > > > On Fri, Apr 28, 2017 at 4:50 PM, Ben Pfaff <b...@ovn.org> wrote:
> > > > >
> > > > > > > On Fri, Apr 28, 2017 at 04:02:26PM -0700, Aliasgar Mikail Ginwala wrote:
> > > > > > > > Recently when I was adding monitoring and alerting for ovs and ovn
> > > > > > > > version 2.7.0, I found both of the upstart services are missing
> > > > > > > > *respawn*. Is it on purpose? If it's not then let's handle it as an
> > > > > > > > improvement to add it in the upstart. Suggestions welcome.
> > > > > >
> > > > > > > OVS and OVN already restart themselves, so probably nothing is needed.
> > > > > >
> > > >
> >
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
