On Sun, Nov 11, 2018 at 04:40:54PM -0700, Theo de Raadt wrote:
> Makes sense to me, I suppose.
>
> Isn't another approach to swap the opening of the sockets?
>
> Or why does failure to control :179 sockets not stop startup?
Because you can change them during reload and if they are incorrect for a
reason it would kill running bgpd with all sessions.
Having bgpd die on a config reload is for sure something we should not
introduce.
> > I heard from two devs that started a 2nd bgpd by accident (forgot -n for
> > a config check) which then caused downtime.
> >
> > Below diff adds a check to bgpd similar to the one we have now in ospfd and
> > ospf6d: if another process is listening on the control socket bgpd exits.
> >
> > The situation is a bit different than in the ospfd case:
> > - The socket path is defined in the config and not with an option.
> > - The config is parsed relatively late (after daemon).
> >
> > Because of that rcctl would not detect that bgpd exits. But this only
> > applies
> > when the first bgpd was started without using the rc scripts.
> >
> > bgpd already detects that it can not bind the port if another instance is
> > running. Adding a fatal there can have an effect when reloading after a
> > config change. Loosing all sessions is not nice.
> >
> > Of course if the socket path is changed in the config and then reloaded the
> > same can happen. I think this is less likely than changing a listen address.
> >
> > I think it is sufficient to add this check only to the control socket and
> > not to the restricted socket.
> >
> > The diff in action:
> >
> > # /usr/src/usr.sbin/bgpd/obj/bgpd -dv
> > startup
> > session engine ready
> > rereading config
> > cannot bind to 0.0.0.0:179: Address already in use
> > cannot bind to [::]:179: Address already in use
> > control_check: socket in use
> > fatal in bgpd: control socket check failed
> > rib_new: Adj-RIB-In -> 0
> > rib_new: Adj-RIB-Out -> 1
> > route decision engine ready
> > peer closed imsg connection
> > fatal in RDE: Lost connection to parent
> > peer closed imsg connection
> > SE: Lost connection to parent
> > session engine exiting
> > #
> >
> > Comments? OKs?
> >
> > Remi
> >
> >
> > cvs diff: Diffing .
> > Index: bgpd.c
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/bgpd/bgpd.c,v
> > retrieving revision 1.204
> > diff -u -p -r1.204 bgpd.c
> > --- bgpd.c 29 Sep 2018 08:11:11 -0000 1.204
> > +++ bgpd.c 11 Nov 2018 20:33:32 -0000
> > @@ -939,6 +939,8 @@ control_setup(struct bgpd_config *conf)
> > }
> > if ((cname = strdup(conf->csock)) == NULL)
> > fatal("strdup");
> > + if ((control_check(cname)) == -1)
> > + fatalx("control socket check failed");
> > if ((fd = control_init(0, cname)) == -1)
> > fatalx("control socket setup failed");
> > if (control_listen(fd) == -1)
> > Index: control.c
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/bgpd/control.c,v
> > retrieving revision 1.90
> > diff -u -p -r1.90 control.c
> > --- control.c 11 Aug 2017 16:02:53 -0000 1.90
> > +++ control.c 10 Nov 2018 20:52:46 -0000
> > @@ -38,6 +38,32 @@ void control_result(struct ctl_conn *,
> > ssize_t imsg_read_nofd(struct imsgbuf *);
> >
> > int
> > +control_check(char *path)
> > +{
> > + struct sockaddr_un sun;
> > + int fd;
> > +
> > + bzero(&sun, sizeof(sun));
> > + sun.sun_family = AF_UNIX;
> > + strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
> > +
> > + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1) {
> > + log_warn("control_check: socket check");
> > + return (-1);
> > + }
> > +
> > + if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) == 0) {
> > + log_warnx("control_check: socket in use");
> > + close(fd);
> > + return (-1);
> > + }
> > +
> > + close(fd);
> > +
> > + return (0);
> > +}
> > +
> > +int
> > control_init(int restricted, char *path)
> > {
> > struct sockaddr_un sun;
> > Index: session.h
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/bgpd/session.h,v
> > retrieving revision 1.125
> > diff -u -p -r1.125 session.h
> > --- session.h 24 Oct 2018 08:26:37 -0000 1.125
> > +++ session.h 10 Nov 2018 21:53:04 -0000
> > @@ -250,6 +250,7 @@ void prepare_listeners(struct bgpd_conf
> > int get_mpe_label(struct rdomain *);
> >
> > /* control.c */
> > +int control_check(char *);
> > int control_init(int, char *);
> > int control_listen(int);
> > void control_shutdown(int);
> >
>
--
:wq Claudio