Hi,

I heard from two devs that started a 2nd bgpd by accident (forgot -n for
a config check) which then caused downtime.

Below diff adds a check to bgpd similar to the one we have now in ospfd and
ospf6d: if another process is listening on the control socket bgpd exits.

The situation is a bit different than in the ospfd case:
- The socket path is defined in the config and not with an option.
- The config is parsed relatively late (after daemon).

Because of that rcctl would not detect that bgpd exits. But this only applies
when the first bgpd was started without using the rc scripts.

bgpd already detects that it can not bind the port if another instance is
running. Adding a fatal there can have an effect when reloading after a
config change. Loosing all sessions is not nice.

Of course if the socket path is changed in the config and then reloaded the
same can happen. I think this is less likely than changing a listen address.

I think it is sufficient to add this check only to the control socket and
not to the restricted socket.

The diff in action:

# /usr/src/usr.sbin/bgpd/obj/bgpd -dv 
startup
session engine ready
rereading config
cannot bind to 0.0.0.0:179: Address already in use
cannot bind to [::]:179: Address already in use
control_check: socket in use
fatal in bgpd: control socket check failed
rib_new: Adj-RIB-In -> 0
rib_new: Adj-RIB-Out -> 1
route decision engine ready
peer closed imsg connection
fatal in RDE: Lost connection to parent
peer closed imsg connection
SE: Lost connection to parent
session engine exiting
#

Comments? OKs?

Remi


cvs diff: Diffing .
Index: bgpd.c
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.c,v
retrieving revision 1.204
diff -u -p -r1.204 bgpd.c
--- bgpd.c      29 Sep 2018 08:11:11 -0000      1.204
+++ bgpd.c      11 Nov 2018 20:33:32 -0000
@@ -939,6 +939,8 @@ control_setup(struct bgpd_config *conf)
                }
                if ((cname = strdup(conf->csock)) == NULL)
                        fatal("strdup");
+               if ((control_check(cname)) == -1)
+                       fatalx("control socket check failed");
                if ((fd = control_init(0, cname)) == -1)
                        fatalx("control socket setup failed");
                if (control_listen(fd) == -1)
Index: control.c
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/control.c,v
retrieving revision 1.90
diff -u -p -r1.90 control.c
--- control.c   11 Aug 2017 16:02:53 -0000      1.90
+++ control.c   10 Nov 2018 20:52:46 -0000
@@ -38,6 +38,32 @@ void          control_result(struct ctl_conn *,
 ssize_t                 imsg_read_nofd(struct imsgbuf *);
 
 int
+control_check(char *path)
+{
+       struct sockaddr_un       sun;
+       int                      fd;
+
+       bzero(&sun, sizeof(sun));
+       sun.sun_family = AF_UNIX;
+       strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
+
+       if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1) {
+               log_warn("control_check: socket check");
+               return (-1);
+       }
+
+       if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) == 0) {
+               log_warnx("control_check: socket in use");
+               close(fd);
+               return (-1);
+       }
+
+       close(fd);
+
+       return (0);
+}
+
+int
 control_init(int restricted, char *path)
 {
        struct sockaddr_un       sun;
Index: session.h
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/session.h,v
retrieving revision 1.125
diff -u -p -r1.125 session.h
--- session.h   24 Oct 2018 08:26:37 -0000      1.125
+++ session.h   10 Nov 2018 21:53:04 -0000
@@ -250,6 +250,7 @@ void         prepare_listeners(struct bgpd_conf
 int     get_mpe_label(struct rdomain *);
 
 /* control.c */
+int    control_check(char *);
 int    control_init(int, char *);
 int    control_listen(int);
 void   control_shutdown(int);

Reply via email to